AI-powered data analyst. Upload any file, get instant insights, charts, and answers, with no analyst needed.
DataMind is a full-stack SaaS application that replaces the manual work of a data analyst with AI. You upload a file (CSV, Excel, PDF, or an image) and within seconds you get a live PostgreSQL database, auto-generated visualizations, AI-written insights, and a natural language chatbot that converts plain English into SQL queries.
┌─────────────────────────────────────────────────────────┐
│                     React Frontend                      │
│              (Vite + Tailwind + Recharts)               │
└────────────────────────┬────────────────────────────────┘
                         │ HTTP (REST API)
┌────────────────────────▼────────────────────────────────┐
│                     FastAPI Backend                     │
└──────┬─────────────────┬──────────────────┬─────────────┘
       │                 │                  │
┌──────▼──────┐  ┌───────▼──────┐   ┌───────▼──────┐
│   Layer 1   │  │   Layer 2    │   │   Layer 3    │
│  Ingestion  │  │  SQL Engine  │   │  Dashboard   │
│             │  │              │   │ + AI Insights│
│ CSV / Excel │  │ DataFrame →  │   │              │
│ PDF / Image │  │ PostgreSQL   │   │ Gemini API   │
│ OCR / Parse │  │ (Supabase)   │   │              │
└─────────────┘  └──────────────┘   └──────────────┘
                         │
              ┌──────────▼──────────┐
              │  PostgreSQL Cloud   │
              │     (Supabase)      │
              │                     │
              │  • users            │
              │  • uploads          │
              │  • query_logs       │
              │  • dynamic tables   │
              └─────────────────────┘
Layer 1 (Ingestion) accepts CSV, Excel, PDF, and image files and converts each one into a clean, structured Pandas DataFrame. It handles messy real-world data, including merged cells, multi-page PDFs, and low-resolution scanned images.
- CSV / Excel via Pandas
- PDF tables via pdfplumber
- Images via EasyOCR
- Automatic column name cleaning, type inference, and whitespace stripping
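
The cleaning steps in the last bullet can be sketched in a few lines. This is only an illustration, not the project's `ingestion.py`: the function names `clean_column_name` and `infer_value` are hypothetical, the snake_case convention and the `col_` prefix are assumptions, and the code uses plain Python rather than Pandas so it runs without dependencies.

```python
import re
from datetime import datetime


def clean_column_name(name: str) -> str:
    """Normalize a raw header into a snake_case identifier (assumed convention)."""
    name = name.strip().lower()
    name = re.sub(r"[^a-z0-9]+", "_", name)  # runs of other characters become "_"
    name = name.strip("_")
    if not name:
        return "column"
    # Prefix names that would otherwise start with a digit.
    return f"col_{name}" if name[0].isdigit() else name


def infer_value(text: str):
    """Strip whitespace, then try int, float, and ISO datetime before falling back to str."""
    text = text.strip()
    if text == "":
        return None  # treat blank cells as missing
    for cast in (int, float, datetime.fromisoformat):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


print(clean_column_name("  Total Sales ($) "))
print(infer_value(" 42 "), infer_value("2024-01-15"), infer_value(" n/a "))
```

Trying the narrowest type first means a cell only ends up as text when nothing stricter parses, which is the usual order for this kind of inference.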
Layer 2 (SQL Engine) pushes any DataFrame into a live PostgreSQL database with automatic schema inference. Every upload creates a uniquely named, isolated table, and all uploads are logged with UUIDs for full traceability.
- Dynamic table creation via SQLAlchemy
- Correct type inference (TEXT, BIGINT, FLOAT, TIMESTAMP)
- Upload tracking in an `uploads` metadata table
- Schema inspection for LLM context
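
To make the idea of a uniquely named table with inferred column types concrete, here is a rough sketch. The project does this through SQLAlchemy; the version below uses only the standard library and builds the DDL as a string. The helper names (`make_table_name`, `sql_type`, `create_table_sql`), the `t_` prefix, and the 8-character suffix are all assumptions made for illustration.

```python
import re
import uuid
from datetime import datetime


def make_table_name(file_name: str) -> str:
    """Build a unique, SQL-safe table name from an uploaded file name."""
    stem = file_name.rsplit(".", 1)[0].lower()
    stem = re.sub(r"[^a-z0-9]+", "_", stem).strip("_") or "upload"
    # PostgreSQL truncates identifiers longer than 63 bytes, so the stem is capped.
    return f"t_{stem[:40]}_{uuid.uuid4().hex[:8]}"


def _is_number(value, kinds) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, kinds) and not isinstance(value, bool)


def sql_type(values: list) -> str:
    """Pick TIMESTAMP, BIGINT, or FLOAT when every non-null value fits, else TEXT."""
    present = [v for v in values if v is not None]
    if not present:
        return "TEXT"
    if all(isinstance(v, datetime) for v in present):
        return "TIMESTAMP"
    if all(_is_number(v, int) for v in present):
        return "BIGINT"
    if all(_is_number(v, (int, float)) for v in present):
        return "FLOAT"
    return "TEXT"


def create_table_sql(table: str, columns: dict[str, list]) -> str:
    """Render a CREATE TABLE statement; names are assumed to be sanitized already."""
    cols = ",\n".join(f'    "{name}" {sql_type(vals)}' for name, vals in columns.items())
    return f'CREATE TABLE "{table}" (\n{cols}\n);'


print(create_table_sql(make_table_name("Sales Report 2024.csv"),
                       {"region": ["north", "south"], "units": [3, None]}))
```

The random suffix is what keeps two uploads of the same file from colliding; the generated name is the value that would be stored in the `table_name` column of `uploads`.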
Layer 3 (Dashboard + AI Insights) reads from the live PostgreSQL table and auto-generates a full visual dashboard with nothing hardcoded, so it works for any dataset. AI analysis is powered by Google Gemini.
- Auto bar charts, line charts, correlation heatmaps
- Key metric cards (sum, avg, min, max per numeric column)
- AI-written insights, trends, and business recommendations
- Raw data preview table
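
The metric cards listed above boil down to four aggregates per numeric column. A minimal sketch, assuming rows arrive as a list of dictionaries (the function name `metric_cards` and the row shape are hypothetical, and the real dashboard code may compute these differently):

```python
from numbers import Real


def metric_cards(rows: list[dict]) -> dict[str, dict[str, float]]:
    """Return sum, avg, min, and max for each column whose non-null values are all numeric."""
    columns: dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            if value is not None:  # nulls are ignored, not counted as zero
                columns.setdefault(name, []).append(value)

    cards = {}
    for name, values in columns.items():
        # bool is a Real in Python, so it is excluded explicitly.
        if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
            total = sum(values)
            cards[name] = {
                "sum": total,
                "avg": total / len(values),
                "min": min(values),
                "max": max(values),
            }
    return cards


rows = [
    {"region": "north", "units": 3, "price": 2.5},
    {"region": "south", "units": 5, "price": None},
    {"region": "east", "units": 4, "price": 3.5},
]
print(metric_cards(rows))
```

Because the function discovers numeric columns from the data itself, nothing about a particular dataset has to be hardcoded.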
Layer 4 (Chatbot, coming soon) is a natural language interface where users type plain English questions and get SQL-powered answers back in real time.
- English → SQL conversion via LLM
- Runs queries against the live PostgreSQL table
- Returns results as tables or charts
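
Since this layer is not built yet, the following is only one possible shape for it. It shows the two pieces that do not depend on any particular LLM client: assembling a prompt from the table schema, and checking that the generated SQL is a single read-only statement before it is run. The names `build_prompt` and `is_read_only`, the prompt wording, and the keyword list are all assumptions.

```python
import re


def build_prompt(table: str, schema: dict[str, str], question: str) -> str:
    """Assemble the text sent to the LLM: table schema first, then the user's question."""
    columns = "\n".join(f"- {name} ({sql_type})" for name, sql_type in schema.items())
    return (
        "You translate questions into PostgreSQL.\n"
        f"Table: {table}\nColumns:\n{columns}\n"
        f"Question: {question}\n"
        "Reply with a single SELECT statement and nothing else."
    )


# Keywords that indicate a write or schema change (deliberately conservative).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Accept one SELECT (or WITH ... SELECT) statement; reject anything else."""
    body = sql.strip().rstrip(";").strip()
    if ";" in body:  # more than one statement
        return False
    if not re.match(r"(?i)(select|with)\b", body):
        return False
    return FORBIDDEN.search(body) is None


print(build_prompt("t_sales", {"region": "TEXT", "units": "BIGINT"},
                   "Total units per region?"))
print(is_read_only("SELECT region, SUM(units) FROM t_sales GROUP BY region;"))
```

The keyword check errs on the side of rejecting queries (a column literally named `update` would fail it), and it complements rather than replaces running the chatbot's queries under a database role that only has read access.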
| Layer | Technology |
|---|---|
| Frontend | React 18, Vite, Tailwind CSS, Recharts |
| Backend | FastAPI, Uvicorn |
| Database | PostgreSQL (Supabase) |
| ORM | SQLAlchemy |
| Data Processing | Pandas, NumPy |
| PDF Parsing | pdfplumber |
| OCR | EasyOCR |
| AI / LLM | Google Gemini API |
| Environment | Python 3.13, venv |
datamind/
├── backend/
│   ├── main.py                  # FastAPI app entry point
│   └── routes/
│       ├── upload.py            # File upload endpoint
│       └── dashboard.py         # Dashboard data endpoint
├── frontend/
│   └── src/
│       ├── pages/
│       │   ├── Home.jsx         # Landing page
│       │   └── Dashboard.jsx    # Dashboard page
│       ├── components/
│       │   ├── MetricCard.jsx
│       │   ├── BarChart.jsx
│       │   ├── LineChart.jsx
│       │   ├── DataTable.jsx
│       │   └── InsightsCard.jsx
│       └── api/
│           └── client.js        # Axios API client
├── layers/
│   ├── layer1_ingestion/
│   │   └── ingestion.py         # File parsers
│   ├── layer2_sql/
│   │   └── sql_engine.py        # PostgreSQL engine
│   ├── layer3_dashboard/
│   │   └── insights.py          # Gemini AI insights
│   └── layer4_chatbot/          # Coming soon
├── utils/
│   └── database.py              # DB connection test
├── uploads/                     # Temporary file storage
├── .env                         # Environment variables (not committed)
└── requirements.txt
- Python 3.10+
- Node.js 18+
- A Supabase account (free tier)
- A Google AI Studio API key (free tier)
git clone https://github.com/DefinitelyMrityunjay/datamind.git
cd datamind

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create a .env file in the root:

DATABASE_URL=your_supabase_postgresql_connection_string
GEMINI_API_KEY=your_google_gemini_api_key
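
A backend that depends on these two variables can check for them at startup instead of failing later on the first database or Gemini call. The sketch below is an assumption about how that might look, not the project's code: `load_settings` is a hypothetical helper, and it expects the variables to already be in the process environment (the README does not say which `.env` loader, if any, is used).

```python
import os
from collections.abc import Mapping

REQUIRED = ("DATABASE_URL", "GEMINI_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise one error listing everything missing."""
    missing = [key for key in REQUIRED if not env.get(key, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key].strip() for key in REQUIRED}


# Example with an explicit mapping instead of the real environment.
settings = load_settings({
    "DATABASE_URL": "postgresql://user:password@host:5432/postgres",
    "GEMINI_API_KEY": "example-key",
})
print(sorted(settings))
```

Passing the mapping as a parameter keeps the check easy to test without touching the real environment.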
Run this SQL in your Supabase SQL editor:
CREATE TABLE IF NOT EXISTS users (
user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
plan VARCHAR(50) DEFAULT 'free'
);
CREATE TABLE IF NOT EXISTS uploads (
upload_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(user_id),
file_name VARCHAR(255),
file_type VARCHAR(50),
table_name VARCHAR(255),
uploaded_at TIMESTAMP DEFAULT NOW(),
status VARCHAR(50) DEFAULT 'processing'
);
CREATE TABLE IF NOT EXISTS query_logs (
query_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(user_id),
upload_id UUID REFERENCES uploads(upload_id),
english_input TEXT,
generated_sql TEXT,
executed_at TIMESTAMP DEFAULT NOW(),
success BOOLEAN
);

Start the backend:

cd backend
uvicorn main:app --reload

API runs at http://127.0.0.1:8000
API docs at http://127.0.0.1:8000/docs

In a second terminal, from the repository root, start the frontend:

cd frontend
npm install
npm run dev

App runs at http://localhost:5173
- Layer 1: Universal file ingestion (CSV, Excel, PDF, Image)
- Layer 2: Dynamic PostgreSQL table generation
- Layer 3: Auto dashboard with AI insights
- React frontend with FastAPI backend
- Layer 4: English to SQL chatbot
- User authentication (Supabase Auth)
- Upload history page
- Cloud deployment (Vercel + Railway)
- Multi-sheet Excel support
- Export dashboard as PDF
Mrityunjay: github.com/DefinitelyMrityunjay
Built as a portfolio project to demonstrate end-to-end data engineering, AI integration, and full-stack product development.