DefinitelyMrityunjay/datamind

datamind.

AI-powered data analyst. Upload any file, get instant insights, charts, and answers, with no analyst needed.


What is DataMind?

DataMind is a full-stack SaaS application that replaces the manual work of a data analyst with AI. You upload a file (CSV, Excel, PDF, or an image), and within seconds you get a live PostgreSQL database, auto-generated visualizations, AI-written insights, and a natural language chatbot that converts plain English into SQL queries.


Architecture

┌──────────────────────────────────────────────────────────┐
│                      React Frontend                      │
│               (Vite + Tailwind + Recharts)               │
└────────────────────────┬─────────────────────────────────┘
                         │ HTTP (REST API)
┌────────────────────────▼─────────────────────────────────┐
│                     FastAPI Backend                      │
└──────┬─────────────────┬─────────────────┬───────────────┘
       │                 │                 │
┌──────▼──────┐  ┌───────▼──────┐  ┌───────▼──────┐
│   Layer 1   │  │   Layer 2    │  │   Layer 3    │
│  Ingestion  │  │  SQL Engine  │  │  Dashboard   │
│             │  │              │  │ + AI Insights│
│ CSV / Excel │  │  DataFrame → │  │              │
│ PDF / Image │  │  PostgreSQL  │  │  Gemini API  │
│ OCR / Parse │  │  (Supabase)  │  │              │
└─────────────┘  └───────┬──────┘  └──────────────┘
                         │
              ┌──────────▼──────────┐
              │  PostgreSQL Cloud   │
              │     (Supabase)      │
              │                     │
              │  • users            │
              │  • uploads          │
              │  • query_logs       │
              │  • dynamic tables   │
              └─────────────────────┘

Features

Layer 1: Universal Data Ingestion

Accepts CSV, Excel, PDF, and image files and converts each one into a clean, structured Pandas DataFrame. Handles messy real-world data, including merged cells, multi-page PDFs, and low-resolution scanned images.

  • CSV / Excel via Pandas
  • PDF tables via pdfplumber
  • Images via EasyOCR
  • Automatic column name cleaning, type inference, and whitespace stripping
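
The column name cleaning step above can be illustrated with a small standard-library sketch. This is not the code in ingestion.py; the function names `clean_column_name` and `clean_columns` and the exact normalization rules (lowercase snake_case, a `col_` prefix for names that start with a digit, numeric suffixes for duplicates) are assumptions chosen for illustration.

```python
import re


def clean_column_name(name: str) -> str:
    """Normalize a raw header into a lowercase snake_case identifier."""
    name = name.strip().lower()
    # Replace any run of non-alphanumeric characters with a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name).strip("_")
    # Avoid empty names and names that start with a digit.
    if not name or name[0].isdigit():
        name = f"col_{name}".rstrip("_")
    return name


def clean_columns(headers: list[str]) -> list[str]:
    """Clean every header; repeated names get a numeric suffix (_2, _3, ...)."""
    seen: dict[str, int] = {}
    cleaned = []
    for raw in headers:
        base = clean_column_name(raw)
        count = seen.get(base, 0)
        seen[base] = count + 1
        cleaned.append(base if count == 0 else f"{base}_{count + 1}")
    return cleaned
```

With Pandas, the result could be applied as `df.columns = clean_columns([str(c) for c in df.columns])`. The suffix scheme is deliberately simple and does not guard against a header that already ends in `_2`.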

Layer 2: Dynamic SQL Engine

Pushes any DataFrame into a live PostgreSQL database with automatic schema inference. Every upload creates a uniquely named, isolated table. All uploads are logged with UUIDs for full traceability.

  • Dynamic table creation via SQLAlchemy
  • Correct type inference (TEXT, BIGINT, FLOAT, TIMESTAMP)
  • Upload tracking in an uploads metadata table
  • Schema inspection for LLM context
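
To make the schema inference idea concrete, here is a minimal standard-library sketch that maps column values to the four types listed above and renders a CREATE TABLE statement for a uniquely named table. The project itself does this through SQLAlchemy and Pandas; the helper names (`sql_type_for`, `unique_table_name`, `create_table_sql`), the `upload_` prefix, and the fallback to TEXT are all assumptions made for this example.

```python
import uuid
from datetime import date, datetime


def sql_type_for(values: list) -> str:
    """Pick a PostgreSQL column type from a column's non-null Python values."""
    present = [v for v in values if v is not None]
    if not present:
        return "TEXT"
    # bool is a subclass of int, so exclude it from the numeric checks.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "BIGINT"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "FLOAT"
    if all(isinstance(v, (datetime, date)) for v in present):
        return "TIMESTAMP"
    return "TEXT"


def unique_table_name(prefix: str = "upload") -> str:
    """Build a table name that will not collide across uploads."""
    return f"{prefix}_{uuid.uuid4().hex}"


def create_table_sql(table: str, columns: dict[str, list]) -> str:
    """Render a CREATE TABLE statement; identifiers are double-quoted."""
    defs = ", ".join(
        f'"{name}" {sql_type_for(values)}' for name, values in columns.items()
    )
    return f'CREATE TABLE "{table}" ({defs});'
```

The sketch assumes column names were already cleaned, so they contain no double quotes. Mixed or unrecognized columns fall back to TEXT, which loses type information but never rejects an upload.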

Layer 3: Auto Dashboard & AI Insights

Reads from the live PostgreSQL table and auto-generates a full visual dashboard with no hardcoding, so it works for any dataset. AI analysis is powered by Google Gemini.

  • Auto bar charts, line charts, correlation heatmaps
  • Key metric cards (sum, avg, min, max per numeric column)
  • AI-written insights, trends, and business recommendations
  • Raw data preview table
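
The key metric cards can be pictured as one small summary per numeric column. The sketch below computes sum, avg, min, and max from a list of row dictionaries using only the standard library. The function name `metric_cards` and the row format are assumptions; the real dashboard endpoint may compute these in Pandas or in SQL instead.

```python
from statistics import fmean


def metric_cards(rows: list[dict]) -> dict[str, dict[str, float]]:
    """Return sum, avg, min, and max for every fully numeric column."""
    if not rows:
        return {}
    cards = {}
    for column in rows[0]:
        # Ignore missing values when deciding whether a column is numeric.
        values = [row.get(column) for row in rows]
        values = [v for v in values if v is not None]
        numeric = values and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if not numeric:
            continue
        cards[column] = {
            "sum": sum(values),
            "avg": fmean(values),
            "min": min(values),
            "max": max(values),
        }
    return cards
```

Text columns are skipped rather than coerced, so a dataset with no numeric columns simply yields no cards.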

Layer 4: English to SQL Chatbot (in progress)

A natural language interface where users type plain English questions and get SQL-powered answers back in real time.

  • English → SQL conversion via LLM
  • Runs queries against the live PostgreSQL table
  • Returns results as tables or charts
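
Since this layer is still in progress, the following is only a sketch of two pieces such a chatbot would likely need: a prompt that gives the LLM the table schema, and a guard that accepts only a single read-only SELECT before anything runs against the database. The names `build_prompt` and `is_safe_select`, the prompt wording, and the keyword list are assumptions, and no LLM call is made here.

```python
import re


def build_prompt(question: str, table: str, schema: dict[str, str]) -> str:
    """Combine the table schema and the user's question into one LLM prompt."""
    columns = "\n".join(f"- {name} ({sql_type})" for name, sql_type in schema.items())
    return (
        "Write one PostgreSQL SELECT statement that answers the question.\n"
        f"Table: {table}\nColumns:\n{columns}\n"
        f"Question: {question}\nReturn only SQL."
    )


# Keywords that modify data or schema; rejected anywhere in the statement.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke)\b",
    re.IGNORECASE,
)


def is_safe_select(sql: str) -> bool:
    """Accept a single SELECT statement and reject anything that writes."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not statement.lower().startswith("select"):
        return False
    return _FORBIDDEN.search(statement) is None
```

The guard is a conservative heuristic: it will also reject harmless queries that mention a forbidden keyword inside a string literal. It complements, but does not replace, running generated SQL under a database role with read-only permissions.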

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | React 18, Vite, Tailwind CSS, Recharts |
| Backend | FastAPI, Uvicorn |
| Database | PostgreSQL (Supabase) |
| ORM | SQLAlchemy |
| Data Processing | Pandas, NumPy |
| PDF Parsing | pdfplumber |
| OCR | EasyOCR |
| AI / LLM | Google Gemini API |
| Environment | Python 3.13, venv |

Project Structure

datamind/
├── backend/
│   ├── main.py                  # FastAPI app entry point
│   └── routes/
│       ├── upload.py            # File upload endpoint
│       └── dashboard.py         # Dashboard data endpoint
├── frontend/
│   └── src/
│       ├── pages/
│       │   ├── Home.jsx         # Landing page
│       │   └── Dashboard.jsx    # Dashboard page
│       ├── components/
│       │   ├── MetricCard.jsx
│       │   ├── BarChart.jsx
│       │   ├── LineChart.jsx
│       │   ├── DataTable.jsx
│       │   └── InsightsCard.jsx
│       └── api/
│           └── client.js        # Axios API client
├── layers/
│   ├── layer1_ingestion/
│   │   └── ingestion.py         # File parsers
│   ├── layer2_sql/
│   │   └── sql_engine.py        # PostgreSQL engine
│   ├── layer3_dashboard/
│   │   └── insights.py          # Gemini AI insights
│   └── layer4_chatbot/          # Coming soon
├── utils/
│   └── database.py              # DB connection test
├── uploads/                     # Temporary file storage
├── .env                         # Environment variables (not committed)
└── requirements.txt

Getting Started

Prerequisites

1. Clone the repo

git clone https://github.com/DefinitelyMrityunjay/datamind.git
cd datamind

2. Set up the backend

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

3. Configure environment variables

Create a .env file in the root:

DATABASE_URL=your_supabase_postgresql_connection_string
GEMINI_API_KEY=your_google_gemini_api_key
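
A backend that depends on these two variables benefits from failing fast when one is missing. The sketch below shows one way to do that check with the standard library. The function name `load_settings` is an assumption, and it reads variables that are already in the process environment; how the project loads the .env file itself (for example with a package such as python-dotenv) is not stated in this README.

```python
import os
from collections.abc import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read the required variables and fail fast if any are missing or empty."""
    required = ("DATABASE_URL", "GEMINI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}
```

Calling `load_settings()` once at startup surfaces a misconfigured environment immediately instead of on the first upload or Gemini request.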

4. Set up the database

Run this SQL in your Supabase SQL editor:

CREATE TABLE IF NOT EXISTS users (
    user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    plan VARCHAR(50) DEFAULT 'free'
);

CREATE TABLE IF NOT EXISTS uploads (
    upload_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(user_id),
    file_name VARCHAR(255),
    file_type VARCHAR(50),
    table_name VARCHAR(255),
    uploaded_at TIMESTAMP DEFAULT NOW(),
    status VARCHAR(50) DEFAULT 'processing'
);

CREATE TABLE IF NOT EXISTS query_logs (
    query_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(user_id),
    upload_id UUID REFERENCES uploads(upload_id),
    english_input TEXT,
    generated_sql TEXT,
    executed_at TIMESTAMP DEFAULT NOW(),
    success BOOLEAN
);

5. Start the backend

cd backend
uvicorn main:app --reload

The API runs at http://127.0.0.1:8000, with interactive API docs at http://127.0.0.1:8000/docs.

6. Start the frontend (in a second terminal, from the repository root)

cd frontend
npm install
npm run dev

The app runs at http://localhost:5173.


Roadmap

  • Layer 1: Universal file ingestion (CSV, Excel, PDF, Image)
  • Layer 2: Dynamic PostgreSQL table generation
  • Layer 3: Auto dashboard with AI insights
  • React frontend with FastAPI backend
  • Layer 4: English to SQL chatbot
  • User authentication (Supabase Auth)
  • Upload history page
  • Cloud deployment (Vercel + Railway)
  • Multi-sheet Excel support
  • Export dashboard as PDF

Author

Mrityunjay (github.com/DefinitelyMrityunjay)


Built as a portfolio project to demonstrate end-to-end data engineering, AI integration, and full-stack product development.
