An easy‑to‑use OCR service for handwritten Hindi text. Built with FastAPI, OpenCV, and TensorFlow/PyTorch, it detects word regions, extracts text, and classifies snippets.
- Word Detection: Highlights words in uploaded images
- Text Extraction: Uses your custom OCR script to extract Hindi text (Devanagari)
- Model Prediction: Classifies text snippets via your `.pth` or `.keras` model
- Safe Temp‑File Handling: Works on Windows & Linux
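The safe temp‑file handling mentioned above can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the app's actual code: `delete=False` plus an explicit cleanup step is what makes the pattern work on Windows, where a temp file that is still open cannot be reopened by OpenCV/PIL.

```python
import os
import tempfile

def save_upload_to_temp(data: bytes, suffix: str = ".png") -> str:
    # delete=False so the file survives after the handle closes and can be
    # reopened by OpenCV/PIL (required on Windows, harmless on Linux/macOS)
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name

def cleanup_temp(path: str) -> None:
    # Explicitly remove the file once processing is done;
    # tolerate a double delete instead of raising
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```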
hindi-ocr/
├── app.py # FastAPI application
├── requirements.txt # Python dependencies
├── models/ # Model weight files
│   ├── notebooks/ # Jupyter notebooks for model training
│   │   ├── handwritten[pytorch].ipynb # model training with PyTorch
│   │   └── handwritten[tensorflow].ipynb # model training with TensorFlow
│ ├── hindi_ocr_model.pth # PyTorch model
│ └── hindi_ocr_model.keras # (optional TF fallback)
├── fonts/ # Font files
│ └── NotoSansDevanagari-Regular.ttf
├── dataset/ # Test data
│ ├── images/ # Input images for OCR
│ │ └── training images # sample handwritten Hindi image
│ └── words/ # Expected‑output text files
│ └── output labels # sample transcription
├── label_encoder.pkl # sklearn LabelEncoder for class decoding
├── Sample_OCR_Image.png # example image of OCR performance
└── README.md # Project documentation
- Clone repo
git clone https://github.com/Stu-ops/hindi-ocr.git
cd hindi-ocr
- Virtual environment
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
- Install dependencies
pip install -r requirements.txt
pip install python-multipart # for file uploads
Ensure the following files/folders sit next to app.py:
- `models/` (your `.keras` and optional `.pth` weights)
- `label_encoder.pkl` (sklearn LabelEncoder)
- `fonts/NotoSansDevanagari-Regular.ttf`
- `dataset/` (training and testing images, plus word files for testing)
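`label_encoder.pkl` is what turns the model's integer class output back into a readable label. A minimal sketch of that decoding step, assuming the pickle holds an sklearn `LabelEncoder` (function names here are illustrative, not from the repo):

```python
import pickle

def load_label_encoder(path: str = "label_encoder.pkl"):
    # label_encoder.pkl is assumed to hold a pickled sklearn LabelEncoder
    with open(path, "rb") as f:
        return pickle.load(f)

def decode_prediction(encoder, class_index: int) -> str:
    # inverse_transform maps the model's integer output back to its
    # original label, e.g. a Devanagari character such as "अ"
    return str(encoder.inverse_transform([class_index])[0])
```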
uvicorn app:app --reload --host 0.0.0.0 --port 8000
Swagger UI: /docs
ReDoc: /redoc
| Method | Path | Description |
|---|---|---|
| GET | / | Welcome HTML page |
| POST | /process/ | Upload image → returns OCR & prediction |
| GET | /word-detection/ | Returns word‑boxed image |
| GET | /prediction/ | Returns prediction‑overlay image |
{
"OCR_output": "यह एक उदाहरण है",
"word_count": 5,
"prediction_label": "अ"
}
curl -X POST "http://localhost:8000/process/" \
-H "Content-Type: multipart/form-data" \
-F "file=@dataset/example1.png"

Or with Python:

import requests

url = "http://localhost:8000/process/"
with open("dataset/example1.png", "rb") as f:
    files = {"file": f}
    resp = requests.post(url, files=files)
print(resp.json())
Retrieve word‑detection image:
curl http://localhost:8000/word-detection/ --output words.png
- “Form data requires python-multipart” → pip install python-multipart
- PermissionError on Windows → ensure image files are closed before deletion; use with Image.open(...)
- Missing glyph warnings → add a fallback font:
plt.rcParams['font.family'] = ['Noto Sans Devanagari', 'DejaVu Sans']
MIT © Stu-ops