Math-Parser

A lightweight modular math OCR pipeline that processes scanned equations and reconstructs them into LaTeX.

Modules

Symbol Classifier (CNN-based)
Symbol Segmentation

Datasets

For symbol classification training, we use the HASYv2 dataset.

Setup

git clone https://github.com/yourusername/math-ocr.git
cd math-ocr
pip install -r requirements.txt

Folder Structure

data/: Store your raw, processed, and synthetic images
models/: Trained model checkpoints
src/: All source code modules

To download HASYv2 dataset:

import kagglehub

# Download latest version
path = kagglehub.dataset_download("guru001/hasyv2")

print("Path to dataset files:", path)

Then move files as needed.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
models		models
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Math-Parser

Modules

Datasets

Setup

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

tau315/Math-Parser

Folders and files

Latest commit

History

Repository files navigation

Math-Parser

Modules

Datasets

Setup

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages