Receipt Classifier

A machine learning system that classifies receipts as either real or AI-generated using a hybrid image-text approach.

Overview

The classifier combines computer vision and natural language processing to decide whether a receipt image is authentic or AI-generated. A hybrid model processes both the visual features of the receipt image and the text extracted from it.

Features

  • Hybrid Classification: Combines image features from EfficientNet-B0 and text features from BERT
  • OCR Integration: Uses TrOCR (Transformer OCR) for high-quality text extraction from receipt images
  • REST API: Provides a FastAPI endpoint for easy integration with other systems
  • Multi-format Support: Handles both image files (JPG, PNG) and PDF documents

Model Architecture

The system uses a three-part architecture:

  1. Image Feature Extraction: EfficientNet-B0 processes the visual aspects of the receipt
  2. Text Extraction & Processing: TrOCR extracts text from the image, which is then processed by BERT
  3. Hybrid Classifier: Combines image and text features to make the final classification
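The fusion step can be illustrated with a minimal late-fusion sketch. It assumes the image branch yields a 1280-dimensional pooled vector (the final feature width of EfficientNet-B0) and the text branch a 768-dimensional vector (the hidden size of a base-sized BERT). The concatenation strategy, the single linear layer, and the label order are assumptions, not the repository's confirmed design. Weights are random stand-ins, and plain Python is used only to keep the sketch dependency-free.

```python
import math
import random

IMAGE_DIM = 1280  # EfficientNet-B0 pooled feature width
TEXT_DIM = 768    # hidden size of a base-sized BERT (assumed variant)
LABELS = ["ai_generated", "real"]  # assumed label order


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(image_features, text_features, weights, biases):
    """Concatenate both feature vectors and apply one linear layer."""
    fused = list(image_features) + list(text_features)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return dict(zip(LABELS, softmax(logits)))


# Random stand-ins for real model outputs and trained weights.
rng = random.Random(0)
image_features = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
text_features = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
weights = [
    [rng.gauss(0, 0.01) for _ in range(IMAGE_DIM + TEXT_DIM)] for _ in LABELS
]
biases = [0.0 for _ in LABELS]

probabilities = fuse_and_classify(image_features, text_features, weights, biases)
prediction = max(probabilities, key=probabilities.get)
print(prediction, probabilities)
```

In a real pipeline the two feature vectors would come from the EfficientNet-B0 and BERT forward passes, and the linear layer would be trained rather than sampled.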

Installation

Prerequisites

  • Python 3.8+
  • Poppler (for PDF processing)

Setup

  1. Clone the repository:

    git clone https://github.com/yuguda999/receipt_classifier.git
    cd receipt_classifier
    
  2. Create a virtual environment and activate it:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Install Poppler (required for PDF processing):

    • On Ubuntu/Debian: sudo apt-get install poppler-utils
    • On macOS: brew install poppler
    • On Windows: Download from poppler releases
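A quick way to confirm the Poppler install is to look for one of its command-line tools on the PATH. The sketch below assumes the project relies on Poppler's `pdftoppm` binary (as wrappers such as pdf2image do); the PDF library the repository actually uses is not stated here.

```python
import shutil


def poppler_available():
    """Return True if Poppler's pdftoppm binary can be found on PATH."""
    return shutil.which("pdftoppm") is not None


if poppler_available():
    print("Poppler found:", shutil.which("pdftoppm"))
else:
    print("Poppler not found on PATH; PDF uploads will likely fail.")
```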

Usage

Running the API Server

Start the FastAPI server:

uvicorn app:app --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000.
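To check that the server came up, one option is to request the interactive documentation page that FastAPI serves at `/docs` by default. This is a minimal sketch using only the standard library; it assumes the app has not disabled or moved the docs route, and `server_is_up` is a hypothetical helper name.

```python
import urllib.request


def server_is_up(base_url="http://localhost:8000", timeout=2.0):
    """Return True if the server answers a request to /docs with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, refused connections, and timeouts
        return False


print("server reachable:", server_is_up())
```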

API Endpoints

POST /predict

Accepts an image or PDF file and returns the classification result.

Request:

  • Form data with a file field named file
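Since the endpoint accepts both images and PDFs, the server has to tell them apart before processing. One possible approach, shown here as a hypothetical sketch, is to inspect the leading bytes of the upload; the repository may instead rely on the file extension or content type. The function name `detect_upload_kind` is illustrative.

```python
PDF_MAGIC = b"%PDF-"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def detect_upload_kind(data: bytes) -> str:
    """Classify raw upload bytes as 'pdf', 'image', or 'unsupported'."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(PNG_MAGIC) or data.startswith(JPEG_MAGIC):
        return "image"
    return "unsupported"


print(detect_upload_kind(b"%PDF-1.7\n..."))               # pdf
print(detect_upload_kind(b"\x89PNG\r\n\x1a\n\x00\x00"))   # image
print(detect_upload_kind(b"hello"))                        # unsupported
```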

Response:

{
  "prediction": "real",
  "probabilities": {
    "ai_generated": 0.05,
    "real": 0.95
  },
  "extracted_text": "Sample receipt text extracted from the image..."
}
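A client may want to parse this response into a typed object and sanity-check it before use. The sketch below does that with the standard library only; `PredictionResult` and `parse_prediction` are hypothetical names, and the field names are taken from the sample response above.

```python
import json
from dataclasses import dataclass


@dataclass
class PredictionResult:
    prediction: str
    probabilities: dict
    extracted_text: str

    @property
    def confidence(self) -> float:
        """Probability assigned to the predicted label."""
        return self.probabilities[self.prediction]


def parse_prediction(payload: str) -> PredictionResult:
    """Parse a /predict JSON body and check that the label is known."""
    data = json.loads(payload)
    result = PredictionResult(
        prediction=data["prediction"],
        probabilities=data["probabilities"],
        extracted_text=data["extracted_text"],
    )
    if result.prediction not in result.probabilities:
        raise ValueError(f"unknown label: {result.prediction!r}")
    return result


sample = """
{
  "prediction": "real",
  "probabilities": {"ai_generated": 0.05, "real": 0.95},
  "extracted_text": "Sample receipt text extracted from the image..."
}
"""
result = parse_prediction(sample)
print(result.prediction, result.confidence)  # real 0.95
```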

Calling the API from Python

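import requests

url = "http://localhost:8000/predict"

# Open the file in a context manager so the handle is closed after the upload.
with open("path/to/receipt.jpg", "rb") as f:
    response = requests.post(url, files={"file": f})

response.raise_for_status()  # Raise an exception on HTTP error responses
result = response.json()
print(f"Prediction: {result['prediction']}")
print(f"Confidence: {result['probabilities'][result['prediction']]:.2f}")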

Training

The model was trained on a dataset of real and AI-generated receipts. The training process is documented in the receipt_classifier.ipynb notebook.

Dataset Structure

receipt_dataset/
├── ai_generated/  # AI-generated receipt images
└── real/          # Real receipt images
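With this layout, the class label of each image follows from its parent folder. The sketch below collects `(path, label)` pairs from such a directory; the integer label encoding and the accepted file suffixes are assumptions, and the notebook may load the data differently.

```python
from pathlib import Path

LABELS = {"ai_generated": 0, "real": 1}  # assumed label encoding
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_samples(root):
    """Collect (path, label) pairs from the two class folders under root."""
    samples = []
    for folder, label in LABELS.items():
        class_dir = Path(root) / folder
        if not class_dir.is_dir():
            continue  # skip a missing class folder instead of failing
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


samples = list_samples("receipt_dataset")
print(f"{len(samples)} images found")
```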

Performance

The hybrid model reached 100% validation accuracy by the end of training.

License

MIT License

Acknowledgements
