musabadru/ocr-android-app

Material Expressive Android OCR App

A modern offline Android application for fast, accurate text extraction from images and documents. Built with Material Design 3 using Jetpack Compose, this app utilizes the Tesseract OCR engine (via tess-two) for robust multi-language OCR. Features include a CameraX-powered scanner, rich Material theming, Room database for local storage, and a clean MVVM architecture with Hilt dependency injection.

Features

  • Scan and extract text from photos and documents with Tesseract OCR

  • Offline text recognition (no internet needed, works anywhere)

  • Material Design 3 UI using Jetpack Compose and dynamic color theming

  • Capture images with CameraX and process them for enhanced OCR accuracy

  • Save and manage extracted text locally using Room database

  • Clean and testable codebase following MVVM and Clean Architecture principles

  • Kotlin Coroutines for smooth, efficient background image processing

  • Supports multiple languages via downloadable Tesseract trained data

  • Easy setup and extensible for custom OCR workflows

Recommended Stack for Material Expressive Android OCR App with Tesseract

For an offline Material Design 3 OCR app that uses Tesseract on Android, the recommended technology stack is:

UI Framework & Design

Jetpack Compose with Material 3[1][2][3]

  • Use androidx.compose.material3 for Material Design 3 components
  • Implement Material You with dynamic color theming[1]
  • Dependency: implementation("androidx.compose.material3:material3:1.3.2")[3]

Material 3 Components to use:

  • Scaffold for layout structure[4][5]
  • TopAppBar for app navigation[6][5]
  • Button, Card, TextField variants (Filled, Elevated, FilledTonal)[3]
  • Material color schemes and typography from MaterialTheme.colorScheme[7][8]
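As a rough illustration of how these components fit together, the sketch below wraps a Scaffold and TopAppBar in a theme that uses dynamic color on Android 12 and later and falls back to the default Material 3 schemes on older devices. The names OcrTheme and ScannerScreen are hypothetical and are not taken from this repository.

```kotlin
import android.os.Build
import androidx.compose.foundation.isSystemInDarkTheme
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.ExperimentalMaterial3Api
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Text
import androidx.compose.material3.TopAppBar
import androidx.compose.material3.darkColorScheme
import androidx.compose.material3.dynamicDarkColorScheme
import androidx.compose.material3.dynamicLightColorScheme
import androidx.compose.material3.lightColorScheme
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext

// Hypothetical theme wrapper: dynamic color needs API 31+, so older devices get static schemes.
@Composable
fun OcrTheme(darkTheme: Boolean = isSystemInDarkTheme(), content: @Composable () -> Unit) {
    val context = LocalContext.current
    val colorScheme = when {
        Build.VERSION.SDK_INT >= Build.VERSION_CODES.S ->
            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)
        darkTheme -> darkColorScheme()
        else -> lightColorScheme()
    }
    MaterialTheme(colorScheme = colorScheme, content = content)
}

// Hypothetical screen: Scaffold supplies the layout, TopAppBar the title bar.
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun ScannerScreen(extractedText: String) {
    OcrTheme {
        Scaffold(topBar = { TopAppBar(title = { Text("Scanner") }) }) { innerPadding ->
            Text(
                text = extractedText,
                style = MaterialTheme.typography.bodyLarge,
                color = MaterialTheme.colorScheme.onSurface,
                modifier = Modifier.padding(innerPadding) // keep content clear of the app bar
            )
        }
    }
}
```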

OCR Engine

Tesseract OCR via tess-two library[9][10][11]

  • Add dependency: implementation("com.rmtheis:tess-two:9.1.0")[9]
  • Provides Java API wrapper for native Tesseract 3.05 and Leptonica 1.74.1[9]
  • Completely offline OCR processing[12][13]
  • Supports 100+ languages with trained data files[14][13]

Implementation approach:

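import android.graphics.Bitmap
import com.googlecode.tesseract.android.TessBaseAPI

// DATA_PATH must be the parent directory of a "tessdata" folder that contains eng.traineddata.
private fun extractText(bitmap: Bitmap): String {
    val tessBaseApi = TessBaseAPI()
    try {
        // init() returns false when the trained data cannot be loaded.
        check(tessBaseApi.init(DATA_PATH, "eng")) { "Tesseract initialization failed" }
        tessBaseApi.setImage(bitmap)
        return tessBaseApi.getUTF8Text() ?: ""
    } finally {
        tessBaseApi.end() // always release the native Tesseract resources
    }
}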
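tess-two expects the path passed to init() to be the parent of a directory named tessdata that holds the .traineddata files. A minimal sketch of preparing that directory follows. It assumes the trained data ships under assets/tessdata/ in the APK, which this README does not state; the function name prepareTessData is likewise hypothetical.

```kotlin
import android.content.Context
import java.io.File

/**
 * Copies a bundled trained-data file into app-private storage if it is not
 * there yet and returns the directory to pass to TessBaseAPI.init().
 * The asset location "tessdata/<language>.traineddata" is an assumption.
 */
fun prepareTessData(context: Context, language: String = "eng"): String {
    val tessDir = File(context.filesDir, "tessdata").apply { mkdirs() }
    val target = File(tessDir, "$language.traineddata")
    if (!target.exists()) {
        context.assets.open("tessdata/$language.traineddata").use { input ->
            target.outputStream().use { output -> input.copyTo(output) }
        }
    }
    // init() wants the parent of "tessdata", not the tessdata directory itself.
    return context.filesDir.absolutePath
}
```

The returned path could then stand in for DATA_PATH. The same function would also work for files downloaded at runtime, provided they end up in the same tessdata directory.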

Architecture Pattern

MVVM with Clean Architecture[15][16][17]

Layer structure:

  • Presentation Layer: Composable UI + ViewModel[16][15]
  • Domain Layer: Use cases for OCR processing logic[17][16]
  • Data Layer: Repository pattern + Room database + Tesseract wrapper[15][16]

This separation ensures:[16]

  • Testable business logic independent of UI
  • Decoupled code components
  • Easy maintainability and feature additions
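The layering above can be sketched in plain Kotlin. The example keeps the domain layer free of Android and Tesseract types and swaps in test doubles for the data layer, which is what makes the business logic testable without a device. All names here (OcrEngine, ScanRepository, ExtractTextUseCase, and the fakes) are hypothetical and only illustrate the pattern.

```kotlin
// Domain layer: plain Kotlin, no Android or Tesseract types.
data class ScanResult(val id: Long, val text: String)

// Implemented in the data layer, for example by a Tesseract wrapper.
interface OcrEngine {
    fun recognize(imageBytes: ByteArray): String
}

// Implemented in the data layer, for example on top of Room.
interface ScanRepository {
    fun save(text: String): ScanResult
    fun all(): List<ScanResult>
}

class ExtractTextUseCase(
    private val engine: OcrEngine,
    private val repository: ScanRepository
) {
    /** Runs OCR, trims the output, and stores it only if something was recognized. */
    operator fun invoke(imageBytes: ByteArray): ScanResult? {
        val text = engine.recognize(imageBytes).trim()
        return if (text.isEmpty()) null else repository.save(text)
    }
}

// Test doubles standing in for the real data layer.
class FakeOcrEngine(private val output: String) : OcrEngine {
    override fun recognize(imageBytes: ByteArray): String = output
}

class InMemoryScanRepository : ScanRepository {
    private val items = mutableListOf<ScanResult>()

    override fun save(text: String): ScanResult =
        ScanResult(items.size + 1L, text).also { items.add(it) }

    override fun all(): List<ScanResult> = items.toList()
}
```

Because the use case depends only on interfaces, a ViewModel can receive it through constructor injection, and unit tests can run on the JVM with the fakes shown here.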

Camera Integration

CameraX Library[18][19][20]

  • Modern Jetpack camera API with backward compatibility to Android 5.0[21]
  • Dependencies:
implementation("androidx.camera:camera-core:1.6.0-alpha01")
implementation("androidx.camera:camera-camera2:1.6.0-alpha01")
implementation("androidx.camera:camera-lifecycle:1.6.0-alpha01")
implementation("androidx.camera:camera-view:1.6.0-alpha01")

Use cases to implement:

  • Preview for viewfinder[19][22]
  • ImageCapture for taking photos[18][19]
  • ImageAnalysis for real-time OCR (optional)[23]
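A minimal sketch of binding the Preview and ImageCapture use cases to a lifecycle is shown below. It assumes the CAMERA permission has already been granted and that a PreviewView is available (for example through AndroidView in Compose). The function name bindCamera is hypothetical, and ContextCompat comes from androidx.core, which is not in the dependency list above.

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageCapture
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

/** Binds a viewfinder and a still-capture use case; returns the ImageCapture for later takePicture() calls. */
fun bindCamera(
    context: Context,
    lifecycleOwner: LifecycleOwner,
    previewView: PreviewView
): ImageCapture {
    // Favor image quality over latency, since the photo feeds OCR.
    val imageCapture = ImageCapture.Builder()
        .setCaptureMode(ImageCapture.CAPTURE_MODE_MAXIMIZE_QUALITY)
        .build()

    val providerFuture = ProcessCameraProvider.getInstance(context)
    providerFuture.addListener({
        val cameraProvider = providerFuture.get()
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }
        cameraProvider.unbindAll() // drop any earlier bindings before rebinding
        cameraProvider.bindToLifecycle(
            lifecycleOwner,
            CameraSelector.DEFAULT_BACK_CAMERA,
            preview,
            imageCapture
        )
    }, ContextCompat.getMainExecutor(context))

    return imageCapture
}
```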

Local Data Storage

Room Database[24][25][26]

  • Jetpack's recommended offline storage solution[27][25]
  • Abstraction over SQLite with compile-time SQL verification[25]
  • Dependencies:
implementation("androidx.room:room-runtime:2.6.1")
ksp("androidx.room:room-compiler:2.6.1")
implementation("androidx.room:room-ktx:2.6.1")

Store:

  • Scanned document metadata
  • Extracted text for offline access
  • User preferences and settings
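One way the stored scan data could be modeled with Room is sketched below. The table name, column names, and class names are assumptions made for illustration; the repository's actual schema is not shown in this README.

```kotlin
import androidx.room.Dao
import androidx.room.Database
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.RoomDatabase
import kotlinx.coroutines.flow.Flow

// Hypothetical entity: one row per scanned document.
@Entity(tableName = "scans")
data class ScanEntity(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val text: String,        // extracted text, available offline
    val imagePath: String?,  // optional path to the source image
    val createdAt: Long      // epoch milliseconds
)

@Dao
interface ScanDao {
    @Insert
    suspend fun insert(scan: ScanEntity): Long

    // Room verifies this SQL at compile time and re-emits whenever the table changes.
    @Query("SELECT * FROM scans ORDER BY createdAt DESC")
    fun observeAll(): Flow<List<ScanEntity>>

    @Query("DELETE FROM scans WHERE id = :id")
    suspend fun deleteById(id: Long)
}

@Database(entities = [ScanEntity::class], version = 1, exportSchema = false)
abstract class AppDatabase : RoomDatabase() {
    abstract fun scanDao(): ScanDao
}
```

The suspend functions and the Flow return type rely on the room-ktx artifact listed above.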

Dependency Injection

Dagger Hilt[28][29][30]

  • Jetpack's recommended DI library[29]
  • Reduces boilerplate compared to manual DI[28]
  • Built on Dagger for compile-time correctness[28]

Dependencies:

// Project build.gradle
id("com.google.dagger.hilt.android") version "2.57.1" apply false

// App build.gradle
implementation("com.google.dagger:hilt-android:2.57.1")
ksp("com.google.dagger:hilt-android-compiler:2.57.1")

Annotate:

  • Application class with @HiltAndroidApp[64][28]
  • Activities with @AndroidEntryPoint and ViewModels with @HiltViewModel[58]
  • Modules with @Module and @InstallIn[58]
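The annotations can be sketched together as follows. In Hilt, activities take @AndroidEntryPoint while ViewModels take @HiltViewModel with an @Inject constructor. The class names (OcrApplication, TextCleaner, AppModule, ResultsViewModel, MainActivity) are hypothetical stand-ins.

```kotlin
import android.app.Application
import androidx.activity.ComponentActivity
import androidx.lifecycle.ViewModel
import dagger.Module
import dagger.Provides
import dagger.hilt.InstallIn
import dagger.hilt.android.AndroidEntryPoint
import dagger.hilt.android.HiltAndroidApp
import dagger.hilt.android.lifecycle.HiltViewModel
import dagger.hilt.components.SingletonComponent
import javax.inject.Inject
import javax.inject.Singleton

// Triggers Hilt code generation and creates the application-level component.
@HiltAndroidApp
class OcrApplication : Application()

// Hypothetical dependency used only to show injection.
class TextCleaner {
    fun clean(raw: String): String = raw.trim()
}

// Tells Hilt how to build TextCleaner and scopes it to the whole app.
@Module
@InstallIn(SingletonComponent::class)
object AppModule {
    @Provides
    @Singleton
    fun provideTextCleaner(): TextCleaner = TextCleaner()
}

// ViewModels receive dependencies through an @Inject constructor.
@HiltViewModel
class ResultsViewModel @Inject constructor(
    private val cleaner: TextCleaner
) : ViewModel() {
    fun display(raw: String): String = cleaner.clean(raw)
}

// Activities that host injected ViewModels need @AndroidEntryPoint.
@AndroidEntryPoint
class MainActivity : ComponentActivity()
```

The application class must also be registered in AndroidManifest.xml through the android:name attribute for Hilt to pick it up.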

Asynchronous Processing

Kotlin Coroutines[31][32][33]

  • For background OCR processing without blocking UI
  • Integration with Room and ViewModel
  • Use appropriate dispatchers:
    • Dispatchers.IO for image processing and OCR[32]
    • Dispatchers.Main for UI updates
    • Dispatchers.Default for CPU-intensive transformations
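A small sketch of switching dispatchers with withContext is shown below. The two helper functions are hypothetical; they only illustrate pairing blocking file access with Dispatchers.IO and a pure CPU transformation with Dispatchers.Default.

```kotlin
import java.io.File
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

/** Reads the raw bytes of a captured image; blocking file I/O runs on Dispatchers.IO. */
suspend fun readImageBytes(file: File): ByteArray = withContext(Dispatchers.IO) {
    file.readBytes()
}

/** Inverts 8-bit grayscale values; CPU-bound work runs on Dispatchers.Default. */
suspend fun invert(gray: IntArray): IntArray = withContext(Dispatchers.Default) {
    IntArray(gray.size) { 255 - gray[it] }
}
```

Both functions are safe to call from a coroutine started on Dispatchers.Main, for example inside viewModelScope.launch, because withContext moves the work off the main thread and resumes the caller on its original dispatcher.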

Flow for reactive data:

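import kotlinx.coroutines.flow.first
import kotlinx.coroutines.flow.flow

// State, Result, and fetchFromLocal() are app-defined and not shown here.
fun processOcrFlow() = flow<State<Result>> {
    emit(State.loading())                     // signal that work has started
    val localData = fetchFromLocal().first()  // first value from the local source
    emit(State.success(localData))            // publish the result to collectors
}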

Image Processing

Image preprocessing for better OCR accuracy:

  • Grayscale conversion[14]
  • Thresholding (binary conversion)[14]
  • Noise removal[14]
  • Deskewing (rotation correction)[14]

Libraries to consider:

  • OpenCV4Android for advanced preprocessing[34]
  • Built-in Android Bitmap transformations for basic operations
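The first two preprocessing steps, grayscale conversion and thresholding, can be sketched without any library by working on the packed ARGB pixel array that Bitmap.getPixels() fills. The threshold is chosen here with Otsu's method, which is one common option and not necessarily what this app uses. The function names are hypothetical.

```kotlin
/** Converts packed ARGB pixels (as filled by Bitmap.getPixels) to 0..255 luminance. */
fun toGrayscale(argb: IntArray): IntArray = IntArray(argb.size) { i ->
    val p = argb[i]
    val r = (p shr 16) and 0xFF
    val g = (p shr 8) and 0xFF
    val b = p and 0xFF
    (299 * r + 587 * g + 114 * b) / 1000 // integer form of the BT.601 luma weights
}

/** Returns the Otsu threshold: the gray level that maximizes between-class variance. */
fun otsuThreshold(gray: IntArray): Int {
    val histogram = IntArray(256)
    for (v in gray) histogram[v]++
    val total = gray.size
    var sumAll = 0L
    for (level in 0..255) sumAll += level.toLong() * histogram[level]

    var sumBackground = 0L
    var weightBackground = 0
    var bestVariance = -1.0
    var bestThreshold = 0
    for (level in 0..255) {
        weightBackground += histogram[level]
        if (weightBackground == 0) continue
        val weightForeground = total - weightBackground
        if (weightForeground == 0) break
        sumBackground += level.toLong() * histogram[level]
        val meanBackground = sumBackground.toDouble() / weightBackground
        val meanForeground = (sumAll - sumBackground).toDouble() / weightForeground
        val diff = meanBackground - meanForeground
        val variance = weightBackground.toDouble() * weightForeground * diff * diff
        if (variance > bestVariance) {
            bestVariance = variance
            bestThreshold = level
        }
    }
    return bestThreshold
}

/** Maps pixels above the threshold to white (255) and all others to black (0). */
fun binarize(gray: IntArray, threshold: Int): IntArray =
    IntArray(gray.size) { if (gray[it] > threshold) 255 else 0 }
```

A global threshold like this tends to work for evenly lit pages. Photos with shadows usually need an adaptive method, which is where a library such as OpenCV becomes useful.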

Project Structure

app/
├── data/
│   ├── local/
│   │   ├── dao/ (Room DAOs)
│   │   └── entity/ (Room entities)
│   └── repository/ (Implementation)
├── domain/
│   ├── model/ (Domain models)
│   ├── repository/ (Repository interfaces)
│   └── usecase/ (Business logic)
├── presentation/
│   ├── ui/
│   │   ├── camera/ (Camera screen)
│   │   ├── results/ (OCR results screen)
│   │   └── theme/ (Material 3 theme)
│   └── viewmodel/
└── di/ (Hilt modules)

Additional Recommendations

Trained Data Management:

Image Quality Optimization:

  • Implement proper lighting detection
  • Add manual focus with CameraX tap-to-focus[35]
  • Provide image quality feedback to users[23]

Performance Considerations:

  • Process OCR on background threads only[32]
  • Cache processed results in Room[36][33]
  • Implement proper memory management for Bitmap objects
  • Use appropriate image compression before OCR processing
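For the memory-management point, one common approach is to decode large camera images at a reduced size. The helper below computes a power-of-two sample size suitable for BitmapFactory.Options.inSampleSize. The function name and the target dimensions passed to it are assumptions, not values taken from this project.

```kotlin
/**
 * Returns the largest power-of-two sample size that keeps both decoded
 * dimensions at or above the requested size.
 */
fun calculateInSampleSize(width: Int, height: Int, reqWidth: Int, reqHeight: Int): Int {
    var sampleSize = 1
    if (height > reqHeight || width > reqWidth) {
        val halfHeight = height / 2
        val halfWidth = width / 2
        // Keep doubling while the downsampled image would still cover the request.
        while (halfHeight / sampleSize >= reqHeight && halfWidth / sampleSize >= reqWidth) {
            sampleSize *= 2
        }
    }
    return sampleSize
}
```

In practice the image is decoded once with inJustDecodeBounds = true to read outWidth and outHeight, then decoded again with the computed inSampleSize. Downsampling too far makes small text harder to recognize, so the requested size should stay generous for document scans.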

Complete Dependency Setup

// App build.gradle.kts
plugins {
    id("com.android.application")
    id("org.jetbrains.kotlin.android")
    id("com.google.devtools.ksp")
    id("com.google.dagger.hilt.android")
}

dependencies {
    // Compose + Material 3
    implementation("androidx.compose.material3:material3:1.3.2")
    implementation("androidx.compose.ui:ui:1.7.6")
    implementation("androidx.activity:activity-compose:1.9.3")
    
    // Tesseract OCR
    implementation("com.rmtheis:tess-two:9.1.0")
    
    // CameraX
    implementation("androidx.camera:camera-core:1.6.0-alpha01")
    implementation("androidx.camera:camera-camera2:1.6.0-alpha01")
    implementation("androidx.camera:camera-lifecycle:1.6.0-alpha01")
    implementation("androidx.camera:camera-view:1.6.0-alpha01")
    
    // Room
    implementation("androidx.room:room-runtime:2.6.1")
    implementation("androidx.room:room-ktx:2.6.1")
    ksp("androidx.room:room-compiler:2.6.1")
    
    // Hilt
    implementation("com.google.dagger:hilt-android:2.57.1")
    ksp("com.google.dagger:hilt-android-compiler:2.57.1")
    implementation("androidx.hilt:hilt-navigation-compose:1.2.0")
    
    // Coroutines
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.8.1")
    
    // ViewModel
    implementation("androidx.lifecycle:lifecycle-viewmodel-compose:2.8.7")
}

This stack provides a robust, modern, and completely offline Android OCR application with Material Design 3 aesthetics, clean architecture, and excellent performance.[34][23][3][16][9]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
