A modern offline Android application for fast, accurate text extraction from images and documents. Built with Material Design 3 using Jetpack Compose, this app utilizes the Tesseract OCR engine (via tess-two) for robust multi-language OCR. Features include a CameraX-powered scanner, rich Material theming, Room database for local storage, and a clean MVVM architecture with Hilt dependency injection.
Features
- Scan and extract text from photos and documents with Tesseract OCR
- Offline text recognition (no internet needed, works anywhere)
- Material Design 3 UI using Jetpack Compose and dynamic color theming
- Capture images with CameraX and process them for enhanced OCR accuracy
- Save and manage extracted text locally using Room database
- Clean and testable codebase following MVVM and Clean Architecture principles
- Kotlin Coroutines for smooth, efficient background image processing
- Supports multiple languages via downloadable Tesseract trained data
- Easy setup and extensible for custom OCR workflows
For building an offline Material Design 3 OCR app using Tesseract on Android, here's the comprehensive technology stack you should use:
Jetpack Compose with Material 3[1][2][3]
- Use androidx.compose.material3 for Material Design 3 components
- Implement Material You with dynamic color theming[1]
- Dependency: implementation("androidx.compose.material3:material3:1.3.2")[3]
Material 3 Components to use:
- Scaffold for layout structure[4][5]
- TopAppBar for app navigation[6][5]
- Button, Card, TextField variants (Filled, Elevated, FilledTonal)[3]
- Color schemes and typography from MaterialTheme.colorScheme and MaterialTheme.typography[7][8]
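As a rough illustration of how these components fit together, the sketch below wraps a results screen in a theme that uses dynamic color on Android 12 and later and falls back to the default Material 3 schemes elsewhere. The names OcrTheme and OcrResultScreen, the title string, and the layout are assumptions made for this example, not part of the project.

```kotlin
import android.os.Build
import androidx.compose.foundation.isSystemInDarkTheme
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Card
import androidx.compose.material3.ExperimentalMaterial3Api
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Text
import androidx.compose.material3.TopAppBar
import androidx.compose.material3.darkColorScheme
import androidx.compose.material3.dynamicDarkColorScheme
import androidx.compose.material3.dynamicLightColorScheme
import androidx.compose.material3.lightColorScheme
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp

// Hypothetical theme wrapper: dynamic color needs API 31+, so older
// devices fall back to the default light/dark schemes.
@Composable
fun OcrTheme(content: @Composable () -> Unit) {
    val context = LocalContext.current
    val dark = isSystemInDarkTheme()
    val colors = when {
        Build.VERSION.SDK_INT >= Build.VERSION_CODES.S ->
            if (dark) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)
        dark -> darkColorScheme()
        else -> lightColorScheme()
    }
    MaterialTheme(colorScheme = colors, content = content)
}

// Hypothetical results screen built from Scaffold, TopAppBar and Card.
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun OcrResultScreen(extractedText: String) {
    Scaffold(
        topBar = { TopAppBar(title = { Text("Scan result") }) }
    ) { innerPadding ->
        Card(modifier = Modifier.padding(innerPadding).padding(16.dp)) {
            Text(
                text = extractedText,
                style = MaterialTheme.typography.bodyLarge,
                modifier = Modifier.padding(16.dp)
            )
        }
    }
}
```

Calling OcrTheme { OcrResultScreen(text) } from an activity's setContent block would render the screen with the device's wallpaper-derived colors where available.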
Tesseract OCR via tess-two library[9][10][11]
- Add dependency: implementation("com.rmtheis:tess-two:9.1.0")[9]
- Provides a Java API wrapper for native Tesseract 3.05 and Leptonica 1.74.1[9]
- Completely offline OCR processing[12][13]
- Supports 100+ languages with trained data files[14][13]
Implementation approach:
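    import android.graphics.Bitmap
    import com.googlecode.tesseract.android.TessBaseAPI

    // DATA_PATH is the directory that contains the tessdata/ folder.
    private fun extractText(bitmap: Bitmap): String {
        val tessBaseApi = TessBaseAPI()
        try {
            // init() returns false if the "eng" trained data cannot be loaded.
            check(tessBaseApi.init(DATA_PATH, "eng")) { "Tesseract initialization failed" }
            tessBaseApi.setImage(bitmap)
            return tessBaseApi.getUTF8Text().orEmpty()
        } finally {
            // Always release native memory, even when recognition fails.
            tessBaseApi.end()
        }
    }
MVVM with Clean Architecture[15][16][17]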
Layer structure:
- Presentation Layer: Composable UI + ViewModel[16][15]
- Domain Layer: Use cases for OCR processing logic[17][16]
- Data Layer: Repository pattern + Room database + Tesseract wrapper[15][16]
This separation ensures:[16]
- Testable business logic independent of UI
- Decoupled code components
- Easy maintainability and feature additions
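The layering can be sketched in plain Kotlin without any Android or Tesseract imports. Everything named here (ScanResult, OcrEngine, ScanRepository, ExtractTextUseCase, InMemoryScanRepository) is a hypothetical illustration of the split; in a real app the repository would presumably delegate to the Tesseract wrapper and Room instead of an in-memory list.

```kotlin
// Domain layer: a model and interfaces with no framework dependencies.
data class ScanResult(val text: String)

// Hypothetical abstraction over the OCR library.
interface OcrEngine {
    fun recognize(imageBytes: ByteArray): String
}

interface ScanRepository {
    fun scan(imageBytes: ByteArray): ScanResult
    fun history(): List<ScanResult>
}

// Domain layer: one use case per user-visible action.
class ExtractTextUseCase(private val repository: ScanRepository) {
    operator fun invoke(imageBytes: ByteArray): ScanResult {
        require(imageBytes.isNotEmpty()) { "Image must not be empty" }
        return repository.scan(imageBytes)
    }
}

// Data layer: stand-in implementation that keeps results in memory.
class InMemoryScanRepository(private val engine: OcrEngine) : ScanRepository {
    private val saved = mutableListOf<ScanResult>()

    override fun scan(imageBytes: ByteArray): ScanResult {
        val result = ScanResult(engine.recognize(imageBytes).trim())
        saved += result
        return result
    }

    override fun history(): List<ScanResult> = saved.toList()
}
```

Because the use case depends only on interfaces, a unit test can pass in a fake OcrEngine that returns a fixed string and verify the business logic without a device or native libraries.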
CameraX Library[18][19][20]
- Modern Jetpack camera API with backward compatibility to Android 5.0[21]
- Dependencies:
    implementation("androidx.camera:camera-core:1.6.0-alpha01")
    implementation("androidx.camera:camera-camera2:1.6.0-alpha01")
    implementation("androidx.camera:camera-lifecycle:1.6.0-alpha01")
    implementation("androidx.camera:camera-view:1.6.0-alpha01")
Use cases to implement:
- Preview for viewfinder[19][22]
- ImageCapture for taking photos[18][19]
- ImageAnalysis for real-time OCR (optional)[23]
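A minimal sketch of binding the Preview and ImageCapture use cases follows. The helper name startCamera and its onReady callback are assumptions for this example, and it presumes the CAMERA permission has already been granted and that a PreviewView is available (for instance through AndroidView in Compose).

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageCapture
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

// Hypothetical helper: binds Preview and ImageCapture to the lifecycle and
// hands the ImageCapture use case back once the camera is ready.
fun startCamera(
    context: Context,
    lifecycleOwner: LifecycleOwner,
    previewView: PreviewView,
    onReady: (ImageCapture) -> Unit
) {
    val providerFuture = ProcessCameraProvider.getInstance(context)
    providerFuture.addListener({
        val provider = providerFuture.get()

        // Viewfinder output is rendered into the PreviewView.
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }

        // Favor image quality over latency, since the photo feeds OCR.
        val imageCapture = ImageCapture.Builder()
            .setCaptureMode(ImageCapture.CAPTURE_MODE_MAXIMIZE_QUALITY)
            .build()

        // Drop any earlier bindings before attaching the new use cases.
        provider.unbindAll()
        provider.bindToLifecycle(
            lifecycleOwner,
            CameraSelector.DEFAULT_BACK_CAMERA,
            preview,
            imageCapture
        )
        onReady(imageCapture)
    }, ContextCompat.getMainExecutor(context))
}
```

Binding to the lifecycle owner lets CameraX open and close the camera as the screen starts and stops, so no manual release call is needed here.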
Room Database[24][25][26]
- Jetpack's recommended offline storage solution[27][25]
- Abstraction over SQLite with compile-time SQL verification[25]
- Dependencies:
    implementation("androidx.room:room-runtime:2.6.1")
    ksp("androidx.room:room-compiler:2.6.1")
    implementation("androidx.room:room-ktx:2.6.1")
Store:
- Scanned document metadata
- Extracted text for offline access
- User preferences and settings
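One possible shape for the stored data is sketched below, covering document metadata and extracted text. The table name, column names, and class names (ScanEntity, ScanDao, OcrDatabase) are illustrative assumptions rather than the project's actual schema; preferences are left out of this sketch.

```kotlin
import androidx.room.Dao
import androidx.room.Database
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.RoomDatabase
import kotlinx.coroutines.flow.Flow

// Hypothetical entity: one row per scanned document.
@Entity(tableName = "scans")
data class ScanEntity(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val imagePath: String,
    val extractedText: String,
    val language: String,
    val createdAtMillis: Long
)

@Dao
interface ScanDao {
    // Suspends instead of blocking; returns the new row id.
    @Insert
    suspend fun insert(scan: ScanEntity): Long

    // Emits a fresh list whenever the table changes.
    @Query("SELECT * FROM scans ORDER BY createdAtMillis DESC")
    fun observeAll(): Flow<List<ScanEntity>>
}

@Database(entities = [ScanEntity::class], version = 1)
abstract class OcrDatabase : RoomDatabase() {
    abstract fun scanDao(): ScanDao
}
```

The suspend and Flow return types rely on the room-ktx artifact, and the SQL in @Query is checked at compile time by the Room compiler.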
Dagger Hilt[28][29][30]
- Jetpack's recommended DI library[29]
- Reduces boilerplate compared to manual DI[28]
- Built on Dagger for compile-time correctness[28]
Dependencies:
    // Project build.gradle
    id("com.google.dagger.hilt.android") version "2.57.1" apply false
    // App build.gradle
    implementation("com.google.dagger:hilt-android:2.57.1")
    ksp("com.google.dagger:hilt-android-compiler:2.57.1")
Annotate:
- Application class with @HiltAndroidApp[64][28]
- Activities with @AndroidEntryPoint and ViewModels with @HiltViewModel[58]
- Modules with @Module and @InstallIn[58]
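A small sketch of how these annotations are typically combined is shown below. Note that Hilt's annotation for ViewModels is @HiltViewModel, while @AndroidEntryPoint goes on activities and fragments. OcrApplication, TextCleaner, OcrModule, and ResultsViewModel are hypothetical names introduced only for this example.

```kotlin
import android.app.Application
import androidx.lifecycle.ViewModel
import dagger.Module
import dagger.Provides
import dagger.hilt.InstallIn
import dagger.hilt.android.HiltAndroidApp
import dagger.hilt.android.lifecycle.HiltViewModel
import dagger.hilt.components.SingletonComponent
import javax.inject.Inject
import javax.inject.Singleton

// Triggers Hilt code generation; must be registered in the manifest.
@HiltAndroidApp
class OcrApplication : Application()

// Hypothetical dependency used only to illustrate injection.
class TextCleaner {
    fun clean(raw: String): String = raw.trim()
}

// Provides app-wide singletons.
@Module
@InstallIn(SingletonComponent::class)
object OcrModule {
    @Provides
    @Singleton
    fun provideTextCleaner(): TextCleaner = TextCleaner()
}

// ViewModels receive their dependencies through constructor injection.
@HiltViewModel
class ResultsViewModel @Inject constructor(
    private val cleaner: TextCleaner
) : ViewModel() {
    fun display(raw: String): String = cleaner.clean(raw)
}
```

In a Compose screen hosted by an @AndroidEntryPoint activity, the ViewModel could then be obtained with hiltViewModel() from the hilt-navigation-compose artifact.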
Kotlin Coroutines[31][32][33]
- For background OCR processing without blocking UI
- Integration with Room and ViewModel
- Use appropriate dispatchers:
  - Dispatchers.IO for image processing and OCR[32]
  - Dispatchers.Main for UI updates
  - Dispatchers.Default for CPU-intensive transformations
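The dispatcher guidance can be sketched as a thin wrapper that moves a blocking OCR call off the caller's thread. OcrRunner and its recognize parameter are hypothetical: recognize stands in for whatever blocking function wraps Tesseract, and the dispatcher is injectable so tests can substitute their own.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Hypothetical wrapper around a blocking OCR function.
class OcrRunner(
    private val dispatcher: CoroutineDispatcher = Dispatchers.IO,
    private val recognize: (ByteArray) -> String
) {
    // Safe to call from the main thread: the blocking work runs on
    // `dispatcher`, and the caller resumes with the result afterwards.
    suspend fun run(image: ByteArray): String = withContext(dispatcher) {
        recognize(image)
    }
}
```

A ViewModel would typically call run() inside viewModelScope.launch, which starts on the main dispatcher, so the returned text can be written straight into UI state.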
Flow for reactive data:
    // State, Result and fetchFromLocal() are app-defined placeholders.
    fun processOcrFlow() = flow<State<Result>> {
        emit(State.loading())
        val localData = fetchFromLocal().first()
        emit(State.success(localData))
    }
Image preprocessing for better OCR accuracy:
- Grayscale conversion[14]
- Thresholding (binary conversion)[14]
- Noise removal[14]
- Deskewing (rotation correction)[14]
Libraries to consider:
- OpenCV4Android for advanced preprocessing[34]
- Built-in Android Bitmap transformations for basic operations
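The first two preprocessing steps can be sketched on a plain array of packed ARGB pixels, the format Bitmap.getPixels() writes into an IntArray. The function names and the fixed default threshold of 128 are assumptions for illustration; adaptive methods usually cope better with uneven lighting.

```kotlin
// Converts packed ARGB pixels to 0..255 luminance values using the
// common 0.299 / 0.587 / 0.114 channel weights.
fun toGrayscale(argbPixels: IntArray): IntArray = IntArray(argbPixels.size) { i ->
    val p = argbPixels[i]
    val r = (p shr 16) and 0xFF
    val g = (p shr 8) and 0xFF
    val b = p and 0xFF
    (299 * r + 587 * g + 114 * b) / 1000
}

// Global threshold: values at or above `threshold` become white (255),
// everything else becomes black (0).
fun binarize(gray: IntArray, threshold: Int = 128): IntArray =
    IntArray(gray.size) { i -> if (gray[i] >= threshold) 255 else 0 }
```

Noise removal and deskewing need neighborhood and geometry operations, which is where a library such as OpenCV becomes more practical than hand-written loops.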
app/
├── data/
│ ├── local/
│ │ ├── dao/ (Room DAOs)
│ │ └── entity/ (Room entities)
│ └── repository/ (Implementation)
├── domain/
│ ├── model/ (Domain models)
│ ├── repository/ (Repository interfaces)
│ └── usecase/ (Business logic)
├── presentation/
│ ├── ui/
│ │ ├── camera/ (Camera screen)
│ │ ├── results/ (OCR results screen)
│ │ └── theme/ (Material 3 theme)
│ └── viewmodel/
└── di/ (Hilt modules)
Trained Data Management:
- Download Tesseract trained data files (tessdata) for required languages[13][9]
- Store in app's private directory: {DATA_PATH}/tessdata/[24][7]
- Languages available at: https://github.com/tesseract-ocr/tessdata[13]
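Before initializing Tesseract it helps to confirm that the trained data is where the engine will look for it. The sketch below assumes the usual `<language>.traineddata` file naming inside the tessdata folder; the helper names are hypothetical, and dataPath would typically be a directory such as the app's filesDir.

```kotlin
import java.io.File

// Returns the file Tesseract expects for `language` under `dataPath`.
fun trainedDataFile(dataPath: File, language: String): File =
    File(File(dataPath, "tessdata"), "$language.traineddata")

// Languages from `wanted` whose trained data file is not present yet.
fun missingLanguages(dataPath: File, wanted: List<String>): List<String> =
    wanted.filterNot { trainedDataFile(dataPath, it).isFile }
```

Any language reported as missing can be copied from bundled assets or downloaded once, after which OCR for that language works offline.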
Image Quality Optimization:
- Implement proper lighting detection
- Add manual focus with CameraX tap-to-focus[35]
- Provide image quality feedback to users[23]
Performance Considerations:
- Process OCR on background threads only[32]
- Cache processed results in Room[36][33]
- Implement proper memory management for Bitmap objects
- Downscale oversized images before OCR processing to limit memory use and processing time
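For the memory-related points, one common approach is to decode camera images at a reduced size. The sketch below computes a power-of-two sample size in the style of BitmapFactory.Options.inSampleSize; the function name and the target dimensions a caller passes in are assumptions, and the right target size depends on how small the text in the image is.

```kotlin
// Largest power-of-two sample size that keeps both decoded dimensions
// at or above the requested width and height.
fun calculateInSampleSize(width: Int, height: Int, reqWidth: Int, reqHeight: Int): Int {
    require(width > 0 && height > 0 && reqWidth > 0 && reqHeight > 0) {
        "All dimensions must be positive"
    }
    var sampleSize = 1
    while (width / (sampleSize * 2) >= reqWidth && height / (sampleSize * 2) >= reqHeight) {
        sampleSize *= 2
    }
    return sampleSize
}
```

The result would be assigned to BitmapFactory.Options.inSampleSize before calling BitmapFactory.decodeFile(), so the full-resolution bitmap is never allocated.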
// App build.gradle.kts
    plugins {
        id("com.android.application")
        id("org.jetbrains.kotlin.android")
        id("com.google.devtools.ksp")
        id("com.google.dagger.hilt.android")
    }

    dependencies {
        // Compose + Material 3
        implementation("androidx.compose.material3:material3:1.3.2")
        implementation("androidx.compose.ui:ui:1.7.6")
        implementation("androidx.activity:activity-compose:1.9.3")

        // Tesseract OCR
        implementation("com.rmtheis:tess-two:9.1.0")

        // CameraX
        implementation("androidx.camera:camera-core:1.6.0-alpha01")
        implementation("androidx.camera:camera-camera2:1.6.0-alpha01")
        implementation("androidx.camera:camera-lifecycle:1.6.0-alpha01")
        implementation("androidx.camera:camera-view:1.6.0-alpha01")

        // Room
        implementation("androidx.room:room-runtime:2.6.1")
        implementation("androidx.room:room-ktx:2.6.1")
        ksp("androidx.room:room-compiler:2.6.1")

        // Hilt
        implementation("com.google.dagger:hilt-android:2.57.1")
        ksp("com.google.dagger:hilt-android-compiler:2.57.1")
        implementation("androidx.hilt:hilt-navigation-compose:1.2.0")

        // Coroutines
        implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.8.1")

        // ViewModel
        implementation("androidx.lifecycle:lifecycle-viewmodel-compose:2.8.7")
    }
This stack provides a robust, modern, and completely offline Android OCR application with Material Design 3 aesthetics, clean architecture, and excellent performance.[34][23][3][16][9]