Sameen-03/PDF-translator

PDF Translator with Local LLM (Ollama + Marker + MarkdownPDF)

This project provides a complete pipeline for translating PDF documents into other languages using a local Large Language Model (LLM). It extracts structured text and images from a PDF with marker, translates the text into the target language with a model served by Ollama, and regenerates a translated PDF with markdown-pdf, preserving layout and formatting.

Demo video: translator.mov

Features

  • Extracts text and images from PDFs using marker
  • Translates text to any target language using a local LLM via ollama
  • Reconstructs translated content into a well-formatted PDF using markdown-pdf
  • Saves extracted and translated text as .md files for readability and further use
  • Lightweight, offline-friendly, and privacy-preserving
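The features above describe a three-stage flow: extract, translate, rebuild. The sketch below shows one way such a flow could be orchestrated. It is not the repository's actual code: `run_pipeline`, the stage callables, and the output file names are all hypothetical, and the real stages (marker, Ollama, markdown-pdf) would be plugged in as the `extract`, `translate`, and `render` arguments.

```python
from pathlib import Path
from typing import Callable


def run_pipeline(
    pdf_path: Path,
    out_dir: Path,
    extract: Callable[[Path], str],        # PDF -> Markdown (e.g. a marker wrapper)
    translate: Callable[[str], str],       # Markdown -> translated Markdown
    render: Callable[[str, Path], None],   # Markdown -> PDF written to the given path
) -> Path:
    """Run extract -> translate -> render, saving both Markdown files on the way."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = pdf_path.stem

    # Stage 1: extract Markdown and keep a copy for inspection.
    extracted_md = extract(pdf_path)
    (out_dir / f"{stem}.extracted.md").write_text(extracted_md, encoding="utf-8")

    # Stage 2: translate and keep the translated Markdown as well.
    translated_md = translate(extracted_md)
    (out_dir / f"{stem}.translated.md").write_text(translated_md, encoding="utf-8")

    # Stage 3: rebuild a PDF from the translated Markdown.
    translated_pdf = out_dir / f"{stem}.translated.pdf"
    render(translated_md, translated_pdf)
    return translated_pdf
```

Passing the stages in as callables keeps the orchestration independent of any one library, so each stage can be swapped or stubbed out for testing without touching the rest.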

Tech Stack

  • marker: PDF layout-aware text & image extraction
  • ollama: Run LLMs like gemma, mistral, etc. locally
  • markdown-pdf: Convert translated Markdown to a clean, printable PDF
  • Python: File handling, automation, and orchestration
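As an illustration of the translation step, here is a minimal sketch that sends Markdown to a locally running Ollama server through its HTTP `/api/generate` endpoint on the default port 11434. Several things are assumptions rather than details taken from this project: the prompt wording, the helper names (`build_payload`, `ollama_translate`, `translate_markdown`), the paragraph-by-paragraph splitting, and the choice of `gemma` as the default model, which must already be pulled locally.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumed to be running).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(text: str, target_lang: str, model: str = "gemma") -> dict:
    """Build a non-streaming generate request; the prompt wording is illustrative."""
    prompt = (
        f"Translate the following Markdown into {target_lang}. "
        "Keep headings, lists, links, and image references unchanged. "
        "Return only the translation.\n\n" + text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ollama_translate(text: str, target_lang: str, model: str = "gemma") -> str:
    """Send one chunk to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(text, target_lang, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def translate_markdown(md: str, translate_chunk) -> str:
    """Translate block by block; image-only blocks pass through untouched."""
    out = []
    for block in md.split("\n\n"):
        stripped = block.strip()
        if not stripped or stripped.startswith("!["):
            out.append(block)  # keep blank blocks and image references as-is
        else:
            out.append(translate_chunk(block))
    return "\n\n".join(out)
```

A call such as `translate_markdown(md, lambda b: ollama_translate(b, "French"))` would translate each block in turn. Working in small blocks keeps each prompt short and leaves extracted image references intact for the PDF rebuild.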
