Extract highlighted text from book images using Python, OpenCV, and Tesseract OCR.
Note
I have ported this project to JavaScript so it runs in the browser without any installation. Check out highlights.
See my original blog post Extract highlighted text from a book using Python for more details.
Install dependencies (pytesseract also needs the Tesseract OCR engine installed on your system):
pip install pytesseract opencv-python numpy
Run with an image:
python main.py path/to/your/book-image.jpg