OCR reading with Python

Tesseract is an open-source OCR (optical character recognition) engine and library. Originally developed at Hewlett-Packard, it was later open-sourced and developed further under Google's sponsorship as part of their digitising effort.

A Windows build (executable installer) is available here. It includes multiple languages and scripts, and tops out at around 870MB.
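Once Tesseract is installed, it can be handy to confirm which languages actually made it onto the machine. The following is only a rough sketch using the standard library: it assumes the `tesseract` executable is on your PATH and relies on the command-line flag `--list-langs`. The helper names `parse_lang_list` and `installed_languages` are my own, not part of any library.

```python
import shutil
import subprocess


def parse_lang_list(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    codes = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header line, which ends with a colon
        if not line or line.endswith(":"):
            continue
        codes.append(line)
    return codes


def installed_languages(tesseract_cmd="tesseract"):
    """Return the installed language codes, or [] if the executable is not found."""
    exe = shutil.which(tesseract_cmd)
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Older Tesseract releases print the list to stderr rather than stdout
    return parse_lang_list(result.stdout or result.stderr)
```

Calling `installed_languages()` returns a plain list of codes such as `eng`, which you can check before asking for a language in your OCR call.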

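from PIL import Image
import pytesseract

# If tesseract.exe is not on your PATH, point pytesseract at it, for example
# (adjust to your own install location):
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Open the image and print the text Tesseract recognises in it
img = Image.open("example.png")
print(pytesseract.image_to_string(img))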
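A slightly fuller variant of the same call might look like the sketch below. It assumes Pillow and pytesseract are installed and that the English model is available. `lang` and `config` are keyword arguments of `pytesseract.image_to_string`, and `--psm 6` asks Tesseract to treat the image as a single uniform block of text. The helper names `tidy` and `ocr_image` are hypothetical, just for illustration.

```python
def tidy(raw):
    """Strip trailing whitespace from each line and drop empty lines and form feeds."""
    lines = (ln.rstrip() for ln in raw.replace("\f", "").splitlines())
    return "\n".join(ln for ln in lines if ln)


def ocr_image(path, lang="eng", psm=6, tesseract_cmd=None):
    """Run Tesseract on one image file and return the tidied text."""
    # Imported here so tidy() can be used without the OCR packages installed
    from PIL import Image
    import pytesseract

    if tesseract_cmd:
        # Only needed when the Tesseract executable is not on the PATH
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    with Image.open(path) as img:
        raw = pytesseract.image_to_string(img, lang=lang, config=f"--psm {psm}")
    return tidy(raw)


# Example usage (needs a real image file and a working Tesseract install):
# print(ocr_image("example.png"))
```

Raw OCR output may carry stray blank lines and trailing whitespace, so a small clean-up step like `tidy` tends to make the result easier to work with. Whether page segmentation mode 6 suits your images is something to experiment with.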

Visual Studio integration

There are several Tesseract wrappers for Visual Studio projects. They all require the trained language models (the .traineddata files) to be available to the project.
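A missing model file is an easy thing to trip over, whichever wrapper you use. As a rough sketch, and staying in Python for consistency with the example above, the check below looks for `<language>.traineddata` files in a model folder. The function name `missing_models` and the folder name `tessdata` in the usage comment are assumptions for illustration; use whatever folder your project actually ships.

```python
from pathlib import Path


def missing_models(tessdata_dir, langs):
    """Return the language codes that have no .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [
        lang for lang in langs
        if not (folder / f"{lang}.traineddata").is_file()
    ]


# Example usage (hypothetical folder name):
# print(missing_models("tessdata", ["eng", "deu"]))
```

An empty list means every requested language has a model file in place.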

Tags: ai, google, machine-learning, ocr, optical character reading, tesseract

Musings

The blog of Neil Highley, C# developer, Automation Engineer, IOT Tinkerer, Robot fan
