OCR reading with Python

6th September 2024 - AI, c#, python, Uncategorised

Tesseract is an open-source OCR (optical character recognition) application and library. It was originally created at Hewlett-Packard, and Google later sponsored its development as part of their digitising effort.

A Windows build (installer executable) is available here. It includes trained data for multiple languages and scripts, and tops out at around 870MB.

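from PIL import Image
import pytesseract

# If tesseract.exe is not on your PATH, point pytesseract at it, for example:
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Open the image and print the text Tesseract recognises in it
img = Image.open("example.png")
print(pytesseract.image_to_string(img))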
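
Since the installer ships data for several languages, it can be useful to tell Tesseract which ones to use. The sketch below is one possible way to wrap the call above, not the only one. The helper names build_lang_arg and ocr_image are made up for this example; the lang parameter and the "eng+fra" style of combining language codes come from pytesseract itself. It assumes the matching language data is installed.

```python
def build_lang_arg(languages):
    """Join Tesseract language codes into one string, e.g. ["eng", "fra"] -> "eng+fra"."""
    codes = [code.strip() for code in languages if code.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_image(path, languages=("eng",), tesseract_cmd=None):
    """Return the text recognised in the image at `path` (hypothetical helper)."""
    # Imported lazily so build_lang_arg can be used without the OCR stack installed.
    from PIL import Image
    import pytesseract

    if tesseract_cmd is not None:
        # Only needed when the tesseract executable is not on PATH.
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=build_lang_arg(languages))
```

For example, ocr_image("example.png", languages=["eng", "fra"]) would ask Tesseract to try both English and French, provided both language files were selected during installation.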

Visual Studio integration

There are several Tesseract wrappers that can be used from Visual Studio projects. They all require the trained models (the .traineddata files) to be available to the project.
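
Before pointing a wrapper at a model folder, it may help to check which models are actually in it. The following is a rough sketch in Python, kept in the same language as the example earlier in the post. The function name list_installed_models is hypothetical, and it assumes the models sit directly in one folder as <language>.traineddata files; it does not look into subfolders.

```python
from pathlib import Path


def list_installed_models(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        # A missing folder simply means no models are available.
        return []
    # "eng.traineddata" -> "eng"; only the top level of the folder is scanned.
    return sorted(p.stem for p in folder.glob("*.traineddata"))
```

An empty list back from this function is a quick hint that the folder path is wrong or that the models have not been copied alongside the project yet.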