Definition
Optical Character Recognition (OCR) is the process of converting printed or handwritten text in images into machine-readable digital text.
Purpose
OCR digitizes documents so that their text can be searched, edited, and analyzed. It supports applications such as archive digitization, accessibility, and automated data entry.
Importance
- Converts paper documents into searchable text.
- Reduces manual data entry in industries such as banking and healthcare.
- Limitation: accuracy drops on poor-quality scans and unusual fonts, so output may need manual review.
- Forms the basis for text mining in scanned archives.
How It Works
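- Scan or capture an image of the text.
- Preprocess the image to remove noise (for example, by binarizing it and clearing stray specks).
- Detect and segment text lines, words, or characters.
- Recognize the text using machine learning (ML) models.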
- Output editable digital text.
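The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not how any production OCR engine works: the "scan" is a synthetic grayscale image stored as nested lists, every glyph is a fixed 3x5 bitmap, and nearest-template matching stands in for the ML recognition model. All names (TEMPLATES, binarize, remove_specks, segment_columns, recognize, ocr, render) are hypothetical and chosen only for this example.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale rows: 0 = black ink, 255 = white paper

# Hypothetical 3x5 glyph templates (1 = ink); a stand-in for a trained model.
TEMPLATES: Dict[str, List[str]] = {
    "H": ["101", "101", "111", "101", "101"],
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
}


def binarize(image: Image, threshold: int = 128) -> Image:
    """Preprocess: map grayscale pixels to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def remove_specks(bitmap: Image) -> Image:
    """Preprocess: clear ink pixels that have no ink among their 8 neighbours."""
    h, w = len(bitmap), len(bitmap[0])
    out = [row[:] for row in bitmap]
    for y in range(h):
        for x in range(w):
            if not bitmap[y][x]:
                continue
            window = sum(
                bitmap[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
            if window - 1 == 0:  # subtract the pixel itself
                out[y][x] = 0
    return out


def segment_columns(bitmap: Image) -> List[Image]:
    """Segment: split the line into glyphs wherever a column has no ink."""
    width = len(bitmap[0])
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]
    glyphs, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last glyph
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            glyphs.append([row[start:x] for row in bitmap])
            start = None
    return glyphs


def recognize(glyph: Image) -> str:
    """Recognize: return the template label with the fewest mismatched pixels."""
    def distance(template: List[str]) -> float:
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            return float("inf")  # wrong size, cannot compare
        return sum(
            int(t) != g
            for t_row, g_row in zip(template, glyph)
            for t, g in zip(t_row, g_row)
        )

    best = min(TEMPLATES, key=lambda label: distance(TEMPLATES[label]))
    return best if distance(TEMPLATES[best]) != float("inf") else "?"


def ocr(image: Image) -> str:
    """Run the whole pipeline and output plain text."""
    bitmap = remove_specks(binarize(image))
    return "".join(recognize(g) for g in segment_columns(bitmap))


def render(text: str, gap: int = 3) -> Image:
    """Build a synthetic grayscale 'scan' from the templates (demo helper)."""
    rows: Image = [[] for _ in range(5)]
    for ch in text:
        for y in range(5):
            rows[y] += [0 if bit == "1" else 255 for bit in TEMPLATES[ch][y]]
            rows[y] += [255] * gap
    return rows


scan = render("HILL")
scan[2][4] = 40  # simulate a dark noise speck between the first two glyphs
print(ocr(scan))  # prints HILL
```

Without the speck-removal step, the stray pixel would be segmented as its own one-column glyph and reported as "?", which is a small-scale version of why noisy scans hurt accuracy. Real engines replace the fixed templates with trained models and handle skew, variable glyph sizes, and touching characters, none of which this sketch attempts.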
Examples (Real World)
- Google Cloud Vision OCR: text recognition service.
- ABBYY FineReader: commercial OCR software.
- Project Gutenberg digitization: scanned books are run through OCR and then proofread by volunteers.
References / Further Reading
- Smith, R. “An Overview of the Tesseract OCR Engine.” ICDAR, 2007.
- ISO/IEC 15938-4: Multimedia Content Description Interface.
- IEEE Transactions on Pattern Analysis and Machine Intelligence.