Optical Character Recognition

Abstract

Optical Character Recognition is a small Python project that recognizes text in images. It uses pytesseract, a wrapper around the Tesseract OCR engine, together with OpenCV, and it covers basic image preprocessing, text extraction, and a simple command-line entry point.

Prerequisites

  • Python 3.8 or above
  • A code editor or IDE
  • Basic understanding of OCR and computer vision
  • Required libraries: pytesseract, opencv-python, numpy
  • The Tesseract OCR engine installed on your system (pytesseract is a wrapper that calls it)

Before you Start

Install Python and the required libraries:

Install dependencies
pip install pytesseract opencv-python numpy
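
Because pytesseract only wraps the Tesseract engine, a successful pip install does not by itself guarantee a working setup. The following is a minimal sketch of a sanity check. The helper name check_ocr_environment is hypothetical, and the check assumes the Tesseract executable is called tesseract and is reachable on PATH.

```python
import importlib.util
import shutil

def check_ocr_environment():
    """Report which pieces of the OCR setup can be found (hypothetical helper)."""
    modules = ["pytesseract", "cv2", "numpy"]
    # find_spec returns None when a top-level module is not installed
    report = {name: importlib.util.find_spec(name) is not None for name in modules}
    # Assumption: the Tesseract executable is named "tesseract" and is on PATH
    report["tesseract binary"] = shutil.which("tesseract") is not None
    return report

for name, found in check_ocr_environment().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If the binary is installed somewhere that is not on PATH, pytesseract lets you point to it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.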

Getting Started

Create a Project

  1. Create a folder named optical-character-recognition.
  2. Open the folder in your code editor or IDE.
  3. Create a file named optical_character_recognition.py.
  4. Copy the code below into your file.

Write the Code

Optical Character Recognition
import cv2
import pytesseract
import numpy as np

class OpticalCharacterRecognition:
    def recognize_text(self, image):
        # Run Tesseract on the image (a NumPy array) and return the recognized text
        text = pytesseract.image_to_string(image)
        print(f"Recognized text: {text}")
        return text

    def demo(self):
        # White canvas, wide enough that the string is not clipped at font scale 2
        img = np.full((100, 450, 3), 255, dtype=np.uint8)
        # Dark text on a light background, which Tesseract generally handles best
        cv2.putText(img, 'Python OCR', (5, 70), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 0), 3)
        self.recognize_text(img)
        # Show the image for one second (needs an OpenCV build with GUI support)
        cv2.imshow('OCR Demo', img)
        cv2.waitKey(1000)
        cv2.destroyAllWindows()

if __name__ == "__main__":
    print("Optical Character Recognition Demo")
    ocr = OpticalCharacterRecognition()
    ocr.demo()
 

Example Usage

Run OCR
python optical_character_recognition.py

Explanation

Key Features

  • OCR: Recognizes text in images with pytesseract.
  • Image Processing: Converts images to grayscale before text extraction.
  • Error Handling: Input validation and exception handling belong around the image-loading and OCR calls.
  • CLI Interface: Runs from the command line as a script.
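
To make the image-processing step concrete, here is a minimal NumPy-only sketch of what grayscale conversion and a simple binarization compute. The function names to_grayscale and binarize and the default threshold of 128 are assumptions for illustration. In practice cv2.cvtColor and cv2.threshold do this work, and OpenCV's integer arithmetic may differ from this version by a rounding step.

```python
import numpy as np

def to_grayscale(image_bgr):
    """Weighted sum of the B, G, R channels (the luma weights OpenCV documents for BGR2GRAY)."""
    weights = np.array([0.114, 0.587, 0.299])  # order matches B, G, R
    gray = image_bgr.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def binarize(gray, threshold=128):
    """Pixels brighter than the threshold become white (255), all others black (0)."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Tiny synthetic image: left half black, right half white
img = np.zeros((2, 4, 3), dtype=np.uint8)
img[:, 2:] = 255
print(binarize(to_grayscale(img)))
```

A fixed threshold is the simplest choice and only suits evenly lit images; it is used here to keep the sketch short.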

Code Breakdown

  1. Import Libraries and Set Up OCR
optical_character_recognition.py
import pytesseract
import cv2
import numpy as np
  2. Image Processing and Text Extraction Functions
optical_character_recognition.py
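def preprocess_image(image):
    # Convert the BGR image loaded by OpenCV to single-channel grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return gray

def extract_text(image):
    # Run Tesseract on the image and return the recognized text as a string
    text = pytesseract.image_to_string(image)
    return text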
  3. CLI Interface and Error Handling
optical_character_recognition.py
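import sys

def main():
    print("Optical Character Recognition")
    # Take the image path from the command line, falling back to the sample file name
    path = sys.argv[1] if len(sys.argv) > 1 else 'text_image.jpg'
    image = cv2.imread(path)
    if image is None:
        # cv2.imread returns None (it does not raise) when the file cannot be read
        print(f"Error: could not read image '{path}'")
        return
    processed = preprocess_image(image)
    text = extract_text(processed)
    print(text)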
 
if __name__ == "__main__":
    main()
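
The entry point above can grow into a fuller command-line interface. The following is a minimal sketch using argparse from the standard library. The function name build_parser and the --lang and --output options are assumptions for illustration and are not part of the original script.

```python
import argparse

def build_parser():
    """Command-line options for the OCR script (all option names are illustrative)."""
    parser = argparse.ArgumentParser(description="Extract text from an image with OCR.")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("--lang", default="eng", help="Tesseract language code")
    parser.add_argument("--output", help="optional file to write the recognized text to")
    return parser

# Parse an explicit argument list here; a real script would call parse_args() with no arguments
args = build_parser().parse_args(["scan.png", "--lang", "eng"])
print(f"Would run OCR on {args.image} (lang={args.lang})")
```

The --lang value could be forwarded to pytesseract.image_to_string through its lang parameter, provided the matching Tesseract language data is installed.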

Features

  • OCR: Text recognition and image processing
  • Modular Design: Separate functions for each task
  • Error Handling: Invalid inputs, such as an unreadable image file, and OCR exceptions should be caught before and during text extraction
  • Extensible: A small code base that is easy to adapt and maintain
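
As one way to handle invalid inputs, the sketch below checks an image before it is passed to OCR. The helper name validate_image and the specific checks are assumptions, chosen to cover the common case where cv2.imread returns None for a missing or unreadable file.

```python
import numpy as np

def validate_image(image):
    """Raise a descriptive error if `image` is not usable as OCR input (hypothetical helper)."""
    if image is None:
        # cv2.imread returns None rather than raising when a file cannot be read
        raise ValueError("image is None; check that the file path is correct")
    if not isinstance(image, np.ndarray):
        raise TypeError(f"expected a NumPy array, got {type(image).__name__}")
    if image.size == 0:
        raise ValueError("image is empty")
    if image.ndim not in (2, 3):
        raise ValueError(f"expected a 2-D or 3-D array, got {image.ndim} dimensions")
    return image

try:
    validate_image(None)
except ValueError as exc:
    print(f"Invalid input: {exc}")
```

Failures inside the OCR call itself are a separate concern. For example, pytesseract raises TesseractNotFoundError when the Tesseract executable cannot be located, and that can be caught around image_to_string.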

Next Steps

Enhance the project by:

  • Integrating with real image datasets
  • Supporting advanced OCR algorithms
  • Creating a GUI for OCR
  • Adding real-time recognition
  • Unit testing for reliability
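
For the unit-testing step, one possible approach is to make the OCR call injectable so tests can run without Tesseract installed. The sketch below is an assumption about how that could look. This extract_text variant takes an optional ocr callable and strips surrounding whitespace, which the original function does not do.

```python
import unittest

def extract_text(image, ocr=None):
    """Variant of extract_text with an injectable OCR callable (an assumption for testability)."""
    if ocr is None:
        import pytesseract  # imported lazily so the tests below run without Tesseract
        ocr = pytesseract.image_to_string
    return ocr(image).strip()

class ExtractTextTests(unittest.TestCase):
    def test_strips_surrounding_whitespace(self):
        fake_ocr = lambda image: "  Python OCR\n"
        self.assertEqual(extract_text(object(), ocr=fake_ocr), "Python OCR")

    def test_passes_image_through_unchanged(self):
        seen = []
        def fake_ocr(image):
            seen.append(image)
            return ""
        image = object()
        extract_text(image, ocr=fake_ocr)
        self.assertIs(seen[0], image)
```

Saved in a test module, these cases can be run with python -m unittest. They check only the wrapper logic, not recognition accuracy, which would need real sample images.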

Educational Value

This project teaches:

  • Computer Vision: OCR and image processing
  • Software Design: Modular, maintainable code
  • Error Handling: Writing robust Python code

Real-World Applications

  • Document Digitization
  • Accessibility Tools
  • AI Platforms

Conclusion

Optical Character Recognition demonstrates how to build a simple OCR tool in Python with pytesseract and OpenCV. Because the design is modular, the project can be extended and adapted for real-world applications in digitization, accessibility, and more. For more advanced projects, visit Python Central Hub.
