Deep Learning Project 1

Building an OCR Form Data Extractor: From Zero to Hero

Extract text from handwritten forms automatically using Python, EasyOCR, and OpenCV


📌 Project Overview

Ever wondered how apps like Google Lens extract text from images? In this tutorial, we’ll build a Smart Form Data Extractor that can read printed AND handwritten text from form images, perfect for digitizing student records, surveys, or any paper forms!

What You’ll Build:

  • Single image processing mode
  • Batch processing for multiple forms
  • Multi-format output (Text, JSON, Excel)
  • Bilingual support (English + Tamil)
  • Smart field detection (phone numbers, dates, pincodes)

Time Required: 1.5 – 2 hours
Difficulty: Beginner-Friendly
Cost: 100% Free (using Google Colab)


🎯 What We’re Solving

The Problem: Training institutes, schools, and businesses receive hundreds of paper forms daily. Manual data entry is:

  • โฐ Time-consuming (5-10 minutes per form)
  • ๐Ÿ˜ซ Tedious and boring
  • โŒ Error-prone (typos, missed fields)
  • ๐Ÿ’ฐ Expensive (hiring data entry staff)

Our Solution: An automated OCR system that:

  • ✅ Processes forms in 30-60 seconds
  • ✅ Extracts ALL text automatically
  • ✅ Outputs organized data in multiple formats
  • ✅ Handles both printed and handwritten text

๐Ÿ› ๏ธ Tech Stack

Technology   | Purpose                 | Why This One?
Python       | Programming language    | Industry standard for AI/ML
EasyOCR      | Text recognition engine | Best for handwriting, supports 80+ languages
OpenCV       | Image processing        | Improves image quality for better OCR
Pandas       | Data manipulation       | Easy Excel/CSV export
Google Colab | Development environment | Free, no setup required, includes GPU

No Installation Required! Everything runs in your browser via Google Colab.


📚 Understanding the Fundamentals

Before diving into code, let’s understand key concepts:

What is OCR (Optical Character Recognition)?

Simple Definition: Converting images of text into actual, editable text.

Real-World Analogy: Imagine you take a photo of a book page. Your eyes can READ the text, but your phone only sees it as a picture. OCR is like teaching your computer to “read” text from images just like you do!

How OCR Works (3 Steps):

Step 1: Image Preprocessing
    ↓ (Clean up image: remove noise, increase contrast)
Step 2: Text Detection
    ↓ (Find WHERE text is located in the image)
Step 3: Text Recognition
    ↓ (Identify WHAT each character is)
Result: Editable Text!

Why Handwriting Recognition is Harder

Printed Text vs Handwritten Text:

Aspect       | Printed Text       | Handwritten Text
Consistency  | Always same font   | Everyone writes differently
Clarity      | Sharp, clean lines | Can be messy, unclear
Spacing      | Perfect spacing    | Irregular spacing
OCR Accuracy | 98-99%             | 80-95% (depends on writing quality)

That’s why we use EasyOCR – it’s trained specifically to handle handwriting variations!

Understanding Confidence Scores

When OCR reads text, it gives a confidence score (0.0 to 1.0):

0.9 - 1.0 = Very confident (probably correct) ✅
0.7 - 0.9 = Moderately confident (likely correct) ⚠️
0.5 - 0.7 = Low confidence (might be wrong) ⚠️
0.0 - 0.5 = Very uncertain (probably wrong) ❌

Pro Tip: We’ll filter out results with confidence < 0.3 to avoid garbage data.
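To see how that threshold plays out in code, here is a minimal sketch that filters mock OCR detections the same way our extraction function will. The sample detections are invented for illustration, not real OCR output:

```python
# Keep only detections whose confidence exceeds the 0.3 threshold.
# These (text, confidence) pairs are made-up sample data.
detections = [
    ("RAHEEM", 0.94),      # very confident -> keep
    ("9680387400", 0.98),  # very confident -> keep
    ("~!?", 0.12),         # probably noise -> discard
]

MIN_CONFIDENCE = 0.3
kept = [text for text, conf in detections if conf > MIN_CONFIDENCE]
print(kept)  # ['RAHEEM', '9680387400']
```

Raising MIN_CONFIDENCE trades recall for precision: fewer false reads, but more genuinely-written fields dropped.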


🚀 Project Setup

Step 1: Open Google Colab

  1. Go to Google Colab
  2. Sign in with your Google account
  3. Click “New Notebook”
  4. Rename it: Form_Data_Extractor.ipynb

Why Colab?

  • ✅ No installation needed
  • ✅ Free GPU access
  • ✅ Easy file upload/download
  • ✅ Shareable notebooks

Step 2: Install Required Libraries

Copy and paste this into your first code cell:

# Installation Cell - Run this FIRST (takes 2-3 minutes)
# Only needs to be run once per session

!pip install easyocr
!pip install opencv-python-headless
!pip install pytesseract
!apt-get install -y tesseract-ocr            # -y answers the install prompt automatically
!apt-get install -y tesseract-ocr-tam        # Tamil language support

print("✅ All libraries installed successfully!")

What each library does:

  • EasyOCR: Main OCR engine (reads text from images)
  • OpenCV: Image processing (cleans and prepares images)
  • Pytesseract: Backup OCR engine (good for printed text)
  • Tesseract-OCR: OCR engine core
  • Tesseract-TAM: Tamil language support

Expected Output:

✅ All libraries installed successfully!

โฐ Time: 2-3 minutes on first run


📦 Building the Project: Step-by-Step

Step 3: Import Libraries

Create a new code cell and add:

# Import all required libraries
import easyocr                    # OCR engine
import cv2                        # OpenCV for image processing
import numpy as np                # Numerical operations
import pandas as pd               # Data handling (Excel export)
import json                       # JSON format support
import re                         # Regular expressions (pattern matching)
from PIL import Image             # Image display
import os                         # File operations
from datetime import datetime     # Date/time handling
from google.colab import files    # File upload in Colab

print("✅ All libraries imported successfully!")
print("=" * 80)

Why we need each one:

  • easyocr → Main text recognition
  • cv2 → Image preprocessing (grayscale, thresholding)
  • numpy → Mathematical operations on images
  • pandas → Organize data into Excel/CSV
  • json → Export as JSON format
  • re → Find patterns (phone numbers, pincodes)
  • PIL → Display images in notebook
  • os → Handle file paths
  • datetime → Timestamp our extractions
  • files → Upload forms from your computer

Step 4: Initialize OCR Reader

# Initialize EasyOCR Reader
# This downloads language models (takes 1-2 minutes first time)

print("🔄 Initializing OCR Reader (English + Tamil)...")
print("⏳ First run may take 1-2 minutes (downloading models)...")

# Create reader for English and Tamil
# gpu=False because Colab free tier may not have GPU
reader = easyocr.Reader(['en', 'ta'], gpu=False)

print("✅ OCR Reader initialized and ready!")
print("=" * 80)

What’s happening:

  • Downloads pre-trained models for English and Tamil
  • Models are ~100MB each
  • Only downloads once, then cached
  • gpu=False uses CPU (works on free Colab)

Real-World Analogy: Like installing a language pack on your phone – once installed, it works offline!


Step 5: Image Preprocessing Functions

Why preprocessing? Raw images often have:

  • Poor lighting
  • Background noise
  • Low contrast
  • Blur or shadows

Preprocessing fixes these issues for better OCR accuracy!

def preprocess_image(image_path):
    """
    Improves image quality for better OCR results
    
    Steps:
    1. Convert to grayscale (removes color, keeps text)
    2. Apply thresholding (makes text pure black, background pure white)
    3. Remove noise (erases tiny dots/marks)
    """
    
    # Read the image
    img = cv2.imread(image_path)
    
    # Convert to grayscale
    # Why? OCR works better with black text on white background
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    # Apply Otsu's thresholding
    # Automatically finds best threshold value
    # Converts to binary image (pure black and white)
    _, thresh = cv2.threshold(gray, 0, 255, 
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    
    # Remove small noise using morphological operations
    kernel = np.ones((1, 1), np.uint8)
    processed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    
    return processed


def display_image(image_path):
    """Display the uploaded image for visual confirmation"""
    from IPython.display import display
    
    img = Image.open(image_path)
    display(img)
    
    print(f"✅ Image loaded: {image_path}")
    print(f"📏 Dimensions: {img.size[0]} x {img.size[1]} pixels")
    print("=" * 80)

print("✅ Preprocessing functions created!")

Before vs After Preprocessing:

BEFORE:                    AFTER:
[Gray, noisy image]   →   [Clean, black text on white]
[Low contrast]        →   [High contrast]
[Background texture]  →   [Pure white background]
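The thresholding step can be illustrated without OpenCV at all: every pixel brighter than the threshold becomes pure white (255) and everything else pure black (0). A toy sketch on a tiny hand-written "image" of grayscale values (Otsu's method just picks the threshold automatically; here we hard-code 128 to show the effect):

```python
# Toy binary thresholding: pixels above the threshold -> 255 (white),
# the rest -> 0 (black). This mimics what cv2.threshold does, with
# cv2.THRESH_OTSU choosing the cutoff for us in the real pipeline.
image = [
    [200, 90, 210],   # light background, dark ink, light background
    [180, 40, 190],
]
THRESHOLD = 128
binary = [[255 if px > THRESHOLD else 0 for px in row] for row in image]
print(binary)  # [[255, 0, 255], [255, 0, 255]]
```

The dark "ink" pixels (90 and 40) become solid black while the background becomes uniform white, which is exactly the high-contrast input OCR engines prefer.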

Step 6: Text Extraction Function (The Core!)

This is where the magic happens!

def extract_text_from_image(image_path):
    """
    Extract all text from the form image
    
    Returns:
    - raw_text: Complete text as one string
    - structured_data: List with text, position, confidence
    """
    
    print(f"๐Ÿ” Processing: {image_path}")
    print("โณ Extracting text... (30-60 seconds)")
    print("=" * 80)
    
    # Use EasyOCR to read the image
    # Returns: [([[x1,y1], [x2,y2], [x3,y3], [x4,y4]], 'text', confidence), ...]
    result = reader.readtext(image_path)
    
    # Initialize storage
    raw_text = []
    structured_data = []
    
    print(f"✅ Detected {len(result)} text elements")
    
    # Process each detected text
    for detection in result:
        bbox, text, confidence = detection
        
        # bbox = coordinates where text is located
        # text = the extracted text
        # confidence = how sure OCR is (0.0 to 1.0)
        
        # Filter: Only keep text with confidence > 30%
        if confidence > 0.3:
            raw_text.append(text)
            
            # Get position (top-left corner)
            x = int(bbox[0][0])
            y = int(bbox[0][1])
            
            structured_data.append({
                'text': text,
                'x': x,
                'y': y,
                'confidence': round(confidence, 2)
            })
    
    # Join all text with newlines
    complete_text = '\n'.join(raw_text)
    
    print(f"✅ Successfully extracted {len(structured_data)} high-confidence texts")
    print("=" * 80)
    
    return complete_text, structured_data

print("✅ Text extraction function created!")

Understanding the Output:

# Example output structure:
structured_data = [
    {
        'text': 'R.ABDUL RAHEEM',
        'x': 245,
        'y': 112,
        'confidence': 0.94
    },
    {
        'text': '9680387400',
        'x': 412,
        'y': 450,
        'confidence': 0.98
    },
    # ... more text elements
]

Step 7: Smart Pattern Extraction

Extract specific information using patterns:

def extract_phone_numbers(text):
    """
    Extract all 10-digit phone numbers
    Pattern: Exactly 10 consecutive digits
    """
    phone_pattern = r'\b\d{10}\b'
    phones = re.findall(phone_pattern, text)
    return phones


def extract_pincode(text):
    """
    Extract 6-digit Indian pincode
    Pattern: Exactly 6 consecutive digits
    """
    pincode_pattern = r'\b\d{6}\b'
    pincodes = re.findall(pincode_pattern, text)
    return pincodes[0] if pincodes else None


def extract_dates(text):
    """
    Extract dates in various formats
    Patterns: DD/MM/YYYY or DD-MM-YYYY
    """
    date_pattern = r'\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b'
    dates = re.findall(date_pattern, text)
    return dates


def extract_email(text):
    """Extract email addresses"""
    email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    emails = re.findall(email_pattern, text)
    return emails

print("✅ Pattern extraction functions created!")

How Regex Patterns Work:

Pattern: \b\d{10}\b
Explained:
\b       = Word boundary (start/end of number)
\d       = Any digit (0-9)
{10}     = Exactly 10 times
\b       = Word boundary

Example matches:
✅ 9680387400
✅ 8123456789
❌ 968038740 (only 9 digits)
❌ 96803874001 (11 digits)

Step 8: Output Formatting Functions

Create clean, professional outputs:

def format_as_text(data_dict):
    """
    Format as clean, readable text
    Perfect for printing or saving as .txt file
    """
    output = "\n" + "=" * 80 + "\n"
    output += "EXTRACTED FORM DATA\n"
    output += "=" * 80 + "\n\n"
    
    for key, value in data_dict.items():
        # Convert snake_case to Title Case
        display_key = key.replace('_', ' ').title()
        output += f"{display_key}: {value}\n"
    
    output += "\n" + "=" * 80 + "\n"
    return output


def format_as_json(data_dict):
    """
    Format as JSON (useful for APIs, databases, web apps)
    """
    return json.dumps(data_dict, indent=2, ensure_ascii=False)


def format_as_excel(data_dict, filename="extracted_data.xlsx"):
    """
    Save as Excel file
    Perfect for data analysis, sharing with teams
    """
    df = pd.DataFrame([data_dict])
    df.to_excel(filename, index=False)
    print(f"✅ Excel file saved: {filename}")
    return filename

print("✅ Output formatting functions created!")

Output Format Comparison:

Format | Best For                  | File Size | Human Readable
Text   | Quick viewing, printing   | Small     | ✅ Very
JSON   | APIs, databases, web apps | Medium    | ✅ Moderate
Excel  | Data analysis, sharing    | Large     | ✅ Very
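One detail worth noting about the JSON formatter above: ensure_ascii=False keeps Tamil characters readable in the output instead of escaping them into \uXXXX codes. A quick comparison:

```python
import json

# Tamil text, as our bilingual forms may contain
data = {"name": "தமிழ்"}

print(json.dumps(data))                      # escaped: {"name": "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"}
print(json.dumps(data, ensure_ascii=False))  # readable: {"name": "தமிழ்"}
```

Either form is valid JSON, but the readable one is far easier to review when spot-checking extracted forms.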

Step 9: Main Processing Pipeline (Single Image)

Combine everything into one complete workflow:

def process_single_image(image_path, output_formats=['text', 'json', 'excel']):
    """
    Complete pipeline for processing ONE form image
    
    Pipeline:
    1. Display image (visual confirmation)
    2. Extract all text using OCR
    3. Find specific patterns (phone, pincode, dates)
    4. Organize into dictionary
    5. Output in requested formats
    """
    
    print("\n" + "🚀 STARTING SINGLE IMAGE PROCESSING\n")
    print("=" * 80)
    
    # Step 1: Display the image
    display_image(image_path)
    
    # Step 2: Extract all text
    raw_text, structured_data = extract_text_from_image(image_path)
    
    # Step 3: Create organized data dictionary
    extracted_data = {
        'image_name': os.path.basename(image_path),
        'processing_date': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'total_text_elements': len(structured_data)
    }
    
    # Step 4: Extract specific information
    print("\n📋 EXTRACTING SPECIFIC INFORMATION...\n")
    
    # Phone numbers
    phones = extract_phone_numbers(raw_text)
    if phones:
        extracted_data['mobile_numbers'] = ', '.join(phones)
        print(f"📱 Mobile Numbers Found: {', '.join(phones)}")
    
    # Pincode
    pincode = extract_pincode(raw_text)
    if pincode:
        extracted_data['pincode'] = pincode
        print(f"📮 Pincode Found: {pincode}")
    
    # Dates
    dates = extract_dates(raw_text)
    if dates:
        extracted_data['dates_found'] = ', '.join(dates)
        print(f"📅 Dates Found: {', '.join(dates)}")
    
    # Emails
    emails = extract_email(raw_text)
    if emails:
        extracted_data['emails'] = ', '.join(emails)
        print(f"📧 Emails Found: {', '.join(emails)}")
    
    # Store complete raw text
    extracted_data['raw_text'] = raw_text
    
    print("\n" + "=" * 80)
    
    # Step 5: Output in requested formats
    results = {}
    
    if 'text' in output_formats:
        print("\n📄 TEXT FORMAT OUTPUT:\n")
        text_output = format_as_text(extracted_data)
        print(text_output)
        results['text'] = text_output
    
    if 'json' in output_formats:
        print("\n📦 JSON FORMAT OUTPUT:\n")
        json_output = format_as_json(extracted_data)
        print(json_output)
        results['json'] = json_output
    
    if 'excel' in output_formats:
        print("\n📊 EXCEL FORMAT OUTPUT:\n")
        excel_file = format_as_excel(extracted_data)
        results['excel'] = excel_file
    
    print("\n✅ PROCESSING COMPLETE!")
    print("=" * 80)
    
    return extracted_data, results

print("✅ Main processing pipeline created!")

Step 10: Batch Processing (Multiple Images)

Process many forms at once:

def process_batch_images(image_paths):
    """
    Process multiple form images in one go
    
    Input: List of image file paths
    Output: Combined Excel file with all forms
    """
    
    print("\n" + "🚀 STARTING BATCH PROCESSING\n")
    print(f"📁 Total images to process: {len(image_paths)}")
    print("=" * 80)
    
    all_data = []
    successful = 0
    failed = 0
    
    for idx, image_path in enumerate(image_paths, 1):
        print(f"\n🔄 Processing {idx}/{len(image_paths)}: {os.path.basename(image_path)}")
        print("-" * 80)
        
        try:
            # Extract text
            raw_text, structured_data = extract_text_from_image(image_path)
            
            # Create data dictionary
            data = {
                'form_number': idx,
                'image_name': os.path.basename(image_path),
                'mobile_numbers': ', '.join(extract_phone_numbers(raw_text)),
                'pincode': extract_pincode(raw_text),
                'dates_found': ', '.join(extract_dates(raw_text)),
                'emails': ', '.join(extract_email(raw_text)),
                'total_text_elements': len(structured_data),
                'raw_text': raw_text[:500]  # First 500 chars only for Excel
            }
            
            all_data.append(data)
            successful += 1
            print(f"✅ Success: {os.path.basename(image_path)}")
            
        except Exception as e:
            failed += 1
            print(f"โŒ Error: {str(e)}")
            continue
    
    # Create combined Excel file
    print("\n" + "=" * 80)
    print("📊 Creating combined Excel file...")
    
    df = pd.DataFrame(all_data)
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_filename = f"batch_extracted_{timestamp}.xlsx"
    df.to_excel(output_filename, index=False)
    
    print(f"\n✅ BATCH PROCESSING COMPLETE!")
    print(f"📁 Output File: {output_filename}")
    print(f"✅ Successful: {successful}")
    print(f"❌ Failed: {failed}")
    print("=" * 80)
    
    return df, output_filename

print("✅ Batch processing function created!")

Step 11: User Interface Functions

Easy upload and processing:

def run_single_image_mode():
    """
    User-friendly function to upload and process ONE image
    """
    print("\n" + "📤 SINGLE IMAGE MODE\n")
    print("Please upload your form image (JPG, PNG, JPEG)")
    print("=" * 80 + "\n")
    
    # Upload file in Google Colab
    uploaded = files.upload()
    
    if uploaded:
        # Get uploaded filename
        image_path = list(uploaded.keys())[0]
        
        # Process the image
        extracted_data, results = process_single_image(
            image_path, 
            output_formats=['text', 'json', 'excel']
        )
        
        print("\n💾 FILES READY FOR DOWNLOAD:")
        print("1. Check the 'Files' panel on the left")
        print("2. Right-click on 'extracted_data.xlsx' → Download")
        
        return extracted_data, results
    else:
        print("โŒ No file uploaded!")
        return None, None


def run_batch_mode():
    """
    User-friendly function to process MULTIPLE images
    Upload a ZIP file containing all form images
    """
    print("\n" + "📤 BATCH PROCESSING MODE\n")
    print("Please upload a ZIP file containing multiple form images")
    print("=" * 80 + "\n")
    
    # Upload ZIP file
    uploaded = files.upload()
    
    if uploaded:
        zip_path = list(uploaded.keys())[0]
        
        # Extract ZIP file
        import zipfile
        extract_dir = "extracted_forms"
        os.makedirs(extract_dir, exist_ok=True)
        
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall(extract_dir)
        
        # Find all image files
        image_extensions = ['.jpg', '.jpeg', '.png', '.JPG', '.JPEG', '.PNG']
        image_paths = []
        
        for root, dirs, files_list in os.walk(extract_dir):
            for file in files_list:
                if any(file.endswith(ext) for ext in image_extensions):
                    image_paths.append(os.path.join(root, file))
        
        print(f"✅ Found {len(image_paths)} images in ZIP file\n")
        
        # Process all images
        df, output_file = process_batch_images(image_paths)
        
        print("\n💾 DOWNLOAD YOUR RESULTS:")
        print(f"File: {output_file}")
        print("Location: Files panel (left sidebar)")
        
        return df, output_file
    else:
        print("โŒ No file uploaded!")
        return None, None

print("✅ User interface functions created!")

🎮 How to Use Your OCR System

For Single Image:

# Run this cell to process ONE form
data, results = run_single_image_mode()

What happens:

  1. Upload dialog appears
  2. Select your form image
  3. Wait 30-60 seconds
  4. See results in 3 formats!
  5. Download Excel file from left panel

For Multiple Images (Batch):

# Run this cell to process MANY forms at once
df, output_file = run_batch_mode()

What happens:

  1. Upload dialog appears
  2. Select your ZIP file (containing multiple form images)
  3. System processes all forms automatically
  4. Creates ONE combined Excel file
  5. Download results!

📊 Sample Output

Example Form Input:

[CSC Student Information Form Image]
Name: R.ABDUL RAHEEM
DOB: 02/12/2010
Mobile: 9680387400
Address: No:3/A Kanmathan Kail...

Text Output:

================================================================================
EXTRACTED FORM DATA
================================================================================

Image Name: student_form.jpg
Processing Date: 2025-11-29 15:30:00
Total Text Elements: 42
Mobile Numbers: 9680387400, 9680387400
Pincode: 600043
Dates Found: 02/12/2010, 08/11/2023
Raw Text: R.ABDUL RAHEEM
02/12/2010
Asfirgan Anna Centennial High School
P.RITZWAN
...

================================================================================

JSON Output:

{
  "image_name": "student_form.jpg",
  "processing_date": "2025-11-29 15:30:00",
  "total_text_elements": 42,
  "mobile_numbers": "9680387400, 9680387400",
  "pincode": "600043",
  "dates_found": "02/12/2010, 08/11/2023",
  "raw_text": "R.ABDUL RAHEEM\n02/12/2010\n..."
}

Excel Output:

Image Name | Mobile Numbers | Pincode | Dates Found | Raw Text
form1.jpg  | 9680387400     | 600043  | 02/12/2010  | R.ABDUL…

🎯 Accuracy & Performance

Expected Accuracy:

Text Type           | Accuracy | Notes
Printed (English)   | 95-98%   | Very reliable
Printed (Tamil)     | 90-95%   | Good, may need review
Handwritten (Clear) | 85-92%   | Depends on writing quality
Handwritten (Messy) | 70-85%   | May need manual verification

Processing Time:

Task               | Time      | Notes
Setup (First time) | 3-5 min   | Library installation
Single image       | 30-60 sec | Depends on image size
Batch (10 forms)   | 5-10 min  | Images processed one by one
Batch (50 forms)   | 20-30 min | Worth the automation!

💡 Tips for Better Results

Image Quality Tips:

  1. Good Lighting:
    • ✅ Natural daylight or bright white light
    • ❌ Avoid yellow/dim lighting
    • ❌ Avoid shadows on form
  2. Camera Position:
    • ✅ Hold phone directly above form (90° angle)
    • ✅ Fill entire frame with form
    • ❌ Avoid angled/tilted shots
  3. Form Condition:
    • ✅ Flat, not bent or folded
    • ✅ Clean (no coffee stains, tears)
    • ✅ High contrast (dark pen on white paper)
  4. Resolution:
    • ✅ Minimum 1000×1000 pixels
    • ✅ Higher resolution = better accuracy
    • ❌ Don’t over-compress (avoid heavy JPEG compression)

Handwriting Tips:

  1. For Students Filling Forms:
    • Use BLOCK LETTERS (not cursive)
    • Write larger than usual
    • Use dark pen (avoid pencil)
    • Stay within boxes/lines
  2. For Scanning:
    • Scan at 300 DPI or higher
    • Use grayscale mode
    • Adjust brightness/contrast if needed

๐Ÿ› Troubleshooting Common Issues

Issue 1: “No module named ‘easyocr’”

Solution:

# Re-run installation cell
!pip install easyocr

Issue 2: Low Accuracy / Wrong Text

Possible Causes:

  • Poor image quality (blur, low light)
  • Messy handwriting
  • Colored paper (use white forms)

Solutions:

  • Retake photo with better lighting
  • Increase image resolution
  • Use scanner instead of phone camera
  • Adjust the confidence threshold in extract_text_from_image:

if confidence > 0.5:  # Increased from 0.3 to 0.5

Issue 3: Missing Phone Numbers

Solution:

# Adjust phone number pattern to be more flexible
phone_pattern = r'\d{10}'  # Matches 10 digits anywhere
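The trade-off between the two patterns: \b requires a word boundary, so a number glued directly to letters is missed, while the looser \d{10} still finds it (at the risk of also matching ten digits inside a longer number). A small demonstration on a made-up string:

```python
import re

# Digits glued to letters: no word boundary between 'o' and '9'
text = "StudentID No9680387400"

print(re.findall(r'\b\d{10}\b', text))  # []             - boundary blocks the match
print(re.findall(r'\d{10}', text))      # ['9680387400'] - looser pattern finds it
```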

Issue 4: Slow Processing

Solutions:

  1. Enable GPU in Colab:
    • Runtime → Change runtime type → GPU
    • Change gpu=False to gpu=True
  2. Reduce image size before processing:

# Add this before running OCR
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)

Issue 5: Tamil Text Not Recognized

Solution:

# Verify Tamil model is loaded
reader = easyocr.Reader(['en', 'ta'], gpu=False)

# Check installed languages
print(reader.lang_list)  # Should show ['en', 'ta']
