Optical Character Recognition (OCR)

Anyparser's enterprise-grade OCR engine extracts text from images and scanned documents in 100+ languages, handling everything from clean business documents to handwritten notes. Automatic image enhancement and layout analysis turn each image into clean, structured data that is ready for use in your applications.

Key Features

🌍 Multi-Language Support

  • 100+ languages supported
  • Mixed language processing
  • Right-to-left script support

📄 Document Optimization

  • Automatic image enhancement
  • Skew correction
  • Noise reduction

🧠 Smart Recognition

  • Table structure detection
  • Form field recognition
  • Handwriting support
  • Mathematical formula parsing

🏢 Enterprise Features

  • Batch processing
  • High-volume support
  • Custom model training
  • Integration APIs

Quick Start

Get started with basic OCR processing:

import asyncio

from anyparser_core import Anyparser, AnyparserOption


async def main():
    # Initialize with OCR options
    parser = Anyparser(AnyparserOption(
        model="ocr",
        format="markdown"
    ))
    # Process an image
    result = await parser.parse("receipt.jpg")
    # Print extracted text
    print(f"Detected text:\n{result.markdown}")
    print(f"Confidence score: {result.confidence_score}")


asyncio.run(main())

Advanced Configuration

Language Options

Optimize recognition for specific languages:

from anyparser_core import Anyparser, AnyparserOption, OcrLanguage

options = AnyparserOption(
    model="ocr",
    format="markdown",
    ocr_language=[
        OcrLanguage.ENGLISH,   # Primary language
        OcrLanguage.JAPANESE,  # Secondary language
        OcrLanguage.CHINESE    # Additional language
    ]
)

Note: each additional language will incur additional API cost.

Document Presets

Optimize processing for specific document types:

from enum import Enum


class OcrPreset(Enum):
    DOCUMENT = "document"
    HANDWRITING = "handwriting"
    SCAN = "scan"
    RECEIPT = "receipt"
    MAGAZINE = "magazine"
    INVOICE = "invoice"
    BUSINESS_CARD = "business-card"
    PASSPORT = "passport"
    DRIVER_LICENSE = "driver-license"
    IDENTITY_CARD = "identity-card"
    LICENSE_PLATE = "license-plate"
    MEDICAL_REPORT = "medical-report"
    BANK_STATEMENT = "bank-statement"
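
When a pipeline handles several document types, the preset usually has to be chosen per file. The following is a minimal sketch of one way to do that, assuming files are named after their type. The `pick_preset` helper and the `KEYWORD_TO_PRESET` table are hypothetical and not part of Anyparser. The enum is redeclared locally (as a subset) so the sketch runs without the library. How the chosen preset is passed to `AnyparserOption` is not shown in this section, so the sketch stops at selecting the enum member.

```python
from enum import Enum
from pathlib import Path


class OcrPreset(Enum):
    # Subset of the presets listed above, redeclared so the sketch runs standalone.
    DOCUMENT = "document"
    RECEIPT = "receipt"
    INVOICE = "invoice"
    BUSINESS_CARD = "business-card"


# Hypothetical filename keywords; adapt these to your own naming scheme.
KEYWORD_TO_PRESET = {
    "receipt": OcrPreset.RECEIPT,
    "invoice": OcrPreset.INVOICE,
    "card": OcrPreset.BUSINESS_CARD,
}


def pick_preset(path: str) -> OcrPreset:
    """Return the first preset whose keyword appears in the file name."""
    name = Path(path).stem.lower()
    for keyword, preset in KEYWORD_TO_PRESET.items():
        if keyword in name:
            return preset
    return OcrPreset.DOCUMENT  # generic fallback when nothing matches


if __name__ == "__main__":
    for sample in ["scans/invoice_0423.png", "Receipt-01.JPG", "notes.jpg"]:
        print(sample, "->", pick_preset(sample).value)
```

Falling back to `DOCUMENT` keeps unknown files flowing through the pipeline. Raising an error instead may be preferable when a wrong preset would be costly.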

Best Practices

  1. Image Preparation

    • Use high-resolution images (300 DPI+)
    • Ensure good lighting and contrast
    • Minimize background noise
    • Keep documents flat and aligned
  2. Language Configuration

    • Specify primary language when known
    • Use multiple languages for mixed content
  3. Performance Optimization

    • Use appropriate presets
    • Process in batches when possible
    • Enable application-level caching for repeated scans
    • Monitor processing times
  4. Error Handling

    • Validate output format
    • Implement retry logic
    • Log processing errors
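
The batching, retry, validation, and logging advice above can be sketched together as follows. This is an illustration under stated assumptions, not Anyparser's own API: `parse_with_retry` and `parse_batch` are hypothetical helpers, and `parse` stands for any async callable that takes a path and returns the extracted text (for example, a thin wrapper that returns `result.markdown` as in the Quick Start). The retry count, backoff delays, and concurrency limit are arbitrary defaults.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List

logger = logging.getLogger("ocr")

ParseFn = Callable[[str], Awaitable[str]]


async def parse_with_retry(
    parse: ParseFn, path: str, attempts: int = 3, base_delay: float = 0.5
) -> str:
    """Call parse(path), retrying with exponential backoff on any exception."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            text = await parse(path)
            # Validate the output: treat empty text as a failure worth retrying.
            if not text.strip():
                raise ValueError("empty OCR output")
            return text
        except Exception as exc:
            logger.warning(
                "attempt %d/%d failed for %s: %s", attempt, attempts, path, exc
            )
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


async def parse_batch(
    parse: ParseFn,
    paths: List[str],
    max_concurrency: int = 4,
    base_delay: float = 0.5,
) -> List[str]:
    """Parse many files at once, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str) -> str:
        async with semaphore:
            return await parse_with_retry(parse, path, base_delay=base_delay)

    # gather() returns results in the same order as the input paths.
    return await asyncio.gather(*(worker(p) for p in paths))


if __name__ == "__main__":
    async def fake_parse(path: str) -> str:
        # Stand-in for a real OCR call so the sketch runs without the library.
        return f"text from {path}"

    print(asyncio.run(parse_batch(fake_parse, ["a.png", "b.png"])))
```

Retrying on every exception is the simplest policy. In practice it may be better to retry only transient failures such as timeouts and to fail fast on errors like an unsupported file.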

Supported Formats

Input Formats

  • Images: PNG, JPEG/JPG, TIFF, WebP, BMP
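
Checking file extensions locally before sending anything for processing avoids spending a request on a file that would be rejected. A minimal sketch, assuming the extension reflects the real file type. `is_supported_image` and `split_by_support` are hypothetical helpers, and accepting `.tif` as an alias for TIFF is an assumption.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Extensions for the image types listed above (.tif is an assumed TIFF alias).
SUPPORTED_EXTENSIONS = {".png", ".jpeg", ".jpg", ".tiff", ".tif", ".webp", ".bmp"}


def is_supported_image(path: str) -> bool:
    """Return True when the file extension matches a listed input format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition paths into (supported, rejected), preserving input order."""
    supported: List[str] = []
    rejected: List[str] = []
    for path in paths:
        (supported if is_supported_image(path) else rejected).append(path)
    return supported, rejected


if __name__ == "__main__":
    ok, skipped = split_by_support(["receipt.JPG", "scan.tiff", "notes.docx"])
    print("process:", ok)
    print("skip:", skipped)
```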

Output Formats

  • Text: Plain text, Markdown, HTML
  • Data: JSON, XML, CSV
  • Structure: Tables, Forms, Layout

Example Use Cases

📚 Document Digitization

Convert physical documents to searchable digital formats:

  • Archive scanning
  • Document management
  • Legal document processing

🤖 Data Entry Automation

Automate manual data entry tasks:

  • Form processing
  • Receipt scanning
  • Business card digitization