🌍 Multi-Language Support
- 100+ languages supported
- Mixed language processing
- Right-to-left script support
Unlock the power of visual data with Anyparser’s enterprise-grade OCR engine. Our advanced machine learning models deliver industry-leading accuracy across 100+ languages, handling everything from pristine business documents to challenging handwritten notes. With automatic image enhancement and intelligent layout analysis, you’ll get clean, structured data from any image or scanned document—ready for immediate use in your applications.
📄 Document Optimization
🧠 Smart Recognition
🏢 Enterprise Features
Get started with basic OCR processing:
Python:

    import asyncio

    from anyparser_core import Anyparser, AnyparserOption


    async def main():
        # Initialize with OCR options
        parser = Anyparser(AnyparserOption(
            model="ocr",
            format="markdown"
        ))

        # Process an image
        result = await parser.parse("receipt.jpg")

        # Print extracted text
        print(f"Detected text:\n{result.markdown}")
        print(f"Confidence score: {result.confidence_score}")


    asyncio.run(main())

TypeScript:

    import { Anyparser, AnyparserOption } from '@anyparser/core';

    async function main() {
      // Initialize with OCR options
      const parser = new Anyparser({
        model: 'ocr',
        format: 'markdown'
      });

      // Process an image
      const result = await parser.parse('receipt.jpg');

      // Print extracted text
      console.log(`Detected text:\n${result.markdown}`);
      console.log(`Confidence score: ${result.confidenceScore}`);
    }

    main().catch(console.error);

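The confidence score returned with each result can be used to route weak results to manual review. The following is a minimal sketch, not part of the Anyparser API: `needs_review` and `REVIEW_THRESHOLD` are hypothetical names, and the code assumes the score is a float between 0.0 and 1.0, which the documentation above does not state.

```python
from typing import Optional

# Hypothetical threshold; tune it against your own documents.
REVIEW_THRESHOLD = 0.85


def needs_review(confidence_score: Optional[float],
                 threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when an OCR result should be checked by a person.

    Assumes the score is a float in [0.0, 1.0]. A missing score is
    treated as low confidence.
    """
    if confidence_score is None:
        return True
    return confidence_score < threshold
```

In the basic example, this could be called as `needs_review(result.confidence_score)` before the extracted text is passed on.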
Optimize recognition for specific languages:
Python:

    from anyparser_core import Anyparser, AnyparserOption, OcrLanguage

    options = AnyparserOption(
        model="ocr",
        format="markdown",
        ocr_language=[
            OcrLanguage.ENGLISH,   # Primary language
            OcrLanguage.JAPANESE,  # Secondary language
            OcrLanguage.CHINESE    # Additional language
        ]
    )

TypeScript:

    import { Anyparser, AnyparserOption, OCR_LANGUAGES } from '@anyparser/core';

    const options: AnyparserOption = {
      model: 'ocr',
      format: 'markdown',
      ocrLanguage: [
        OCR_LANGUAGES.ENGLISH,  // Primary language
        OCR_LANGUAGES.JAPANESE  // Secondary language
      ]
    };

Note: each additional language will incur additional API cost.
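Since every extra language adds cost, it can help to remove duplicates and cap the list before building the options. The sketch below is an illustration under stated assumptions: `normalize_languages` and `max_languages` are hypothetical, and the `OcrLanguage` class is a stand-in with placeholder values so the example runs without the library installed.

```python
from enum import Enum
from typing import Iterable, List


class OcrLanguage(Enum):
    # Stand-in for anyparser_core.OcrLanguage so the sketch runs on its own.
    # The string values are placeholders, not the library's real codes.
    ENGLISH = "english"
    JAPANESE = "japanese"
    CHINESE = "chinese"


def normalize_languages(languages: Iterable[OcrLanguage],
                        max_languages: int = 3) -> List[OcrLanguage]:
    """Drop duplicate languages (keeping first-seen order) and enforce a cap."""
    # dict.fromkeys preserves insertion order, so the primary language stays first.
    unique = list(dict.fromkeys(languages))
    if len(unique) > max_languages:
        raise ValueError(
            f"{len(unique)} languages requested, limit is {max_languages}"
        )
    return unique
```

The returned list could then be passed as `ocr_language` when constructing `AnyparserOption`, as in the example above.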
Optimize processing for specific document types:
    from enum import Enum


    class OcrPreset(Enum):
        DOCUMENT = "document"
        HANDWRITING = "handwriting"
        SCAN = "scan"
        RECEIPT = "receipt"
        MAGAZINE = "magazine"
        INVOICE = "invoice"
        BUSINESS_CARD = "business-card"
        PASSPORT = "passport"
        DRIVER_LICENSE = "driver-license"
        IDENTITY_CARD = "identity-card"
        LICENSE_PLATE = "license-plate"
        MEDICAL_REPORT = "medical-report"
        BANK_STATEMENT = "bank-statement"

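When the document type arrives as a string (for example from a form field or a config file), the enum's values allow a direct lookup. This is a small sketch of that pattern using standard `Enum` behavior. `preset_from_name` is a hypothetical helper, the class repeats only a subset of the presets so the example is self-contained, and how a preset is passed to `AnyparserOption` is not shown in this section, so that step is left out.

```python
from enum import Enum


class OcrPreset(Enum):
    # Subset of the presets listed above, repeated so the sketch runs on its own.
    DOCUMENT = "document"
    RECEIPT = "receipt"
    BUSINESS_CARD = "business-card"


def preset_from_name(name: str,
                     default: OcrPreset = OcrPreset.DOCUMENT) -> OcrPreset:
    """Look up a preset by its string value, falling back to a default."""
    try:
        # Enum lookup by value: OcrPreset("receipt") is OcrPreset.RECEIPT.
        return OcrPreset(name.strip().lower())
    except ValueError:
        # Unknown names fall back to the general-purpose preset.
        return default
```

Falling back to `DOCUMENT` is a design choice for this sketch; raising an error on unknown names may be preferable when a wrong preset would be costly.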
Image Preparation
Language Configuration
Performance Optimization
Error Handling
📚 Document Digitization
Convert physical documents to searchable digital formats:
🤖 Data Entry Automation
Automate manual data entry tasks: