n8n Integration

Anyparser integrates with n8n to enable document parsing in your automated workflows without writing any code. This guide shows you how to use Anyparser with n8n’s drag-and-drop interface.

Prerequisites

  1. An active n8n instance (cloud or self-hosted)
  2. Anyparser API credentials
  3. Basic understanding of n8n workflows

Setting Up Anyparser in n8n

1. Add Credentials

  1. Go to Settings → Credentials
  2. Click Add Credential
  3. Search for “HTTP Request”
  4. Configure the credentials:
    Name: Anyparser API
    API Key: your-api-key

2. Configure HTTP Request Node

  1. Add an “HTTP Request” node to your workflow
  2. Configure the node:
    Method: POST
    URL: https://anyparserapi.com/parse/v1
    Authentication: Anyparser API (created in step 1)
    Headers:
    Content-Type: multipart/form-data
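
For reference, the request the HTTP Request node will send can be sketched in plain Python. The endpoint and body fields come from this guide; the Bearer authorization scheme and the helper name are assumptions, and the key value is a placeholder:

```python
# Sketch of the request the HTTP Request node assembles (nothing is sent here).
# Endpoint and body fields mirror the node configuration above; the
# Authorization header scheme is an assumption.

API_URL = "https://anyparserapi.com/parse/v1"
API_KEY = "your-api-key"  # placeholder

def build_request(files):
    """Assemble headers and multipart form fields for the parse endpoint."""
    headers = {"Authorization": f"Bearer {API_KEY}"}
    fields = {
        "format": "json",
        "model": "text",
    }
    return {"url": API_URL, "headers": headers, "fields": fields, "files": files}

req = build_request(["data/document.pdf"])
```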

Basic Document Parsing

Create a workflow to parse a single document:

  1. HTTP Request Node Configuration:

    {
      "format": "json",
      "model": "text",
      "image": true,
      "table": true,
      "files": ["data/document.pdf"]
    }
  2. Response Processing:

    • Add a “Set” node to extract specific fields
    • Use dot notation to access response data:
      markdown: {{$json.markdown}}
      total_characters: {{$json.total_characters}}
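
The dot-notation expressions above simply read fields off the JSON response. As a plain-Python sketch (the response shape below is assumed from the two fields this guide references):

```python
# Hypothetical response body, using only the fields this guide mentions.
response = {
    "markdown": "# Invoice\n\nTotal: $42.00",
    "total_characters": 25,
}

# The Set node expressions {{$json.markdown}} and {{$json.total_characters}}
# resolve to these lookups:
markdown = response["markdown"]
total_characters = response["total_characters"]
```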

Advanced Workflows

1. Batch Document Processing

Process multiple documents in sequence:

  1. Split In Batches Node:

    • Configure batch size
    • Set iteration mode
  2. HTTP Request Node:

    {
      "format": "json",
      "model": "text",
      "files": {{$json.files}}
    }
  3. Merge Node:

    • Combine results
    • Aggregate statistics
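
The Split In Batches → HTTP Request → Merge sequence above can be sketched as follows; the batch size and the per-file responses are illustrative stand-ins for real API calls:

```python
def split_in_batches(files, batch_size):
    """Mimic the Split In Batches node: yield fixed-size slices of the input."""
    for i in range(0, len(files), batch_size):
        yield files[i:i + batch_size]

def merge(results):
    """Mimic the Merge node: flatten per-batch results and total a statistic."""
    combined = [item for batch in results for item in batch]
    total_chars = sum(item["total_characters"] for item in combined)
    return combined, total_chars

# Simulated per-file responses in place of real HTTP Request calls.
files = ["a.pdf", "b.pdf", "c.pdf"]
results = [
    [{"file": f, "total_characters": 10} for f in batch]
    for batch in split_in_batches(files, batch_size=2)
]
combined, total_chars = merge(results)
```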

2. OCR Workflow

Set up OCR processing for images and scanned documents:

  1. HTTP Request Node:

    {
      "format": "json",
      "model": "ocr",
      "ocr_language": ["eng"],
      "ocr_preset": "document",
      "files": {{$json.files}}
    }
  2. IF Node:

    • Check OCR success
    • Handle errors
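
The IF node's success check can be sketched as a predicate. The `statusCode` field follows the expression used in this guide's error-handling section; the non-empty-text guard is an extra illustrative check, not a documented API behavior:

```python
def ocr_succeeded(response):
    """Return True when the OCR call returned usable text.

    statusCode == 200 mirrors the IF-node condition used later in this
    guide; the markdown check is an assumed extra guard.
    """
    return response.get("statusCode", 200) == 200 and bool(response.get("markdown"))

ok = ocr_succeeded({"statusCode": 200, "markdown": "scanned text"})
bad = ocr_succeeded({"statusCode": 500, "markdown": ""})
```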

3. Web Crawling Workflow

Create a web crawling workflow:

  1. HTTP Request Node:

    {
      "format": "json",
      "model": "crawler",
      "url": "{{$json.url}}",
      "max_depth": 2,
      "max_executions": 10
    }
  2. Filter Node:

    • Filter by status code
    • Extract specific URLs
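
The Filter node's two jobs, filtering by status code and extracting URLs, can be sketched like this (the per-page result shape is assumed):

```python
def filter_pages(pages, allowed_status=(200,)):
    """Mimic the Filter node: keep pages by status code, collect their URLs."""
    kept = [p for p in pages if p.get("statusCode") in allowed_status]
    urls = [p["url"] for p in kept]
    return kept, urls

# Hypothetical crawl results.
pages = [
    {"url": "https://example.com/", "statusCode": 200},
    {"url": "https://example.com/missing", "statusCode": 404},
]
kept, urls = filter_pages(pages)
```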

Error Handling

Implement proper error handling in your workflows:

  1. Error Trigger Node:

    • Catch HTTP errors
    • Handle timeouts
  2. IF Node:

    Condition: {{$json.statusCode}} !== 200
  3. Send Email Node:

    • Notify on errors
    • Include error details
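
Putting the three pieces together, the retry-then-notify pattern can be sketched as a loop around the `statusCode !== 200` condition; the function and its arguments are illustrative, not part of any n8n or Anyparser API:

```python
import time

def call_with_retries(request_fn, retries=3, delay=0.0):
    """Retry on non-200 responses, mirroring the {{$json.statusCode}} !== 200 check."""
    last = None
    for _ in range(retries):
        last = request_fn()
        if last.get("statusCode") == 200:
            return last
        time.sleep(delay)  # back off before the next attempt
    # All attempts failed; in n8n this is where the Send Email node fires.
    raise RuntimeError(f"parse failed: {last}")

# Simulated responses: one failure, then a success.
attempts = iter([{"statusCode": 500}, {"statusCode": 200, "markdown": "ok"}])
result = call_with_retries(lambda: next(attempts))
```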

Example Workflows

1. Document Processing Pipeline

graph LR
A[Read File] --> B[HTTP Request]
B --> C[Process Response]
C --> D[Save Results]
B --> E[Error Handler]

2. OCR Processing Pipeline

graph LR
A[Image Input] --> B[HTTP Request]
B --> C[Extract Text]
C --> D[Validate Results]
D --> E[Store Data]

Best Practices

  1. Workflow Design

    • Use meaningful node names
    • Add comments for clarity
    • Group related nodes
    • Test with sample data
  2. Error Management

    • Add error handlers
    • Implement retries
    • Log errors
    • Set up notifications
  3. Resource Management

    • Process in batches
    • Implement rate limiting
    • Monitor API usage
    • Clean up temporary files
  4. Security

    • Secure credentials
    • Validate input data
    • Sanitize outputs
    • Monitor access

Common Use Cases

  1. Document Processing

    • Batch process documents
    • Extract specific content
    • Generate summaries
    • Convert formats
  2. OCR Processing

    • Process scanned documents
    • Extract text from images
    • Handle multiple languages
    • Validate results
  3. Web Crawling

    • Crawl websites
    • Extract content
    • Monitor changes
    • Archive data
  4. Data Integration

    • Connect to databases
    • Update CRM systems
    • Generate reports
    • Trigger notifications