API Reference - Document Parsing

This section describes how to use the Anyparser API to parse documents. Anyparser offers several models for extracting data from PDFs, images, audio, and other document types.

Authentication

Every API request must be authenticated with your API key. Pass the key as a Bearer token in the Authorization header of each HTTP request.

Authorization: Bearer your-api-key

For further details on how to retrieve your API key, visit the account dashboard.
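
As a minimal sketch, the header above could be assembled in Python as follows. The helper name build_auth_headers and the environment variable ANYPARSER_API_KEY are assumptions made for this illustration; they are not part of the documented API.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the 'Bearer <key>' form shown above."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {api_key}"}


# ANYPARSER_API_KEY is a hypothetical variable name chosen for this sketch.
headers = build_auth_headers(os.environ.get("ANYPARSER_API_KEY", "your-api-key"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.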

Endpoint: /parse

The /parse endpoint is used to process a single document, whether it’s a PDF, image, audio file, or video.

Method: POST

Request Parameters

  • file (required): The document file to be parsed.
    • Type: file (sent as multipart/form-data)
    • Example: document.pdf
  • model (optional): The model to be used for parsing the document.
    • Type: string
    • Options: "text", "ocr", "vlm", "lam"
    • Default: "text"
  • output_format (optional): The format for the parsed output.
    • Type: string
    • Options: "markdown", "json"
    • Default: "markdown"
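
The optional parameters and their defaults can be checked on the client before a request is sent. The sketch below only mirrors the option lists above; the function name build_form_fields is hypothetical, and the server remains the authority on which values it accepts.

```python
ALLOWED_MODELS = ("text", "ocr", "vlm", "lam")
ALLOWED_OUTPUT_FORMATS = ("markdown", "json")


def build_form_fields(model: str = "text", output_format: str = "markdown") -> dict[str, str]:
    """Validate the optional /parse parameters and return them as form fields."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {ALLOWED_MODELS}, got {model!r}")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(
            f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}, got {output_format!r}"
        )
    return {"model": model, "output_format": output_format}
```

Calling build_form_fields() with no arguments yields the documented defaults, so callers only need to name the values they want to override.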

Example Request

curl -X POST https://api.anyparser.com/parse \
-H "Authorization: Bearer your-api-key" \
-F "file=@path/to/your/document.pdf" \
-F "model=text" \
-F "output_format=markdown"
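
For readers who prefer Python, the curl call above might be reproduced with the standard library as sketched here. The function names are hypothetical, and the multipart body is hand-built under the assumption that the server accepts a generic application/octet-stream part for the file. The request is only constructed, not sent.

```python
import urllib.request
import uuid

PARSE_URL = "https://api.anyparser.com/parse"


def encode_multipart(fields: dict[str, str], filename: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode the form fields plus one 'file' part; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        field = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(field.encode())
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_parse_request(
    filename: str,
    file_bytes: bytes,
    api_key: str,
    model: str = "text",
    output_format: str = "markdown",
) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body, content_type = encode_multipart(
        {"model": model, "output_format": output_format}, filename, file_bytes
    )
    return urllib.request.Request(
        PARSE_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the response body as JSON. The file contents can be read with pathlib.Path("path/to/your/document.pdf").read_bytes().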

Response

The response contains the parsed data, produced by the selected model in the requested output format. Example response for the markdown output format:

{
  "status": "success",
  "data": {
    "markdown": "# Document Title\n\nThis is the parsed content."
  }
}
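
A response of this shape could be unpacked as in the sketch below. It assumes the parsed text sits under data.markdown, as in the example; the layout for the json output format is not shown in this section, so it is not handled here. The function name extract_markdown is hypothetical.

```python
import json


def extract_markdown(response_text: str) -> str:
    """Return the parsed Markdown from a successful /parse response body."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        # Error responses carry a "message" field instead of "data".
        raise RuntimeError(payload.get("message", "parse request failed"))
    return payload["data"]["markdown"]
```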

Error Handling

The API uses standard HTTP status codes to indicate whether a request succeeded or failed:

  • 200 OK: Request was successful.
  • 400 Bad Request: Missing or invalid parameters.
  • 401 Unauthorized: Authentication failed.
  • 500 Internal Server Error: Something went wrong on the server.

Example Error Response

{
  "status": "error",
  "message": "Invalid API key."
}
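
One way a client might turn these status codes into exceptions is sketched below. The AnyparserError class and check_response function are hypothetical names, and the fallback messages simply restate the list above for cases where the body carries no "message" field.

```python
import json

STATUS_MEANINGS = {
    400: "Missing or invalid parameters.",
    401: "Authentication failed.",
    500: "Something went wrong on the server.",
}


class AnyparserError(Exception):
    """Hypothetical exception carrying the HTTP status and error message."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_response(status_code: int, body: str) -> dict:
    """Return the decoded JSON body on 200 OK, otherwise raise AnyparserError."""
    if status_code == 200:
        return json.loads(body)
    try:
        message = json.loads(body).get("message", "")
    except (json.JSONDecodeError, AttributeError):
        message = ""  # body was not a JSON object
    raise AnyparserError(
        status_code, message or STATUS_MEANINGS.get(status_code, "Unexpected status code.")
    )
```

Preferring the server's own message and falling back to the generic meaning keeps error reports useful even when the body is empty or not JSON.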

Endpoint: /parse-multiple

The /parse-multiple endpoint is used to process multiple documents in a single request. This can save time when dealing with large volumes of files.

Method: POST

Request Parameters

  • files (required): A list of documents to be parsed.
    • Type: array of files (sent as repeated files[] fields in multipart/form-data)
    • Example: ["document1.pdf", "document2.pdf"]
  • model (optional): The model to use for parsing each document.
    • Type: string
    • Default: "text"
  • output_format (optional): The format for the parsed output.
    • Type: string
    • Default: "markdown"

Example Request

curl -X POST https://api.anyparser.com/parse-multiple \
-H "Authorization: Bearer your-api-key" \
-F "files[]=@path/to/document1.pdf" \
-F "files[]=@path/to/document2.pdf" \
-F "model=text" \
-F "output_format=markdown"
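
The detail that matters in this request is that each document goes into its own files[] part. A rough Python sketch using only the standard library is given below; the function name is hypothetical, the application/octet-stream part type is an assumption, and the request is built without being sent.

```python
import urllib.request
import uuid

PARSE_MULTIPLE_URL = "https://api.anyparser.com/parse-multiple"


def build_parse_multiple_request(
    files: list[tuple[str, bytes]],
    api_key: str,
    model: str = "text",
    output_format: str = "markdown",
) -> urllib.request.Request:
    """Build a POST request with one 'files[]' part per (filename, bytes) pair."""
    if not files:
        raise ValueError("at least one file is required")
    boundary = uuid.uuid4().hex
    body = bytearray()
    for name, value in (("model", model), ("output_format", output_format)):
        field = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        body += field.encode()
    for filename, file_bytes in files:
        # The field name repeats for every document, as in the curl example.
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files[]"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        body += header.encode() + file_bytes + b"\r\n"
    body += f"--{boundary}--\r\n".encode()  # closing boundary
    return urllib.request.Request(
        PARSE_MULTIPLE_URL,
        data=bytes(body),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```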

Response

The response contains a list with one parsed result per document in the request:

{
  "status": "success",
  "data": [
    {
      "document": "document1.pdf",
      "markdown": "# Document 1 Title\n\nExtracted content from document 1."
    },
    {
      "document": "document2.pdf",
      "markdown": "# Document 2 Title\n\nExtracted content from document 2."
    }
  ]
}
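
A batch response of this shape could be indexed by file name, as in the sketch below. It assumes every entry carries the "document" and "markdown" keys shown in the example; the function name results_by_document is hypothetical.

```python
def results_by_document(payload: dict) -> dict[str, str]:
    """Map each document name to its parsed Markdown from a /parse-multiple response."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("message", "parse-multiple request failed"))
    return {item["document"]: item["markdown"] for item in payload["data"]}
```

If two uploaded files share a name, the later entry overwrites the earlier one in this mapping, so iterating over payload["data"] directly is the safer choice in that case.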