API Reference - Document Parsing

This section describes how to use the Anyparser API to parse documents. Anyparser offers several models for extracting data from PDFs, images, audio, and other document types.

Authentication

Every API request must be authenticated with your API key. Send the key in the `Authorization` header of each HTTP request:

Authorization: Bearer your-api-key

You can retrieve your API key from the account dashboard.
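
As a minimal sketch of the header format above, the helper below builds the `Authorization` header in Python. The function name `auth_headers` and the environment variable `ANYPARSER_API_KEY` are illustrative assumptions, not part of the Anyparser API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header in the 'Bearer <key>' form shown above."""
    if not api_key:
        raise ValueError("API key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key}"}


# Assumption: the key is stored in an environment variable of this name.
# The fallback is the same placeholder used in the examples in this section.
api_key = os.environ.get("ANYPARSER_API_KEY", "your-api-key")
headers = auth_headers(api_key)
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.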

Endpoint: `/parse`

The `/parse` endpoint processes a single document: a PDF, image, audio file, or video.

Method: `POST`

Request Parameters

file (required): The document file to be parsed.
- Type: file (sent as multipart/form-data)
- Example: document.pdf
model (optional): The model to be used for parsing the document.
- Type: string
- Options: "text", "ocr", "vlm", "lam"
- Default: "text"
output_format (optional): The format for the parsed output.
- Type: string
- Options: "markdown", "json"
- Default: "markdown"

Example Request

curl -X POST https://api.anyparser.com/parse \
  -H "Authorization: Bearer your-api-key" \
  -F "file=@path/to/your/document.pdf" \
  -F "model=text" \
  -F "output_format=markdown"
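
The same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than an official client: the function name `build_parse_request` is hypothetical, and the multipart body is assembled by hand to mirror what `curl -F` sends. The URL, field names, and allowed option values are taken from the parameters and curl example above.

```python
import urllib.request
import uuid

API_URL = "https://api.anyparser.com/parse"  # from the curl example above
MODELS = ("text", "ocr", "vlm", "lam")        # options listed for "model"
OUTPUT_FORMATS = ("markdown", "json")         # options listed for "output_format"


def build_parse_request(api_key: str, filename: str, content: bytes,
                        model: str = "text",
                        output_format: str = "markdown") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request for /parse."""
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")

    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields.
    for name, value in {"model": model, "output_format": output_format}.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The file field, sent as raw bytes.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + content + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the response body. Validating `model` and `output_format` on the client side avoids a round trip that would otherwise end in a 400 response.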

Response

The response contains the parsed data, produced by the selected model in the requested output format. For example, with output_format set to "markdown":

{
  "status": "success",
  "data": {
    "markdown": "# Document Title\n\nThis is the parsed content."
  }
}
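
A short sketch of reading this response in Python follows. It assumes the body has exactly the shape shown above (`status`, then `data.markdown`); the layout of the response when output_format is "json" is not documented here, so the helper handles only the markdown case. The name `extract_markdown` is illustrative.

```python
import json


def extract_markdown(response_text: str) -> str:
    """Return the parsed markdown from a successful /parse response body."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"request failed: {payload.get('message', 'unknown error')}")
    return payload["data"]["markdown"]


# The example response from above, as a raw JSON string.
sample = r'{"status": "success", "data": {"markdown": "# Document Title\n\nThis is the parsed content."}}'
markdown = extract_markdown(sample)
print(markdown)
```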

Error Handling

The API returns standard HTTP status codes to indicate whether a request succeeded or failed.

200 OK: Request was successful.
400 Bad Request: Missing or invalid parameters.
401 Unauthorized: Authentication failed.
500 Internal Server Error: Something went wrong on the server.

Example Error Response

{
  "status": "error",
  "message": "Invalid API key."
}
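
One way a client might turn these status codes and error bodies into exceptions is sketched below. The `AnyparserError` class and `check_response` function are hypothetical names introduced for illustration; the status codes and the `status`/`message` fields come from the descriptions above.

```python
import json

# Fallback messages, copied from the status code list above.
STATUS_MEANINGS = {
    400: "Missing or invalid parameters.",
    401: "Authentication failed.",
    500: "Something went wrong on the server.",
}


class AnyparserError(Exception):
    """Hypothetical exception type carrying the HTTP status and error message."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_response(status_code: int, body: str) -> dict:
    """Return the decoded payload on success, otherwise raise AnyparserError."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    if status_code == 200 and payload.get("status") == "success":
        return payload

    # Prefer the server's own message; fall back to the documented meaning.
    message = payload.get("message") or STATUS_MEANINGS.get(
        status_code, "Unexpected response."
    )
    raise AnyparserError(status_code, message)
```

Checking both the HTTP status and the `status` field guards against a 200 response whose body still reports an error, which is an assumption about possible behavior rather than something stated above.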

Endpoint: `/parse-multiple`

The `/parse-multiple` endpoint processes several documents in a single request, which avoids the overhead of one request per file when handling large batches.

Method: `POST`

Request Parameters

files (required): A list of documents to be parsed.
- Type: array of files (sent as multipart/form-data, one `files[]` field per document)
- Example: ["document1.pdf", "document2.pdf"]
model (optional): The model to use for parsing each document.
- Type: string
- Default: "text"
output_format (optional): The format for the parsed output.
- Type: string
- Default: "markdown"

Example Request

curl -X POST https://api.anyparser.com/parse-multiple \
  -H "Authorization: Bearer your-api-key" \
  -F "files[]=@path/to/document1.pdf" \
  -F "files[]=@path/to/document2.pdf" \
  -F "model=text" \
  -F "output_format=markdown"

Response

The response contains a list with one parsed result per document in the request:

{
  "status": "success",
  "data": [
    {
      "document": "document1.pdf",
      "markdown": "# Document 1 Title\n\nExtracted content from document 1."
    },
    {
      "document": "document2.pdf",
      "markdown": "# Document 2 Title\n\nExtracted content from document 2."
    }
  ]
}
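
As a sketch of consuming this response, the helper below maps each document name to its parsed markdown. It assumes every entry in `data` carries the `document` and `markdown` keys shown above; the function name `results_by_document` is illustrative.

```python
import json


def results_by_document(response_text: str) -> dict[str, str]:
    """Map each document name to its markdown from a /parse-multiple response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"request failed: {payload.get('message', 'unknown error')}")
    return {item["document"]: item["markdown"] for item in payload["data"]}


# A shortened version of the example response above.
sample = r'''
{
  "status": "success",
  "data": [
    {"document": "document1.pdf", "markdown": "# Document 1 Title\n\nExtracted content from document 1."},
    {"document": "document2.pdf", "markdown": "# Document 2 Title\n\nExtracted content from document 2."}
  ]
}
'''
results = results_by_document(sample)
for name, markdown in results.items():
    print(name, len(markdown))
```

Keying the results by document name makes it straightforward to match each output back to the file that produced it, provided the names in a batch are unique.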
