Schema Builder

POST /v1/schema/auto supports three modes:

  • zero_doc: start from text instructions
  • single_doc: infer a schema from one document
  • multi_doc_global: infer a shared schema across a set of documents

All modes now return the schema object directly, so you can paste the response into POST /v1/extract as-is.

cURL

curl -sS -X POST "https://api.docspeed.ai/v1/schema/auto" \
  -H "Authorization: Bearer ${DOCSPEED_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {"file_id": "file_april"},
      {"file_id": "file_may"},
      {"file_id": "file_june"}
    ],
    "execution_mode": "sync"
  }'

Python

import requests

response = requests.post(
    "https://api.docspeed.ai/v1/schema/auto",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "inputs": [
            {"file_id": "file_april"},
            {"file_id": "file_may"},
            {"file_id": "file_june"},
        ],
        "execution_mode": "sync",
    },
    timeout=300,
)
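# Fail loudly on 4xx/5xx instead of printing an error body as if it were a schema.
response.raise_for_status()
print(response.json())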

TypeScript

const response = await fetch("https://api.docspeed.ai/v1/schema/auto", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    inputs: [
      { file_id: "file_april" },
      { file_id: "file_may" },
      { file_id: "file_june" },
    ],
    execution_mode: "sync",
  }),
});

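// Surface HTTP errors instead of logging an error body as if it were a schema.
if (!response.ok) {
  throw new Error(`Schema request failed with status ${response.status}`);
}

console.log(await response.json());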

Example response

{
  "doc_class": "invoice",
  "fields": [
    {
      "key": "invoice.invoice_number",
      "description": "Invoice number",
      "type": "string",
      "array": false
    }
  ],
  "tables": [
    {
      "key": "line_items",
      "description": "Invoice line items",
      "fields": [
        {
          "key": "description",
          "description": "Line item description",
          "type": "string",
          "array": false
        }
      ]
    }
  ]
}
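
Before reusing a returned schema, it can help to list what it contains. The following is a minimal sketch that relies only on the keys visible in the example response (doc_class, fields, tables, key, type, array). The helper name summarize_schema is hypothetical and not part of the API.

```python
from typing import Any


def summarize_schema(schema: dict[str, Any]) -> list[str]:
    """Return one 'key: type' line per field; table columns are prefixed with the table key."""
    lines: list[str] = []
    # Top-level fields, e.g. "invoice.invoice_number".
    for field in schema.get("fields", []):
        suffix = "[]" if field.get("array") else ""
        lines.append(f"{field['key']}: {field['type']}{suffix}")
    # Columns nested under each table, e.g. "line_items.description".
    for table in schema.get("tables", []):
        for column in table.get("fields", []):
            suffix = "[]" if column.get("array") else ""
            lines.append(f"{table['key']}.{column['key']}: {column['type']}{suffix}")
    return lines


# Same shape as the example response above.
example = {
    "doc_class": "invoice",
    "fields": [
        {
            "key": "invoice.invoice_number",
            "description": "Invoice number",
            "type": "string",
            "array": False,
        }
    ],
    "tables": [
        {
            "key": "line_items",
            "description": "Invoice line items",
            "fields": [
                {
                    "key": "description",
                    "description": "Line item description",
                    "type": "string",
                    "array": False,
                }
            ],
        }
    ],
}

for line in summarize_schema(example):
    print(line)
```

A summary like this makes it easy to review the inferred fields with a customer before sending the schema on to POST /v1/extract.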

Where it fits

  • Use zero_doc when a customer knows the target fields before sharing examples.
  • Use single_doc to bootstrap from one representative invoice or statement.
  • Use multi_doc_global when you want one shared schema across a live document set.
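
The guidance above amounts to choosing a mode by how many example documents are available. A minimal sketch of that rule follows. The function name pick_mode is hypothetical, and the source does not show whether the request carries an explicit mode field or the API infers the mode from inputs, so this helper only names the mode.

```python
def pick_mode(document_count: int) -> str:
    """Map the number of available example documents to a schema builder mode name."""
    if document_count < 0:
        raise ValueError("document_count must be non-negative")
    if document_count == 0:
        # No examples yet: start from text instructions.
        return "zero_doc"
    if document_count == 1:
        # One representative document to bootstrap from.
        return "single_doc"
    # Several documents that should share one schema.
    return "multi_doc_global"


for count in (0, 1, 3):
    print(count, pick_mode(count))
```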