API Reference Overview
The API Reference tab is generated from the exported customer OpenAPI artifact at public_artifacts/openapi/customer-api-openapi.json.
Covered endpoints
- Files
- Parse, layout, OCR, classify, split
- Tables, equations, charts
- Schema builder and extraction
- Single and multi-document chat
- Agents and jobs
Each generated endpoint page includes:
- auth requirements
- sync or async behavior
- request notes
- response notes
Canonical base URL
https://api.docspeed.ai
All public Docspeed customer API requests use bearer auth.
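The base URL and bearer auth above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names and the `DOCSPEED_API_KEY` environment variable are assumptions made for this example.

```python
import os
from typing import Optional

BASE_URL = "https://api.docspeed.ai"  # canonical base URL from this page


def build_url(path: str) -> str:
    """Join the base URL and an endpoint path such as "/api/v1/upload"."""
    return BASE_URL.rstrip("/") + "/" + path.lstrip("/")


def auth_headers(api_key: Optional[str] = None) -> dict:
    """Return bearer-auth headers for a request.

    Falling back to the DOCSPEED_API_KEY environment variable is an
    assumption of this sketch, not documented API behavior.
    """
    key = api_key or os.environ.get("DOCSPEED_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Any HTTP client can then send `auth_headers()` with a request to `build_url("/api/v1/upload")`.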
Files and Processing
/api/v1/upload
Upload a reusable file
Upload a PDF or document image once, then reuse the returned `file_id` across parse, extract, schema builder, split, and chat requests.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with the multipart upload request.
Execution behavior
This endpoint is synchronous. A successful response returns a reusable file_id immediately.
Request notes
Upload documents as multipart form data using repeated files fields. Supported production formats are PDF, PNG, JPEG/JPG, TIFF/TIF, WebP, BMP, and GIF. Multi-page TIFF files are treated as page-indexed documents. Word and Excel files are supported through beta APIs; write to [email protected] to use native DOCX/XLSX beta workflows. Store returned file_id values and pass them into downstream parse, extract, or ask endpoints.
Response notes
The response includes files[], where each entry contains a stable file_id, the original filename, content type, size in bytes, page count, and creation timestamp.
Example request
curl -X POST "https://api.docspeed.ai/api/v1/upload" \
-H "Authorization: Bearer YOUR_DOCSPEED_API_KEY" \
-F "files=@document.pdf"
Success response schema
{
"properties": {
"files": {
"items": {
"properties": {
"content_type": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Content Type"
},
"created_at": {
"title": "Created At",
"type": "number"
},
"file_id": {
"title": "File Id",
"type": "string"
},
"filename": {
"title": "Filename",
"type": "string"
},
"page_count": {
"default": 1,
"title": "Page Count",
"type": "integer"
},
"size_bytes": {
"title": "Size Bytes",
"type": "integer"
}
},
"required": [
"file_id",
"filename",
"size_bytes",
"created_at"
],
"title": "FileRef",
"type": "object"
},
"title": "Files",
"type": "array"
}
},
"title": "UploadFilesResponse",
"type": "object"
}
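As a rough sketch of how a client might store the returned references, the Python function below pulls `(filename, file_id)` pairs out of a decoded upload response. The function name and the sample values are invented for illustration; only the field names come from the schema above.

```python
def file_ids_from_upload(response: dict) -> list:
    """Return (filename, file_id) pairs from a decoded upload response.

    Pairs are returned in response order. A list is used instead of a
    dict so that two uploads sharing a filename do not overwrite each other.
    """
    return [
        (entry["filename"], entry["file_id"])
        for entry in response.get("files", [])
    ]


# Hypothetical response body shaped like UploadFilesResponse.
sample = {
    "files": [
        {
            "file_id": "file_123",
            "filename": "invoice.pdf",
            "size_bytes": 48213,
            "created_at": 1700000000.0,
            "page_count": 3,
        }
    ]
}
print(file_ids_from_upload(sample))  # [('invoice.pdf', 'file_123')]
```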
/v1/files/{file_id}
Inspect uploaded file metadata
Retrieve metadata for a previously uploaded file without re-uploading the payload.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> on every metadata lookup.
Execution behavior
This endpoint is synchronous and returns the file metadata inline.
Request notes
Use the file_id returned from POST /api/v1/upload.
Response notes
The response confirms the filename, content type, size, and creation timestamp for the stored file.
Example request
GET /v1/files/{file_id}
Authorization: Bearer YOUR_DOCSPEED_API_KEY
Success response schema
{
"properties": {
"content_type": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Content Type"
},
"created_at": {
"title": "Created At",
"type": "number"
},
"file_id": {
"title": "File Id",
"type": "string"
},
"filename": {
"title": "Filename",
"type": "string"
},
"page_count": {
"default": 1,
"title": "Page Count",
"type": "integer"
},
"size_bytes": {
"title": "Size Bytes",
"type": "integer"
}
},
"required": [
"file_id",
"filename",
"size_bytes",
"created_at"
],
"title": "FileRef",
"type": "object"
}
/v1/files/{file_id}/pages/{page_index}/image
Fetch a rendered page image
Retrieve a single rendered page image for visual review, debugging, and source-grounded UI workflows.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with the page image request.
Execution behavior
This endpoint is synchronous and returns the rendered image bytes for the requested page.
Request notes
Use the file_id from the upload response and a zero-based page_index.
Response notes
A successful response returns the page image directly so your application can render or inspect the source page.
Example request
GET /v1/files/{file_id}/pages/{page_index}/image
Authorization: Bearer YOUR_DOCSPEED_API_KEY
Success response schema
{}
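Because page_index is zero-based, a client that knows a file's page_count (from the file metadata) can enumerate every page image URL. The sketch below assumes the path shown above is joined directly to the canonical base URL; the function name is hypothetical.

```python
from urllib.parse import quote

BASE_URL = "https://api.docspeed.ai"


def page_image_urls(file_id: str, page_count: int) -> list:
    """Return one rendered-page URL per page, using zero-based indexes."""
    safe_id = quote(file_id, safe="")  # guard against unusual characters
    return [
        f"{BASE_URL}/v1/files/{safe_id}/pages/{page_index}/image"
        for page_index in range(page_count)
    ]
```

For a three-page file, this yields URLs for page indexes 0, 1, and 2. Each request still needs the bearer Authorization header.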
/api/v1/parse
Parse documents into grounded structure
Convert PDFs and images into clean markdown chunks with drawable page regions for downstream extraction and review.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every parse request.
Execution modes
Set execution_mode to sync for inline JSON or async to receive a job_id and poll the jobs endpoints.
Request notes
Pass either a previously uploaded file_id or inline content via the input object. Use profile: grounded for richer support-aware parse output or profile: fast when agents need a faster hosted ingest path. Parse responses always expose region-level evidence through regions[] and chunks[].region_ids.
Response notes
Sync mode returns only doc_id, version, markdown, pages, regions, chunks, and customer-meaningful metadata. chunks[] are the canonical parsed blocks. regions[] are the only drawable geometry source, and chunk.region_ids maps each block to the page regions. The response does not include OCR implementation metadata, line ids, spans, artifacts, split shims, or top-level grounding objects.
Example request
{
"execution_mode": "sync",
"grounding": "ocr_lines",
"include_artifacts": true,
"input": {
"file_id": "file_123"
},
"profile": "grounded"
}
Success response schema
{
"properties": {
"chunks": {
"items": {
"properties": {
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"cells": {
"items": {
"additionalProperties": true,
"type": "object"
},
"title": "Cells",
"type": "array"
},
"chunk_id": {
"title": "Chunk Id",
"type": "string"
},
"display_label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Display Label"
},
"grid_col_count": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Grid Col Count"
},
"grid_row_count": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Grid Row Count"
},
"header_row_count": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Header Row Count"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"mapping_quality": {
"anyOf": [
{
"enum": [
"exact",
"heuristic",
"fallback"
],
"type": "string"
},
{
"type": "null"
}
],
"title": "Mapping Quality"
},
"markdown": {
"title": "Markdown",
"type": "string"
},
"page_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Page Index"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"render_grid": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Render Grid"
},
"table_html": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Table Html"
},
"type": {
"default": "text",
"title": "Type",
"type": "string"
}
},
"required": [
"chunk_id",
"markdown"
],
"title": "ParseChunk",
"type": "object"
},
"title": "Chunks",
"type": "array"
},
"doc_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Doc Id"
},
"markdown": {
"title": "Markdown",
"type": "string"
},
"metadata": {
"additionalProperties": true,
"title": "Metadata",
"type": "object"
},
"pages": {
"items": {
"properties": {
"height": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Height"
},
"image_ref": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Image Ref"
},
"ocr_median_line_height": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Ocr Median Line Height"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"width": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Width"
}
},
"required": [
"page_index"
],
"title": "ParsePage",
"type": "object"
},
"title": "Pages",
"type": "array"
},
"regions": {
"items": {
"properties": {
"display_label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Display Label"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_id": {
"title": "Region Id",
"type": "string"
},
"shape": {
"additionalProperties": true,
"title": "Shape",
"type": "object"
}
},
"required": [
"region_id",
"page_index",
"shape"
],
"title": "GroundingRegion",
"type": "object"
},
"title": "Regions",
"type": "array"
},
"version": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Version"
}
},
"required": [
"markdown"
],
"title": "ParseResponse",
"type": "object"
}
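To illustrate how chunks[].region_ids relates chunks to regions[], the Python sketch below resolves each chunk to the region objects it references. The function name is an assumption, and the contents of shape are not specified on this page, so the code treats them as opaque.

```python
def regions_by_chunk(parse_response: dict) -> dict:
    """Map each chunk_id to the list of region objects it references."""
    index = {r["region_id"]: r for r in parse_response.get("regions", [])}
    resolved = {}
    for chunk in parse_response.get("chunks", []):
        # Ids without a matching region are skipped rather than raising.
        resolved[chunk["chunk_id"]] = [
            index[rid] for rid in chunk.get("region_ids", []) if rid in index
        ]
    return resolved
```

A review UI could use the result to draw every region belonging to a selected chunk on the corresponding page image.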
/v1/layout
Detect layout regions
Return OCR-grounded, detector-assisted layout components in reading order.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every layout request.
Execution modes
Use execution_mode: sync for inline region JSON or async for durable job tracking.
Request notes
Use mode: non_overlapping for compact page-level layout components with explicit reading order.
Response notes
Responses include page indexes and ordered layout components with stable ids, component types, bounding boxes, and scores.
Example request
{
"execution_mode": "sync",
"input": {
"file_id": "file_123"
},
"mode": "non_overlapping"
}
Success response schema
{
"properties": {
"mode": {
"enum": [
"nested",
"non_overlapping"
],
"title": "Mode",
"type": "string"
},
"pages": {
"items": {
"properties": {
"components": {
"items": {
"properties": {
"bbox": {
"items": {
"type": "integer"
},
"title": "Bbox",
"type": "array"
},
"id": {
"title": "Id",
"type": "string"
},
"reading_order_rank": {
"title": "Reading Order Rank",
"type": "integer"
},
"score": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Score"
},
"type": {
"title": "Type",
"type": "string"
}
},
"required": [
"id",
"reading_order_rank",
"type",
"bbox"
],
"title": "LayoutComponent",
"type": "object"
},
"title": "Components",
"type": "array"
},
"height": {
"title": "Height",
"type": "integer"
},
"image_ref": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Image Ref"
},
"ocr_median_line_height": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Ocr Median Line Height"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"width": {
"title": "Width",
"type": "integer"
}
},
"required": [
"page_index",
"width",
"height"
],
"title": "LayoutPage",
"type": "object"
},
"title": "Pages",
"type": "array"
}
},
"required": [
"mode"
],
"title": "LayoutResponse",
"type": "object"
}
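A client that wants components in reading order could sort on reading_order_rank, as in this sketch. It assumes that a lower rank means earlier in the reading order and that ranks are compared within a page; neither detail is spelled out above, and the function name is hypothetical.

```python
def components_in_reading_order(layout_response: dict) -> list:
    """Return (page_index, component) pairs sorted by page, then by rank."""
    ordered = []
    pages = sorted(layout_response.get("pages", []), key=lambda p: p["page_index"])
    for page in pages:
        components = sorted(
            page.get("components", []),
            key=lambda c: c["reading_order_rank"],  # assumed: lower is earlier
        )
        ordered.extend((page["page_index"], c) for c in components)
    return ordered
```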
/v1/ocr
Run grounded OCR
Extract OCR text plus stable line-level geometry from PDFs and images.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every OCR request.
Execution modes
Use sync for inline OCR JSON and async when OCR is part of a queued workflow.
Request notes
OCR accepts uploaded files or inline base64 inputs. Keep language: auto unless you have a known language constraint.
Response notes
The response returns page text plus line-level boxes and polygons for grounded downstream review.
Example request
{
"execution_mode": "sync",
"input": {
"file_id": "file_123"
},
"language": "auto",
"mode": "regular"
}
Success response schema
{
"properties": {
"language": {
"enum": [
"auto",
"global"
],
"title": "Language",
"type": "string"
},
"mode": {
"enum": [
"regular",
"classic"
],
"title": "Mode",
"type": "string"
},
"pages": {
"items": {
"properties": {
"height": {
"title": "Height",
"type": "integer"
},
"lines": {
"items": {
"properties": {
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"bbox_id": {
"title": "Bbox Id",
"type": "string"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"polygon": {
"anyOf": [
{
"items": {
"type": "number"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Polygon"
},
"text": {
"title": "Text",
"type": "string"
}
},
"required": [
"bbox_id",
"text",
"page_index"
],
"title": "OcrLine",
"type": "object"
},
"title": "Lines",
"type": "array"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"text": {
"title": "Text",
"type": "string"
},
"width": {
"title": "Width",
"type": "integer"
}
},
"required": [
"page_index",
"width",
"height",
"text"
],
"title": "OcrPage",
"type": "object"
},
"title": "Pages",
"type": "array"
}
},
"required": [
"language",
"mode"
],
"title": "OcrResponse",
"type": "object"
}
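Since bbox and polygon are both nullable in the schema above, downstream review code may want only the lines that carry geometry. The following sketch (function name assumed) filters an OCR response accordingly.

```python
def grounded_lines(ocr_response: dict) -> list:
    """Return (page_index, bbox_id, text) for lines that have geometry."""
    out = []
    for page in ocr_response.get("pages", []):
        for line in page.get("lines", []):
            has_geometry = (
                line.get("bbox") is not None or line.get("polygon") is not None
            )
            if has_geometry:
                out.append((line["page_index"], line["bbox_id"], line["text"]))
    return out
```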
/v1/classify
Classify documents or pages
Classify markdown, PDFs, and images at the document or page level with optional debug detail.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every classify request.
Execution modes
Use sync for inline routing results or async when classification is part of a larger queue.
Request notes
Set classification_scope to document for one label per file or page for packet-routing workflows. Keep verbosity: none for production payloads.
Response notes
The response includes the predicted class and, when requested, page-level routing detail that downstream systems can use for automation.
Example request
{
"classification_scope": "page",
"execution_mode": "sync",
"input": {
"file_id": "file_123"
},
"input_type": "pdf",
"verbosity": "none"
}
Success response schema
{
"properties": {
"classification_metadata": {
"properties": {
"budget_limited": {
"default": false,
"title": "Budget Limited",
"type": "boolean"
},
"class_count": {
"default": 0,
"title": "Class Count",
"type": "integer"
},
"classification_scope": {
"default": "document",
"enum": [
"document",
"page"
],
"title": "Classification Scope",
"type": "string"
},
"processed_pages": {
"default": 1,
"title": "Processed Pages",
"type": "integer"
},
"sampled_page_indexes": {
"items": {
"type": "integer"
},
"title": "Sampled Page Indexes",
"type": "array"
},
"total_pages": {
"default": 1,
"title": "Total Pages",
"type": "integer"
},
"verbosity": {
"default": "none",
"enum": [
"none",
"debug"
],
"title": "Verbosity",
"type": "string"
}
},
"title": "ClassificationMetadata",
"type": "object"
},
"debug": {
"anyOf": [
{
"properties": {
"sampled_page_classifications": {
"items": {
"properties": {
"components_present": {
"anyOf": [
{
"items": {
"type": "string"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Components Present"
},
"confidence": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Confidence"
},
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"document_identity_hints": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Document Identity Hints"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"page_role": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Page Role"
}
},
"required": [
"page_index",
"doc_class"
],
"title": "ClassificationPage",
"type": "object"
},
"title": "Sampled Page Classifications",
"type": "array"
}
},
"title": "ClassificationDebug",
"type": "object"
},
{
"type": "null"
}
]
},
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"document_identity_hints": {
"additionalProperties": true,
"title": "Document Identity Hints",
"type": "object"
},
"input_type": {
"default": "auto",
"enum": [
"markdown",
"image",
"pdf",
"text",
"auto"
],
"title": "Input Type",
"type": "string"
},
"page_classifications": {
"items": {
"properties": {
"components_present": {
"anyOf": [
{
"items": {
"type": "string"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Components Present"
},
"confidence": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Confidence"
},
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"document_identity_hints": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Document Identity Hints"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"page_role": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Page Role"
}
},
"required": [
"page_index",
"doc_class"
],
"title": "ClassificationPage",
"type": "object"
},
"title": "Page Classifications",
"type": "array"
},
"page_count": {
"default": 1,
"title": "Page Count",
"type": "integer"
},
"page_dimensions": {
"items": {
"properties": {
"height": {
"title": "Height",
"type": "integer"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"width": {
"title": "Width",
"type": "integer"
}
},
"required": [
"page_index",
"width",
"height"
],
"title": "PageDimension",
"type": "object"
},
"title": "Page Dimensions",
"type": "array"
}
},
"required": [
"doc_class"
],
"title": "ClassificationResult",
"type": "object"
}
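For packet-routing workflows with classification_scope set to page, one plausible post-processing step is to group page indexes by predicted class. The sketch below does this with the page_classifications array; the function name and any class labels used with it are illustrative.

```python
from collections import defaultdict


def pages_by_class(result: dict) -> dict:
    """Group page indexes by doc_class from page_classifications."""
    groups = defaultdict(list)
    for page in result.get("page_classifications", []):
        groups[page["doc_class"]].append(page["page_index"])
    # Sort indexes so each group reads in page order.
    return {doc_class: sorted(indexes) for doc_class, indexes in groups.items()}
```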
/v1/split
Split mixed document packets
Split a multi-document PDF packet into logical child files with reusable file references.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every split request.
Execution modes
Use sync for immediate split plans and child file references or async for long-running packet workloads.
Request notes
Pass a packet via input and keep verbosity: none unless you need debugging detail for evaluation.
Response notes
The response returns the inferred document boundaries and generated child file references that can be reused downstream.
Example request
{
"execution_mode": "sync",
"input": {
"file_id": "file_123"
},
"verbosity": "none"
}
Success response schema
{
"properties": {
"documents": {
"items": {
"properties": {
"confidence": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"title": "Confidence"
},
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"end_page": {
"title": "End Page",
"type": "integer"
},
"file_ref": {
"anyOf": [
{
"properties": {
"content_type": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Content Type"
},
"created_at": {
"title": "Created At",
"type": "number"
},
"file_id": {
"title": "File Id",
"type": "string"
},
"filename": {
"title": "Filename",
"type": "string"
},
"page_count": {
"default": 1,
"title": "Page Count",
"type": "integer"
},
"size_bytes": {
"title": "Size Bytes",
"type": "integer"
}
},
"required": [
"file_id",
"filename",
"size_bytes",
"created_at"
],
"title": "FileRef",
"type": "object"
},
{
"type": "null"
}
]
},
"reason": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Reason"
},
"start_page": {
"title": "Start Page",
"type": "integer"
},
"title": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Title"
}
},
"required": [
"start_page",
"end_page",
"doc_class"
],
"title": "SplitSegment",
"type": "object"
},
"title": "Documents",
"type": "array"
},
"page_summaries": {
"anyOf": [
{
"items": {
"additionalProperties": true,
"type": "object"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Page Summaries"
},
"split_metadata": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Split Metadata"
},
"total_pages": {
"title": "Total Pages",
"type": "integer"
}
},
"required": [
"total_pages",
"documents"
],
"title": "SplitPlan",
"type": "object"
}
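Because file_ref is nullable in the schema above, code that reuses child files downstream should tolerate segments without one. This sketch (function name assumed) lists each segment with its child file_id, or None when no reference was generated. It makes no assumption about whether start_page and end_page are zero-based or inclusive, since that is not stated here.

```python
def child_files(split_plan: dict) -> list:
    """Return (doc_class, start_page, end_page, file_id_or_None) per segment."""
    rows = []
    for segment in split_plan.get("documents", []):
        file_ref = segment.get("file_ref") or {}  # file_ref may be null
        rows.append(
            (
                segment["doc_class"],
                segment["start_page"],
                segment["end_page"],
                file_ref.get("file_id"),
            )
        )
    return rows
```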
/v1/tables
Extract tables with cell grounding
Return HTML tables, per-cell grounding, and page provenance for table-heavy documents.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every table extraction request.
Execution modes
Use sync for inline table JSON or async when table extraction is part of a batch workflow.
Request notes
Choose grounding: cell when reviewers need source-linked cell evidence. This is the best fit for AP and reconciliation workflows.
Response notes
Responses include HTML renderings, cell-level structures, page provenance, and grounding regions for each extracted table.
Example request
{
"boundary_source": "detector",
"execution_mode": "sync",
"grounding": "cell",
"input": {
"file_id": "file_123"
}
}
Success response schema
{
"properties": {
"grounding": {
"enum": [
"none",
"table",
"cell"
],
"title": "Grounding",
"type": "string"
},
"grounding_regions": {
"anyOf": [
{
"items": {
"properties": {
"display_label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Display Label"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_id": {
"title": "Region Id",
"type": "string"
},
"shape": {
"additionalProperties": true,
"title": "Shape",
"type": "object"
}
},
"required": [
"region_id",
"page_index",
"shape"
],
"title": "GroundingRegion",
"type": "object"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Grounding Regions"
},
"tables": {
"items": {
"properties": {
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"cells": {
"items": {
"properties": {
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"bbox_ids": {
"items": {
"type": "string"
},
"title": "Bbox Ids",
"type": "array"
},
"cell_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Cell Id"
},
"col": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Col"
},
"col_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Col Index"
},
"colspan": {
"default": 1,
"title": "Colspan",
"type": "integer"
},
"effective_region_ids": {
"items": {
"type": "string"
},
"title": "Effective Region Ids",
"type": "array"
},
"html": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Html"
},
"is_header": {
"default": false,
"title": "Is Header",
"type": "boolean"
},
"page_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Page Index"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"row": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Row"
},
"row_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Row Index"
},
"rowspan": {
"default": 1,
"title": "Rowspan",
"type": "integer"
},
"source_order": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Source Order"
},
"text": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Text"
},
"value": {
"title": "Value",
"type": "string"
}
},
"required": [
"value"
],
"title": "TableCell",
"type": "object"
},
"title": "Cells",
"type": "array"
},
"grid_col_count": {
"default": 0,
"title": "Grid Col Count",
"type": "integer"
},
"grid_row_count": {
"default": 0,
"title": "Grid Row Count",
"type": "integer"
},
"header_row_count": {
"default": 0,
"title": "Header Row Count",
"type": "integer"
},
"html": {
"title": "Html",
"type": "string"
},
"mapping_quality": {
"default": "fallback",
"enum": [
"exact",
"heuristic",
"fallback"
],
"title": "Mapping Quality",
"type": "string"
},
"page_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Page Index"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"region_type": {
"default": "table_body",
"title": "Region Type",
"type": "string"
},
"render_grid": {
"properties": {
"rows": {
"items": {
"properties": {
"cells": {
"items": {
"properties": {
"cell_id": {
"title": "Cell Id",
"type": "string"
},
"col_index": {
"title": "Col Index",
"type": "integer"
},
"colspan": {
"default": 1,
"title": "Colspan",
"type": "integer"
},
"effective_region_ids": {
"items": {
"type": "string"
},
"title": "Effective Region Ids",
"type": "array"
},
"is_header": {
"default": false,
"title": "Is Header",
"type": "boolean"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"row_index": {
"title": "Row Index",
"type": "integer"
},
"rowspan": {
"default": 1,
"title": "Rowspan",
"type": "integer"
},
"source_order": {
"title": "Source Order",
"type": "integer"
},
"value": {
"default": "",
"title": "Value",
"type": "string"
}
},
"required": [
"cell_id",
"source_order",
"row_index",
"col_index"
],
"title": "TableRenderGridCell",
"type": "object"
},
"title": "Cells",
"type": "array"
},
"is_header": {
"default": false,
"title": "Is Header",
"type": "boolean"
},
"row_index": {
"title": "Row Index",
"type": "integer"
}
},
"required": [
"row_index"
],
"title": "TableRenderGridRow",
"type": "object"
},
"title": "Rows",
"type": "array"
}
},
"title": "TableRenderGrid",
"type": "object"
},
"source_page_indexes": {
"items": {
"type": "integer"
},
"title": "Source Page Indexes",
"type": "array"
}
},
"required": [
"html"
],
"title": "TableResult",
"type": "object"
},
"title": "Tables",
"type": "array"
}
},
"required": [
"grounding"
],
"title": "TableExtractResponse",
"type": "object"
}
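As an alternative to parsing the html field, a client could rebuild a simple two-dimensional grid from cells[]. The sketch below assumes row_index and col_index are zero-based, which is not confirmed on this page. It skips cells that lack either index and does not expand rowspan or colspan. The function name is hypothetical.

```python
def table_to_grid(table: dict) -> list:
    """Place cell values on a 2D list using row_index and col_index."""
    placed = [
        c for c in table.get("cells", [])
        if c.get("row_index") is not None and c.get("col_index") is not None
    ]
    # Use the reported grid size, but grow it if a cell falls outside.
    n_rows = max([table.get("grid_row_count") or 0] + [c["row_index"] + 1 for c in placed])
    n_cols = max([table.get("grid_col_count") or 0] + [c["col_index"] + 1 for c in placed])
    grid = [[""] * n_cols for _ in range(n_rows)]
    for c in placed:
        grid[c["row_index"]][c["col_index"]] = c["value"]
    return grid
```

Merged cells appear only at their anchor position in this sketch; the render_grid structure or the html rendering is a better source when spans matter.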
/v1/equations
Extract equations and formulae
Detect and serialize equations from images and PDFs while preserving source-linked provenance.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every equation request.
Execution modes
Use sync for inline equation payloads or async for queued processing.
Request notes
Provide either a file_id or inline content under input.
Response notes
Responses include extracted LaTeX, confidence, and any available source grounding for each equation.
Example request
{
"boundary_source": "detector",
"execution_mode": "sync",
"input": {
"file_id": "file_123"
}
}
Success response schema
{
"properties": {
"engine": {
"default": "equation_codegen",
"enum": [
"equation_codegen",
"parse_fallback"
],
"title": "Engine",
"type": "string"
},
"equations": {
"items": {
"properties": {
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"bbox_ids": {
"items": {
"type": "string"
},
"title": "Bbox Ids",
"type": "array"
},
"confidence": {
"default": 0.0,
"title": "Confidence",
"type": "number"
},
"display_mode": {
"default": "block",
"enum": [
"inline",
"block"
],
"title": "Display Mode",
"type": "string"
},
"equation_id": {
"title": "Equation Id",
"type": "string"
},
"latex": {
"default": "",
"title": "Latex",
"type": "string"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"source_text": {
"default": "",
"title": "Source Text",
"type": "string"
}
},
"required": [
"equation_id",
"page_index"
],
"title": "EquationResult",
"type": "object"
},
"title": "Equations",
"type": "array"
},
"grounding_regions": {
"items": {
"properties": {
"display_label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Display Label"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_id": {
"title": "Region Id",
"type": "string"
},
"shape": {
"additionalProperties": true,
"title": "Shape",
"type": "object"
}
},
"required": [
"region_id",
"page_index",
"shape"
],
"title": "GroundingRegion",
"type": "object"
},
"title": "Grounding Regions",
"type": "array"
},
"pages": {
"default": 0,
"title": "Pages",
"type": "integer"
}
},
"title": "EquationExtractResponse",
"type": "object"
}
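One way to consume the equations array is to wrap each LaTeX string in math delimiters chosen by display_mode. In the sketch below, the `$` and `$$` delimiters and the 0.5 confidence threshold are arbitrary choices made for illustration, and the function name is hypothetical.

```python
def render_equations(response: dict, min_confidence: float = 0.5) -> list:
    """Wrap LaTeX in $...$ (inline) or $$...$$ (block) delimiters.

    Equations with empty LaTeX or confidence below the threshold are dropped.
    """
    rendered = []
    for equation in response.get("equations", []):
        latex = equation.get("latex", "")
        if not latex or equation.get("confidence", 0.0) < min_confidence:
            continue
        # display_mode defaults to "block" in the schema.
        delimiter = "$" if equation.get("display_mode", "block") == "inline" else "$$"
        rendered.append(f"{delimiter}{latex}{delimiter}")
    return rendered
```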
/v1/charts
Describe charts, figures, and diagrams
Summarize visual charts and diagrams while preserving source-linked page regions.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every chart request.
Execution modes
Use sync for immediate JSON output or async for queued visual workloads.
Request notes
Chart requests accept the same input shapes as parse and OCR.
Response notes
Responses include summaries, extracted series data when available, and page/region provenance for the detected visual item.
Example request
{
"execution_mode": "sync",
"input": {
"file_id": "file_123"
}
}
Success response schema
{
"properties": {
"engine": {
"default": "chart_codegen",
"enum": [
"chart_codegen",
"parse_fallback"
],
"title": "Engine",
"type": "string"
},
"grounding_regions": {
"items": {
"properties": {
"display_label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Display Label"
},
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_id": {
"title": "Region Id",
"type": "string"
},
"shape": {
"additionalProperties": true,
"title": "Shape",
"type": "object"
}
},
"required": [
"region_id",
"page_index",
"shape"
],
"title": "GroundingRegion",
"type": "object"
},
"title": "Grounding Regions",
"type": "array"
},
"items": {
"items": {
"properties": {
"axes": {
"properties": {
"x_label": {
"default": "",
"title": "X Label",
"type": "string"
},
"y_label": {
"default": "",
"title": "Y Label",
"type": "string"
}
},
"title": "ChartAxes",
"type": "object"
},
"bbox": {
"anyOf": [
{
"items": {
"type": "integer"
},
"type": "array"
},
{
"type": "null"
}
],
"title": "Bbox"
},
"bbox_ids": {
"items": {
"type": "string"
},
"title": "Bbox Ids",
"type": "array"
},
"confidence": {
"default": 0.0,
"title": "Confidence",
"type": "number"
},
"insights": {
"items": {
"type": "string"
},
"title": "Insights",
"type": "array"
},
"item_id": {
"title": "Item Id",
"type": "string"
},
"kind": {
"default": "figure",
"enum": [
"chart",
"figure",
"diagram",
"flowchart",
"engineering_diagram"
],
"title": "Kind",
"type": "string"
},
"mermaid": {
"default": "",
"title": "Mermaid",
"type": "string"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"region_ids": {
"items": {
"type": "string"
},
"title": "Region Ids",
"type": "array"
},
"series": {
"items": {
"properties": {
"name": {
"default": "",
"title": "Name",
"type": "string"
},
"points": {
"items": {
"properties": {
"label": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Label"
},
"x": {
"anyOf": [
{},
{
"type": "null"
}
],
"title": "X"
},
"y": {
"anyOf": [
{},
{
"type": "null"
}
],
"title": "Y"
}
},
"title": "ChartSeriesPoint",
"type": "object"
},
"title": "Points",
"type": "array"
}
},
"title": "ChartSeries",
"type": "object"
},
"title": "Series",
"type": "array"
},
"summary": {
"default": "",
"title": "Summary",
"type": "string"
},
"title": {
"default": "",
"title": "Title",
"type": "string"
}
},
"required": [
"item_id",
"page_index"
],
"title": "ChartItem",
"type": "object"
},
"title": "Items",
"type": "array"
},
"pages": {
"default": 0,
"title": "Pages",
"type": "integer"
}
},
"title": "ChartDescribeResponse",
"type": "object"
}
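When series data is present, it can be flattened into rows for export or charting. The sketch below walks items[].series[].points[] as described by the schema above. The function name is an assumption, and the x and y values are passed through untyped because the schema does not constrain them.

```python
def series_rows(response: dict) -> list:
    """Flatten chart series into (item_id, series_name, label, x, y) tuples."""
    rows = []
    for item in response.get("items", []):
        for series in item.get("series", []):
            for point in series.get("points", []):
                rows.append(
                    (
                        item["item_id"],
                        series.get("name", ""),
                        point.get("label"),
                        point.get("x"),
                        point.get("y"),
                    )
                )
    return rows
```

Items without extracted series, such as diagrams that only carry a summary or mermaid text, contribute no rows.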
Extraction, Chat, and Jobs
/v1/schema/auto
Generate schemas from instructions or documents
Build extraction schemas from text instructions, a single document, or a multi-document set with shared and emerging-field visibility.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every schema builder request.
Execution modes
Use sync during schema design sessions or async when you are generating schemas over larger document sets.
Request notes
Use instructions for zero-document schema design, inputs with one document for single-document schema generation, or multiple inputs to build a global schema across a document set.
Response notes
Responses return the generated schema object directly so it can be copied and reused as-is in extraction requests.
Example request
Single-document example
{
"execution_mode": "sync",
"inputs": [
{
"file_id": "file_123"
}
]
}
Zero-document example
{
"execution_mode": "sync",
"instructions": "Invoice documents with line items, totals, taxes, supplier name, invoice number, and due date."
}
Success response schema
{
"properties": {
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"fields": {
"items": {
"properties": {
"array": {
"default": false,
"title": "Array",
"type": "boolean"
},
"description": {
"title": "Description",
"type": "string"
},
"key": {
"title": "Key",
"type": "string"
},
"type": {
"default": "string",
"enum": [
"string",
"number",
"integer",
"boolean"
],
"title": "Type",
"type": "string"
}
},
"required": [
"key",
"description"
],
"title": "SchemaV2Field",
"type": "object"
},
"title": "Fields",
"type": "array"
},
"tables": {
"items": {
"properties": {
"description": {
"default": "",
"title": "Description",
"type": "string"
},
"fields": {
"items": {
"properties": {
"array": {
"default": false,
"title": "Array",
"type": "boolean"
},
"description": {
"title": "Description",
"type": "string"
},
"key": {
"title": "Key",
"type": "string"
},
"type": {
"default": "string",
"enum": [
"string",
"number",
"integer",
"boolean"
],
"title": "Type",
"type": "string"
}
},
"required": [
"key",
"description"
],
"title": "SchemaV2Field",
"type": "object"
},
"title": "Fields",
"type": "array"
},
"key": {
"title": "Key",
"type": "string"
}
},
"required": [
"key"
],
"title": "SchemaV2Table",
"type": "object"
},
"title": "Tables",
"type": "array"
}
},
"required": [
"doc_class"
],
"title": "SchemaV2Payload",
"type": "object"
}
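The three request shapes described in the request notes (instructions only, one input, several inputs) can be assembled with a small helper. This is a minimal sketch, not part of the Docspeed API: build_schema_auto_body is a hypothetical name, and the reference does not say whether instructions and inputs may be combined, so the helper passes both through without judging that case.

```python
from typing import Optional, Sequence


def build_schema_auto_body(
    instructions: Optional[str] = None,
    file_ids: Sequence[str] = (),
    execution_mode: str = "sync",
) -> dict:
    """Assemble a JSON body for /v1/schema/auto (hypothetical helper)."""
    if execution_mode not in ("sync", "async"):
        raise ValueError("execution_mode must be 'sync' or 'async'")
    if not instructions and not file_ids:
        raise ValueError("provide instructions, file_ids, or both")

    body: dict = {"execution_mode": execution_mode}
    if instructions:
        # Zero-document schema design from a text description.
        body["instructions"] = instructions
    if file_ids:
        # One entry yields a single-document schema; several entries
        # request a global schema across the document set.
        body["inputs"] = [{"file_id": fid} for fid in file_ids]
    return body
```

Calling build_schema_auto_body(file_ids=["file_123"]) reproduces the single-document example body shown above.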
/v1/extract
Run grounded extraction against a schema
Execute schema-based extraction with grounded values and repeated-row support.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every extraction request.
Execution modes
Use sync for inline extraction JSON or async when extraction should run as a tracked job.
Request notes
Pass the document in input, the extraction contract in schema, and choose grounding based on the review experience you want to build.
Response notes
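Responses include the applied schema, the grounding_mode, and a result object holding the extracted values (for the invoice example: invoice-level fields and line-item structures) with grounding references for values and repeated rows.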
Example request
{
"execution_mode": "sync",
"grounding": "cell",
"input": {
"file_id": "file_123"
},
"schema": {
"doc_class": "invoice",
"invoice_level_fields": {
"invoice_number": "string: invoice identifier",
"invoice_total": "number: full invoice total",
"supplier_name": "string: supplier or vendor"
},
"line_item_structures": {
"line_item_notes": {
"description": "Notes tied to line items.",
"target_fields": []
},
"line_items": {
"description": "Fields to extract per line item.",
"target_fields": [
"description (string): line item description",
"amount (number): line item amount",
"quantity (number): line item quantity"
]
}
}
}
}
Success response schema
{
"properties": {
"grounding_mode": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "string"
}
],
"title": "Grounding Mode"
},
"page_count": {
"default": 1,
"title": "Page Count",
"type": "integer"
},
"result": {
"additionalProperties": true,
"title": "Result",
"type": "object"
},
"schema": {
"properties": {
"doc_class": {
"title": "Doc Class",
"type": "string"
},
"fields": {
"items": {
"properties": {
"array": {
"default": false,
"title": "Array",
"type": "boolean"
},
"description": {
"title": "Description",
"type": "string"
},
"key": {
"title": "Key",
"type": "string"
},
"type": {
"default": "string",
"enum": [
"string",
"number",
"integer",
"boolean"
],
"title": "Type",
"type": "string"
}
},
"required": [
"key",
"description"
],
"title": "SchemaV2Field",
"type": "object"
},
"title": "Fields",
"type": "array"
},
"tables": {
"items": {
"properties": {
"description": {
"default": "",
"title": "Description",
"type": "string"
},
"fields": {
"items": {
"properties": {
"array": {
"default": false,
"title": "Array",
"type": "boolean"
},
"description": {
"title": "Description",
"type": "string"
},
"key": {
"title": "Key",
"type": "string"
},
"type": {
"default": "string",
"enum": [
"string",
"number",
"integer",
"boolean"
],
"title": "Type",
"type": "string"
}
},
"required": [
"key",
"description"
],
"title": "SchemaV2Field",
"type": "object"
},
"title": "Fields",
"type": "array"
},
"key": {
"title": "Key",
"type": "string"
}
},
"required": [
"key"
],
"title": "SchemaV2Table",
"type": "object"
},
"title": "Tables",
"type": "array"
}
},
"required": [
"doc_class"
],
"title": "SchemaV2Payload",
"type": "object"
}
},
"required": [
"schema",
"result",
"grounding_mode"
],
"title": "ExtractionResult",
"type": "object"
}
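Because the schema builder returns a schema object meant to be reused as-is, a client can check the required keys from the schema above before sending an extraction request. The sketch below is illustrative only: build_extract_body is a hypothetical helper, the "cell" grounding value comes from the example request (other accepted values are not listed in this reference), and it assumes the endpoint accepts the SchemaV2Payload shape directly.

```python
def build_extract_body(
    file_id: str,
    schema: dict,
    *,
    grounding: str,
    execution_mode: str = "sync",
) -> dict:
    """Assemble a JSON body for /v1/extract (hypothetical helper)."""
    # SchemaV2Payload requires doc_class.
    if "doc_class" not in schema:
        raise ValueError("schema is missing required key 'doc_class'")
    # Each SchemaV2Field requires key and description.
    for field in schema.get("fields", []):
        for required in ("key", "description"):
            if required not in field:
                raise ValueError(f"field is missing required key {required!r}")
    # Each SchemaV2Table requires key.
    for table in schema.get("tables", []):
        if "key" not in table:
            raise ValueError("table is missing required key 'key'")

    return {
        "execution_mode": execution_mode,
        "grounding": grounding,
        "input": {"file_id": file_id},
        "schema": schema,  # passed through unchanged
    }
```

Validating locally keeps malformed schemas from consuming an extraction call; the server remains the authority on what it accepts.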
/api/v1/ask
Ask grounded questions on one or more documents
Answer questions against a single document or document set with source-linked citations.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every Ask request.
Execution modes
Use sync for interactive Q&A and async when Ask should run as a background job.
Request notes
Pass one or more documents in inputs, the user question in query, and keep include_grounding enabled when reviewers need citations. Document-backed Ask requests first run full-document Parse for every selected document, then index and retrieve over the parsed markdown and regions. When multiple documents are supplied, Docspeed submits those full-document Parse calls in parallel. For follow-up questions, pass the prior response session_id with the same ordered inputs to reuse prepared parsing and indexing. Generated Ask sessions use the session_id_... prefix.
Response notes
Responses include the answer text, doc_ids, an optional reusable session_id, and grounded citations. Each citation maps back to the submitted file id and exposes drawable regions[] with page_index plus bbox coordinates (x, y, width, height); clients do not need internal layout region ids to render grounding.
Example request
{
"execution_mode": "sync",
"include_grounding": true,
"inputs": [
{
"file_id": "file_123"
},
{
"file_id": "file_456"
}
],
"query": "Which supplier billed the highest amount?"
}
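For follow-up questions, the request notes say to pass the prior response session_id together with the same ordered inputs. A minimal sketch of that step follows. It assumes the request field is also named session_id, which this reference implies but does not show in an example; build_follow_up is a hypothetical helper.

```python
import copy


def build_follow_up(previous_body: dict, previous_response: dict, query: str) -> dict:
    """Build a follow-up Ask body that reuses the earlier session (hypothetical helper)."""
    # Deep-copy so the inputs keep their original order and the
    # earlier request body is left untouched.
    body = copy.deepcopy(previous_body)
    body["query"] = query

    # session_id is optional in the response; only forward it when present.
    session_id = previous_response.get("session_id")
    if session_id:
        body["session_id"] = session_id  # assumed request field name
    return body
```

If the earlier response carried no session_id, the helper simply produces a fresh request, which would be parsed and indexed again.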
Success response schema
{
"properties": {
"answer": {
"title": "Answer",
"type": "string"
},
"citations": {
"items": {
"properties": {
"doc_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Doc Id"
},
"regions": {
"items": {
"properties": {
"bbox": {
"properties": {
"height": {
"title": "Height",
"type": "number"
},
"width": {
"title": "Width",
"type": "number"
},
"x": {
"title": "X",
"type": "number"
},
"y": {
"title": "Y",
"type": "number"
}
},
"required": [
"x",
"y",
"width",
"height"
],
"title": "ChatCitationBBox",
"type": "object"
},
"page_height": {
"title": "Page Height",
"type": "number"
},
"page_index": {
"title": "Page Index",
"type": "integer"
},
"page_width": {
"title": "Page Width",
"type": "number"
}
},
"required": [
"page_index",
"bbox",
"page_width",
"page_height"
],
"title": "ChatCitationRegion",
"type": "object"
},
"title": "Regions",
"type": "array"
}
},
"title": "ChatCitation",
"type": "object"
},
"title": "Citations",
"type": "array"
},
"doc_ids": {
"items": {
"type": "string"
},
"title": "Doc Ids",
"type": "array"
},
"session_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"description": "Reusable session for follow-up Ask requests over the same ordered inputs. Generated values use the session_id_ prefix.",
"title": "Session Id"
}
},
"required": [
"answer"
],
"title": "AskResponse",
"type": "object"
}
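Each ChatCitationRegion carries page_width and page_height alongside its bbox, which is enough to place a highlight on a page image rendered at any size. The sketch below assumes the bbox values are expressed in the same units as page_width and page_height; this reference does not state the unit or the origin convention, so confirm both against a real response before relying on it. region_to_pixels is a hypothetical helper.

```python
def region_to_pixels(region: dict, rendered_width: float, rendered_height: float) -> tuple:
    """Scale a citation region's bbox to a rendered page size (hypothetical helper).

    Returns (x, y, width, height) in rendered pixels.
    """
    # Scale factors from page units to rendered pixels (assumed shared units).
    scale_x = rendered_width / region["page_width"]
    scale_y = rendered_height / region["page_height"]
    bbox = region["bbox"]
    return (
        bbox["x"] * scale_x,
        bbox["y"] * scale_y,
        bbox["width"] * scale_x,
        bbox["height"] * scale_y,
    )


def regions_by_page(citation: dict) -> dict:
    """Group a citation's regions by page_index for per-page drawing."""
    pages: dict = {}
    for region in citation.get("regions", []):
        pages.setdefault(region["page_index"], []).append(region)
    return pages
```

Grouping by page_index first lets a viewer draw all highlights for a page in one pass.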
/v1/agents/tools
List public agent tools
Return the customer-facing Docspeed tool registry available to public agents.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every tools request.
Execution behavior
This endpoint is synchronous and returns the available tool definitions inline.
Request notes
No request body is required.
Response notes
The response lists the public tool names, descriptions, and endpoint mappings available for agent definitions.
Example request
GET /v1/agents/tools
Authorization: Bearer YOUR_DOCSPEED_API_KEY
Success response schema
{
"properties": {
"tools": {
"items": {
"properties": {
"description": {
"title": "Description",
"type": "string"
},
"endpoint": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Endpoint"
},
"mcp_name": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Mcp Name"
},
"name": {
"title": "Name",
"type": "string"
}
},
"required": [
"name",
"description"
],
"title": "ToolDefinition",
"type": "object"
},
"title": "Tools",
"type": "array"
}
},
"title": "AgentToolsResponse",
"type": "object"
}
/v1/agents
Create a public Docspeed agent
Persist an agent definition that can call customer-facing Docspeed tools.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every agent creation request.
Execution behavior
This endpoint is synchronous and returns the stored agent definition inline.
Request notes
Choose a focused set of tool_names from GET /v1/agents/tools and provide instructions that describe how the agent should use them.
Response notes
The response confirms the created agent definition and the tool bindings that were persisted.
Example request
{
"description": "Agent definition that can parse, extract, and answer questions over documents.",
"instructions": "Use the available tools to parse documents, extract fields, and answer grounded questions.",
"name": "document-ops-agent",
"tool_names": [
"parse",
"extract",
"ask"
]
}
Success response schema
{
"properties": {
"agent_id": {
"title": "Agent Id",
"type": "string"
},
"created_at": {
"title": "Created At",
"type": "number"
},
"description": {
"title": "Description",
"type": "string"
},
"instructions": {
"title": "Instructions",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"tools": {
"items": {
"properties": {
"description": {
"title": "Description",
"type": "string"
},
"endpoint": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Endpoint"
},
"mcp_name": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Mcp Name"
},
"name": {
"title": "Name",
"type": "string"
}
},
"required": [
"name",
"description"
],
"title": "ToolDefinition",
"type": "object"
},
"title": "Tools",
"type": "array"
}
},
"required": [
"agent_id",
"name",
"description",
"instructions",
"created_at"
],
"title": "AgentCreateResponse",
"type": "object"
}
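Since tool_names must come from the tool registry, a client can check them against an AgentToolsResponse before creating the agent. This is a sketch under that assumption; build_agent_body is a hypothetical helper, and how the server reacts to an unknown tool name is not described in this reference.

```python
from typing import Sequence


def build_agent_body(
    name: str,
    description: str,
    instructions: str,
    tool_names: Sequence[str],
    tools_response: dict,
) -> dict:
    """Assemble a JSON body for agent creation (hypothetical helper)."""
    # Names published by the tools endpoint (ToolDefinition.name).
    available = {tool["name"] for tool in tools_response.get("tools", [])}
    unknown = [t for t in tool_names if t not in available]
    if unknown:
        raise ValueError(f"unknown tool names: {unknown}")

    return {
        "description": description,
        "instructions": instructions,
        "name": name,
        "tool_names": list(tool_names),
    }
```

With a registry listing parse, extract, and ask, the helper reproduces the example request body above.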
/v1/jobs/{job_id}
Check async job status
Inspect the status of an async parse, extraction, schema, split, or chat job.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every jobs status request.
Execution behavior
This endpoint is synchronous and returns the current job metadata inline.
Request notes
Use the job_id returned by any endpoint invoked with execution_mode: async.
Response notes
The response reports the operation, execution mode, status, and timestamps you need for polling.
Example request
GET /v1/jobs/{job_id}
Authorization: Bearer YOUR_DOCSPEED_API_KEY
Success response schema
{
"properties": {
"created_at": {
"title": "Created At",
"type": "number"
},
"execution_mode": {
"enum": [
"async",
"sync"
],
"title": "Execution Mode",
"type": "string"
},
"job_id": {
"title": "Job Id",
"type": "string"
},
"operation": {
"title": "Operation",
"type": "string"
},
"status": {
"enum": [
"pending",
"processing",
"completed",
"failed"
],
"title": "Status",
"type": "string"
},
"updated_at": {
"title": "Updated At",
"type": "number"
}
},
"required": [
"job_id",
"operation",
"status",
"execution_mode",
"created_at",
"updated_at"
],
"title": "JobRef",
"type": "object"
}
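The status enum above has two terminal values, completed and failed, so a polling loop can stop on either. The following sketch keeps the HTTP call out of the loop: fetch_status is a caller-supplied function returning a JobRef-shaped dict, and wait_for_job, the interval, and the poll limit are all hypothetical choices, not documented Docspeed recommendations.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status (hypothetical helper)."""
    for attempt in range(max_polls):
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Pending or processing: wait before the next poll,
        # but do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

The caller should inspect the returned status: only a completed job has a result to fetch, while a failed job needs error handling instead.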
/v1/jobs/{job_id}/result
Fetch an async job result
Retrieve the completed result payload for a previously queued async job.
Auth
Send Authorization: Bearer <DOCSPEED_API_KEY> with every jobs result request.
Execution behavior
This endpoint is synchronous and returns the completed result once the job has succeeded.
Request notes
Poll GET /v1/jobs/{job_id} until the job is completed, then fetch the final payload here.
Response notes
The response body matches the success payload of the original endpoint you queued.
Example request
GET /v1/jobs/{job_id}/result
Authorization: Bearer YOUR_DOCSPEED_API_KEY
Success response schema
{}
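The two job endpoints differ only in the trailing /result segment, so one helper can prepare either request. This sketch uses the Python standard library and only builds the request object; sending it is left to the caller. job_request is a hypothetical helper, and the job ID format in the usage note is made up for illustration.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.docspeed.ai"  # canonical base URL from the overview


def job_request(job_id: str, api_key: str, result: bool = False) -> urllib.request.Request:
    """Build a GET request for job status or job result (hypothetical helper)."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    path = f"/v1/jobs/{quote(job_id, safe='')}"
    if result:
        path += "/result"
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

For example, job_request("job_123", api_key, result=True) targets /v1/jobs/job_123/result; passing the returned object to urllib.request.urlopen would perform the call.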