Cambrion API (1.0)
The official Cambrion API specification. To receive a free API key, reach out at hello@cambrion.ai with a brief description of your use case.
Creates an execution
Creates an execution, optionally from a caller-supplied ID. If an execution ID is given, that ID is used; otherwise a new one is created. If the execution already exists, the request is ignored and 204 is returned. If a new execution was created, 200 is returned with the execution ID as the body.
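A minimal sketch of how a client might interpret the two success codes described above. The helper name `resolve_execution_id` is hypothetical. The description says the 200 body is the execution ID while the response sample shows a full execution object, so the sketch accepts either shape as an assumption.

```python
import json
from typing import Optional


def resolve_execution_id(status: int, body: str, requested_id: Optional[str]) -> str:
    """Return the execution ID implied by a create-execution response.

    Hypothetical helper: 200 means a new execution was created and the body
    carries its ID; 204 means the execution already existed (empty body).
    """
    if status == 204:
        if requested_id is None:
            raise ValueError("204 received but no execution ID was sent")
        return requested_id
    if status == 200:
        try:
            parsed = json.loads(body)
        except json.JSONDecodeError:
            return body.strip()  # body is a bare, unquoted ID
        if isinstance(parsed, dict):
            return parsed["executionId"]  # body is a full execution object
        return str(parsed)  # body is a JSON-encoded scalar ID
    raise RuntimeError(f"unexpected status {status}: {body}")
```

Treating 204 as success keeps repeated create calls with the same ID safe to retry.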
Authorizations:
Request Body schema: application/json
Execution is the context which holds data related to a specific execution
| executionId | string ID of the execution |
| tag | string Tag to identify the execution |
| createdAt | string Creation time |
| completedAt | string Completion time |
| duration | number Duration in seconds |
| status | string Status of the current execution. Includes error message in case of error. |
| metaData | object |
| hookIds | Array of strings Optional list of hook IDs to trigger for this execution. The referenced hooks will be notified when events occur on this execution (status changes, observation updates, etc.). |
Responses
Request samples
- Payload
{
  "executionId": "string",
  "tag": "string",
  "createdAt": "string",
  "completedAt": "string",
  "duration": 0,
  "status": "string",
  "metaData": { },
  "hookIds": [
    "string"
  ]
}
Response samples
- 200
- 400
- 401
{
  "executionId": "string",
  "tag": "string",
  "createdAt": "string",
  "completedAt": "string",
  "duration": 0,
  "status": "string",
  "metaData": { },
  "hookIds": [
    "string"
  ]
}
Gets all executions
Authorizations:
query Parameters
| tag | string Filter executions by tag |
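The tag filter can be sketched as a URL builder. The host and the `/executions` path are placeholders, since this section names neither; only the `tag` query parameter comes from the specification.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.invalid"  # placeholder host, not the real endpoint


def list_executions_url(tag: Optional[str] = None) -> str:
    """Build the URL for listing executions, optionally filtered by tag."""
    url = f"{BASE_URL}/executions"  # hypothetical path
    if tag is not None:
        # urlencode escapes spaces and reserved characters in the tag
        url += "?" + urlencode({"tag": tag})
    return url
```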
Responses
Response samples
- 200
- 400
- 401
[
  {
    "executionId": "string",
    "tag": "string",
    "createdAt": "string",
    "completedAt": "string",
    "duration": 0,
    "status": "string",
    "metaData": { },
    "hookIds": [
      "string"
    ]
  }
]
Gets execution
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Responses
Response samples
- 200
- 400
- 401
{
  "executionId": "string",
  "tag": "string",
  "createdAt": "string",
  "completedAt": "string",
  "duration": 0,
  "status": "string",
  "metaData": { },
  "hookIds": [
    "string"
  ]
}
Merge a raw observation into the current observation
The raw observation is merged into the current observation context.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Request Body schema: application/json
Observation request
| executionId required | string |
| mediaContents | Array of objects (Image Content) |
| documents | Array of objects (Linked Document) |
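A minimal sketch of an observation request body. It uses only field names that appear in the request sample (`executionId`, `mediaContents`, `documents`, `id`, `mediaId`, `rawText`). Only `executionId` is marked required. Whether the server accepts a media item this sparse is an assumption, and the helper name is hypothetical.

```python
import json


def build_observation_request(execution_id: str, raw_text: str) -> dict:
    """Assemble a small observation request holding one media item."""
    return {
        "executionId": execution_id,  # the only field marked as required
        "mediaContents": [
            {"id": "media-1", "mediaId": "media-1", "rawText": raw_text}
        ],
        "documents": [],  # no linked documents in this sketch
    }


payload = build_observation_request("execution-1", "Invoice 4711")
body = json.dumps(payload)  # send as application/json
```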
Responses
Request samples
- Payload
{- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "tag": "string",
- "headers": [
- {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}
], - "rows": [
- {
- "id": null,
- "tag": null,
- "pairs": [ ],
- "entity": null,
- "refinement": null
}
], - "refinement": {
- "addedHeaders": [
- null
], - "deletedHeaderIds": [
- null
], - "addedRows": [
- null
], - "deletedRowIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "entityValue": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "keyValueSetValue": { },
- "tableValue": {
- "id": null,
- "entity": null,
- "tag": null,
- "headers": [ ],
- "rows": [ ],
- "refinement": null
}, - "tag": "string",
- "refinement": {
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "refinement": {
- "addedPairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "deletedPairIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
Response samples
- 400
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Get observation
Get a full observation of the execution.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Responses
Response samples
- 200
- 400
- 401
- 404
{- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "tag": "string",
- "headers": [
- {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}
], - "rows": [
- {
- "id": null,
- "tag": null,
- "pairs": [ ],
- "entity": null,
- "refinement": null
}
], - "refinement": {
- "addedHeaders": [
- null
], - "deletedHeaderIds": [
- null
], - "addedRows": [
- null
], - "deletedRowIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "entityValue": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "keyValueSetValue": { },
- "tableValue": {
- "id": null,
- "entity": null,
- "tag": null,
- "headers": [ ],
- "rows": [ ],
- "refinement": null
}, - "tag": "string",
- "refinement": {
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "refinement": {
- "addedPairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "deletedPairIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
Transform an observation
Transform a raw observation into an object using a JSONata statement. JSONata is a query and transformation language for JSON data. For more information, see http://docs.jsonata.org/overview.html
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Request Body schema: text/plain
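A sketch of preparing such a request, with the JSONata statement sent as the plain-text body. The host, the path, and the POST method are assumptions, since this section states none of them. The expression itself is only illustrative: it maps every entity on every page to a small object with its label and text. Authorization is omitted because the scheme is not described here.

```python
import urllib.request

BASE_URL = "https://api.example.invalid"  # placeholder host, not the real endpoint

# JSONata: for each entity on each document page, emit {"label": ..., "text": ...}
EXPRESSION = (
    'mediaContents.documentPages.document.entities.'
    '{"label": label, "text": block.text}'
)


def build_transform_request(execution_id: str, expression: str) -> urllib.request.Request:
    """Prepare (but do not send) a transform request with a text/plain body."""
    return urllib.request.Request(
        f"{BASE_URL}/executions/{execution_id}/transform",  # hypothetical path
        data=expression.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",  # assumed method
    )


request = build_transform_request("execution-1", EXPRESSION)
```

Passing `request` to `urllib.request.urlopen` would send it once a real host and credentials are filled in.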
Responses
Response samples
- 400
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Transform an observation into JSON
Transform a raw observation into the corresponding JSON object. The values in the JSON object correspond to the data values in the observation. If data values are not available, the raw text is used.
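The fallback rule above can be illustrated locally. This is only a sketch of the idea, not the server's implementation: which data fields are consulted, and in what order, is not stated here, so the `DATA_FIELDS` tuple is an assumption built from field names in the observation sample.

```python
from typing import Any, Optional

# Assumed lookup order; the specification does not state one.
DATA_FIELDS = ("textData", "numberData", "quantityData", "dateData", "boolData")


def entity_json_value(entity: dict) -> Optional[Any]:
    """Return the first available data value, else the entity's raw text."""
    data = entity.get("data") or {}
    for field in DATA_FIELDS:
        value = data.get(field)
        if value is not None:  # keep legitimate 0 and False values
            return value
    block = entity.get("block") or {}
    return block.get("text")  # raw text fallback; None if absent
```

Comparing against `None` rather than relying on truthiness matters here, because a numeric value of 0 or a boolean `False` is still a valid data value.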
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Responses
Response samples
- 200
- 400
- 401
- 404
{ }
Submit refinements to an observation
Submit user corrections to extracted data. Refinements can target entities, key-value pairs, tables, or key-value sets by their IDs. Refinements are additive: the original data is preserved.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Request Body schema: application/json
| entityRefinements | Array of objects (Entity Refinement Item) Refinements for individual entities |
| keyValuePairRefinements | Array of objects (Key Value Pair Refinement Item) Refinements for key-value pair associations |
| tableRefinements | Array of objects (Table Refinement Item) Structural refinements for tables |
| keyValueSetRefinements | Array of objects (Key Value Set Refinement Item) Structural refinements for key-value sets |
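A minimal sketch of a refinements body that corrects the text of a single entity. The field names (`entityRefinements`, `entityId`, `refinement`, `correctedText`, `comment`, `timestamp`, `source`) are taken from the request sample. The helper name and the `"manual-review"` source value are hypothetical, and it is an assumption that the other refinement arrays may be omitted.

```python
import json
from datetime import datetime, timezone


def entity_text_correction(entity_id: str, corrected_text: str, comment: str = "") -> dict:
    """Build a refinements body that corrects the text of one entity."""
    # UTC timestamp in the same shape as the sample ("2019-08-24T14:15:22Z")
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "entityRefinements": [
            {
                "entityId": entity_id,
                "refinement": {
                    "correctedText": corrected_text,
                    "comment": comment,
                    "timestamp": timestamp,
                    "source": "manual-review",  # illustrative value
                },
            }
        ]
    }


body = json.dumps(entity_text_correction("entity-42", "ACME GmbH", "OCR misread the name"))
```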
Responses
Request samples
- Payload
{- "entityRefinements": [
- {
- "entityId": "string",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [
- {
- "x": 0,
- "y": 0
}
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "keyValuePairRefinements": [
- {
- "keyEntityId": "string",
- "parentKeyValueSetId": "string",
- "refinement": {
- "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "tableRefinements": [
- {
- "tableId": "string",
- "refinement": {
- "addedHeaders": [
- {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [
- null
]
}, - "boundingBox": {
- "width": 0,
- "height": 0,
- "left": 0,
- "top": 0
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "deletedHeaderIds": [
- "string"
], - "addedRows": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "entityValue": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "keyValueSetValue": { },
- "tableValue": {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": { }
}, - "tag": "string",
- "refinement": {
- "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "refinement": {
- "addedPairs": [
- {
- "key": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "entityValue": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "keyValueSetValue": { },
- "tableValue": {
- "id": null,
- "entity": null,
- "tag": null,
- "headers": [ ],
- "rows": [ ],
- "refinement": null
}, - "tag": "string",
- "refinement": {
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "deletedPairIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "deletedRowIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "keyValueSetRefinements": [
- {
- "keyValueSetId": "string",
- "refinement": {
- "addedPairs": [
- {
- "key": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "entityValue": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSetValue": {
- "id": "string",
- "tag": "string",
- "pairs": [
- { }
], - "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "refinement": { }
}, - "tableValue": {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "headers": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "rows": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": { }
}
], - "refinement": {
- "addedHeaders": [
- {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}
], - "deletedHeaderIds": [
- "string"
], - "addedRows": [
- {
- "id": null,
- "tag": null,
- "pairs": [ ],
- "entity": null,
- "refinement": null
}
], - "deletedRowIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "refinement": {
- "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
], - "deletedPairIds": [
- "string"
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}
]
}
Response samples
- 400
- 401
- 404
{- "code": "string",
- "message": "string"
}
Get all refinements for an observation
Retrieve a summary of all refinements that have been applied to the observation.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Responses
Response samples
- 200
- 401
- 404
{- "executionId": "string",
- "totalRefinements": 0,
- "entityRefinementCount": 0,
- "structuralRefinementCount": 0,
- "lastUpdated": "2019-08-24T14:15:22Z"
}
Link results of an execution
Link the contents of an observation to documents in an index.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Request Body schema: application/json
Linker request
| group | Array of objects (Match Group) |
| document | object |
| topk | Array of objects (Top K Index Filter) |
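As a rough illustration of the request body, the following Python sketch assembles a linker request with one match group and one top-k filter. The key names follow the request sample for this endpoint; the helper name `build_link_request` and the values (`invoice-index`, `supplierName`, `name`) are hypothetical, and it is an assumption that the fields left out here are optional.

```python
import json
from typing import Any


def build_link_request(index: str, field_name: str, label: str, top_k: int) -> dict[str, Any]:
    """Build a linker request with one match group containing one field."""
    match_field = {
        "fieldName": field_name,
        "clause": "MUST",                        # enum value taken from the sample
        "filter": {"label": label},              # only consider entities with this label
        "collection": {"source": "ENTITY_TEXT"}, # enum value taken from the sample
        "mode": "SEARCH",                        # enum value taken from the sample
    }
    return {
        "group": [{"tag": label, "fields": [match_field], "index": index}],
        "topk": [{"index": index, "topk": top_k}],
    }


# Hypothetical index, field, and label names.
request_body = build_link_request("invoice-index", "supplierName", "name", top_k=3)
print(json.dumps(request_body, indent=2))
```

Only enum values that appear in the sample are used; other allowed values are not shown on this page and are not guessed here.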
Responses
Request samples
- Payload
{- "group": [
- {
- "tag": "string",
- "fields": [
- {
- "fieldName": "string",
- "clause": "MUST",
- "fuzziness": 0,
- "auto": "string",
- "filter": {
- "tag": "string",
- "label": "string",
- "regExp": "string",
- "hasData": true,
- "hasValue": true,
- "layoutType": "WORD"
}, - "collection": {
- "source": "ENTITY_TEXT"
}, - "threshold": 0,
- "num_results": 0,
- "mode": "SEARCH",
- "dimension": "EMPTY",
- "analyzer": "string"
}
], - "index": "string"
}
], - "document": { },
- "topk": [
- {
- "index": "string",
- "topk": 0
}
]
}
Response samples
- 200
- 400
- 401
- 404
[- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
Submit an execution
Triggers all SUBMIT-type hooks attached to this execution. This endpoint is used to signal that an execution is ready for external processing or notification. Returns the exact payloads that were sent (or would be sent) to each hook endpoint.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
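Because the response lists one entry per hook with a `delivered` flag, a client will usually want to separate the hooks that were notified from those that were not. The sketch below does this in Python on a hypothetical response shaped like the 200 sample; the function name, hook IDs, and endpoints are made up for illustration.

```python
from typing import Any


def summarize_hook_payloads(response: dict[str, Any]) -> dict[str, list[str]]:
    """Group hook IDs from a submit response by their delivered flag."""
    summary: dict[str, list[str]] = {"delivered": [], "not_delivered": []}
    for hook in response.get("hookPayloads", []):
        key = "delivered" if hook.get("delivered") else "not_delivered"
        summary[key].append(hook["hookId"])
    return summary


# Hypothetical response, shaped like the 200 sample for this endpoint.
sample = {
    "hookPayloads": [
        {"hookId": "hook-a", "hookName": "erp", "endpoint": "https://example.com/a",
         "payload": {}, "delivered": True, "error": None},
        {"hookId": "hook-b", "hookName": "audit", "endpoint": "https://example.com/b",
         "payload": {}, "delivered": False, "error": "connection refused"},
    ]
}
print(summarize_hook_payloads(sample))
```

An entry with `delivered` set to false is reported as "not delivered" rather than "failed", since the description notes that the payload may only be the one that would be sent.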
Responses
Response samples
- 200
- 401
- 404
{- "hookPayloads": [
- {
- "hookId": "string",
- "hookName": "string",
- "endpoint": "string",
- "payload": { },
- "delivered": true,
- "error": "string"
}
]
}
Retry an execution
Retry a failed or completed execution with a specified pipeline. This updates the existing execution, resetting its status to PENDING and scheduling it for async processing. Only executions with status ERROR or COMPLETED can be retried.
Authorizations:
path Parameters
| executionId required | string ID of an execution |
Request Body schema: application/json
Request to retry an execution with a specified pipeline
| pipelineId required | string ID of the pipeline to execute for the retry |
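The retry precondition can be checked on the client before sending the request. This is a minimal Python sketch, assuming the `status` field holds exactly `ERROR` or `COMPLETED` for retryable executions; since the status may also carry an error message, a real client might need a looser comparison. The helper name and the pipeline ID are hypothetical.

```python
# Statuses named in the endpoint description as eligible for a retry.
RETRYABLE_STATUSES = {"ERROR", "COMPLETED"}


def build_retry_request(current_status: str, pipeline_id: str) -> dict[str, str]:
    """Return the retry request body, or raise if the execution cannot be retried."""
    if current_status not in RETRYABLE_STATUSES:
        raise ValueError(f"status {current_status!r} is not retryable")
    if not pipeline_id:
        raise ValueError("pipelineId is required")
    return {"pipelineId": pipeline_id}


# Hypothetical pipeline ID.
print(build_retry_request("ERROR", "receipt-pipeline"))
```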
Responses
Request samples
- Payload
{- "pipelineId": "string"
}
Response samples
- 200
- 400
- 401
- 404
{- "executionId": "string"
}
Create new pipeline
Authorizations:
Request Body schema: application/json
Pipeline request
| pipelineId | string |
| name | string |
| deploy | boolean Default: true Whether to deploy the pipeline when creating/updating it |
| description | string |
| tag | string |
| version | integer |
| pipelineDefinition | object (PipelineDefinition) |
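In the request sample, each edge's `sourceHandle` and `targetHandle` match the `inputName` of some node's `outputs` and `inputs` respectively. Assuming that is how edges are wired (this page does not state it explicitly), a definition can be checked locally before it is submitted. The Python sketch below uses hypothetical helper names and a shortened two-node definition.

```python
from typing import Any


def find_dangling_handles(definition: dict[str, Any]) -> list[str]:
    """Return a message for each edge handle that no node declares."""
    nodes = definition.get("nodes", [])
    # The sample uses the key "inputName" inside both "inputs" and "outputs".
    output_names = {node["outputs"]["inputName"] for node in nodes}
    input_names = {node["inputs"]["inputName"] for node in nodes}
    problems = []
    for edge in definition.get("edges", []):
        if edge["sourceHandle"] not in output_names:
            problems.append(f"{edge['id']}: unknown sourceHandle {edge['sourceHandle']}")
        if edge["targetHandle"] not in input_names:
            problems.append(f"{edge['id']}: unknown targetHandle {edge['targetHandle']}")
    return problems


def make_node(model_id: str, input_name: str, output_name: str) -> dict[str, Any]:
    """Build a node with the fields used in the request sample."""
    port = {"inputShape": [1], "inputType": "STRING"}
    return {
        "modelId": model_id,
        "modelName": model_id,
        "modelVersion": 1,
        "inputs": {"inputName": input_name, **port},
        "outputs": {"inputName": output_name, **port},
    }


definition = {
    "pipelineDefinitionId": "receipt-pipeline-definition",
    "nodes": [
        make_node("ocr_recognizer", "info_array_ocr_input", "info_array_ocr_output"),
        make_node("entity_parser", "info_array_parser_input", "info_array_parser_output"),
    ],
    "edges": [
        {"id": "edge-1", "dataHandle": "ocr_result", "source": "ocr_recognizer",
         "target": "entity_parser", "sourceHandle": "info_array_ocr_output",
         "targetHandle": "info_array_parser_input"},
    ],
}
print(find_dangling_handles(definition))
```

The check deliberately ignores the `source` and `target` fields, because this page does not say which node attribute they refer to.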
Responses
Request samples
- Payload
{- "pipelineId": "receipt-pipeline",
- "name": "receipt-pipeline",
- "deploy": true,
- "description": "A pipeline to extract contents from a receipt",
- "tag": "string",
- "version": 1,
- "pipelineDefinition": {
- "pipelineDefinitionId": "receipt-pipeline-definition",
- "nodes": [
- {
- "modelId": "ocr_recognizer",
- "modelName": "ocr_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "param1": 1,
- "param2": 2
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 250
}
}, - "inputs": {
- "inputName": "info_array_ocr_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_ocr_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "static_layout_recognizer",
- "modelName": "static_layout_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "targetModel": "some_model",
- "labels": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 500
}
}, - "inputs": {
- "inputName": "info_array_static_layout_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_static_layout_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_parser",
- "modelName": "entity_parser",
- "modelVersion": 1,
- "modelParameters": {
- "date": "DATE",
- "name": "STRING",
- "amount": "NUMBER"
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 750
}
}, - "inputs": {
- "inputName": "info_array_parser_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_parser_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_deduplicator",
- "modelName": "entity_deduplicator",
- "modelVersion": 1,
- "modelParameters": {
- "keys": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 1000
}
}, - "inputs": {
- "inputName": "info_array_deduplicator_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_deduplicator_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}
], - "edges": [
- {
- "id": "edge-1",
- "dataHandle": "ocr_result",
- "source": "ocr_recognizer",
- "target": "layout_recognizer",
- "sourceHandle": "info_array_ocr_output",
- "targetHandle": "info_array_static_layout_input"
}, - {
- "id": "edge-2",
- "dataHandle": "recognizer_result",
- "source": "layout_recognizer",
- "target": "entity_parser",
- "sourceHandle": "info_array_static_layout_output",
- "targetHandle": "info_array_parser_input"
}, - {
- "id": "edge-3",
- "dataHandle": "parser_result",
- "source": "entity_parser",
- "target": "entity_deduplicator",
- "sourceHandle": "info_array_parser_output",
- "targetHandle": "info_array_deduplicator_input"
}
]
}
}
Response samples
- 200
- 400
- 401
- 404
{- "pipelineId": "string"
}
Get a specific pipeline
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Responses
Response samples
- 200
- 400
- 401
- 404
{- "pipeline": {
- "pipelineId": "string",
- "name": "string",
- "description": "string",
- "tag": "string",
- "status": "string",
- "version": 0
}, - "pipelineDefinition": {
- "pipelineDefinitionId": "receipt-pipeline-definition",
- "nodes": [
- {
- "modelId": "ocr_recognizer",
- "modelName": "ocr_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "param1": 1,
- "param2": 2
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 250
}
}, - "inputs": {
- "inputName": "info_array_ocr_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_ocr_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "static_layout_recognizer",
- "modelName": "static_layout_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "targetModel": "some_model",
- "labels": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 500
}
}, - "inputs": {
- "inputName": "info_array_static_layout_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_static_layout_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_parser",
- "modelName": "entity_parser",
- "modelVersion": 1,
- "modelParameters": {
- "date": "DATE",
- "name": "STRING",
- "amount": "NUMBER"
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 750
}
}, - "inputs": {
- "inputName": "info_array_parser_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_parser_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_deduplicator",
- "modelName": "entity_deduplicator",
- "modelVersion": 1,
- "modelParameters": {
- "keys": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 1000
}
}, - "inputs": {
- "inputName": "info_array_deduplicator_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_deduplicator_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}
], - "edges": [
- {
- "id": "edge-1",
- "dataHandle": "ocr_result",
- "source": "ocr_recognizer",
- "target": "layout_recognizer",
- "sourceHandle": "info_array_ocr_output",
- "targetHandle": "info_array_static_layout_input"
}, - {
- "id": "edge-2",
- "dataHandle": "recognizer_result",
- "source": "layout_recognizer",
- "target": "entity_parser",
- "sourceHandle": "info_array_static_layout_output",
- "targetHandle": "info_array_parser_input"
}, - {
- "id": "edge-3",
- "dataHandle": "parser_result",
- "source": "entity_parser",
- "target": "entity_deduplicator",
- "sourceHandle": "info_array_parser_output",
- "targetHandle": "info_array_deduplicator_input"
}
]
}
}
Update an existing pipeline
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Request Body schema: application/json
Pipeline request
| pipelineId | string |
| name | string |
| deploy | boolean Default: true Whether to deploy the pipeline when creating/updating it |
| description | string |
| tag | string |
| version | integer |
| pipelineDefinition | object (PipelineDefinition) |
Responses
Request samples
- Payload
{- "pipelineId": "receipt-pipeline",
- "name": "receipt-pipeline",
- "deploy": true,
- "description": "A pipeline to extract contents from a receipt",
- "tag": "string",
- "version": 1,
- "pipelineDefinition": {
- "pipelineDefinitionId": "receipt-pipeline-definition",
- "nodes": [
- {
- "modelId": "ocr_recognizer",
- "modelName": "ocr_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "param1": 1,
- "param2": 2
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 250
}
}, - "inputs": {
- "inputName": "info_array_ocr_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_ocr_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "static_layout_recognizer",
- "modelName": "static_layout_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "targetModel": "some_model",
- "labels": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 500
}
}, - "inputs": {
- "inputName": "info_array_static_layout_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_static_layout_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_parser",
- "modelName": "entity_parser",
- "modelVersion": 1,
- "modelParameters": {
- "date": "DATE",
- "name": "STRING",
- "amount": "NUMBER"
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 750
}
}, - "inputs": {
- "inputName": "info_array_parser_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_parser_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_deduplicator",
- "modelName": "entity_deduplicator",
- "modelVersion": 1,
- "modelParameters": {
- "keys": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 1000
}
}, - "inputs": {
- "inputName": "info_array_deduplicator_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_deduplicator_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}
], - "edges": [
- {
- "id": "edge-1",
- "dataHandle": "ocr_result",
- "source": "ocr_recognizer",
- "target": "layout_recognizer",
- "sourceHandle": "info_array_ocr_output",
- "targetHandle": "info_array_static_layout_input"
}, - {
- "id": "edge-2",
- "dataHandle": "recognizer_result",
- "source": "layout_recognizer",
- "target": "entity_parser",
- "sourceHandle": "info_array_static_layout_output",
- "targetHandle": "info_array_parser_input"
}, - {
- "id": "edge-3",
- "dataHandle": "parser_result",
- "source": "entity_parser",
- "target": "entity_deduplicator",
- "sourceHandle": "info_array_parser_output",
- "targetHandle": "info_array_deduplicator_input"
}
]
}
}
Response samples
- 200
- 400
- 401
- 404
{- "pipeline": {
- "pipelineId": "string",
- "name": "string",
- "description": "string",
- "tag": "string",
- "status": "string",
- "version": 0
}, - "pipelineDefinition": {
- "pipelineDefinitionId": "receipt-pipeline-definition",
- "nodes": [
- {
- "modelId": "ocr_recognizer",
- "modelName": "ocr_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "param1": 1,
- "param2": 2
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 250
}
}, - "inputs": {
- "inputName": "info_array_ocr_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_ocr_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "static_layout_recognizer",
- "modelName": "static_layout_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "targetModel": "some_model",
- "labels": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 500
}
}, - "inputs": {
- "inputName": "info_array_static_layout_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_static_layout_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_parser",
- "modelName": "entity_parser",
- "modelVersion": 1,
- "modelParameters": {
- "date": "DATE",
- "name": "STRING",
- "amount": "NUMBER"
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 750
}
}, - "inputs": {
- "inputName": "info_array_parser_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_parser_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_deduplicator",
- "modelName": "entity_deduplicator",
- "modelVersion": 1,
- "modelParameters": {
- "keys": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 1000
}
}, - "inputs": {
- "inputName": "info_array_deduplicator_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_deduplicator_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}
], - "edges": [
- {
- "id": "edge-1",
- "dataHandle": "ocr_result",
- "source": "ocr_recognizer",
- "target": "layout_recognizer",
- "sourceHandle": "info_array_ocr_output",
- "targetHandle": "info_array_static_layout_input"
}, - {
- "id": "edge-2",
- "dataHandle": "recognizer_result",
- "source": "layout_recognizer",
- "target": "entity_parser",
- "sourceHandle": "info_array_static_layout_output",
- "targetHandle": "info_array_parser_input"
}, - {
- "id": "edge-3",
- "dataHandle": "parser_result",
- "source": "entity_parser",
- "target": "entity_deduplicator",
- "sourceHandle": "info_array_parser_output",
- "targetHandle": "info_array_deduplicator_input"
}
]
}
}
Get graph representation (definition) of a pipeline
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Responses
Response samples
- 200
- 400
- 401
- 404
{- "pipelineDefinitionId": "receipt-pipeline-definition",
- "nodes": [
- {
- "modelId": "ocr_recognizer",
- "modelName": "ocr_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "param1": 1,
- "param2": 2
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 250
}
}, - "inputs": {
- "inputName": "info_array_ocr_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_ocr_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "static_layout_recognizer",
- "modelName": "static_layout_recognizer",
- "modelVersion": 1,
- "modelParameters": {
- "targetModel": "some_model",
- "labels": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 500
}
}, - "inputs": {
- "inputName": "info_array_static_layout_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_static_layout_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_parser",
- "modelName": "entity_parser",
- "modelVersion": 1,
- "modelParameters": {
- "date": "DATE",
- "name": "STRING",
- "amount": "NUMBER"
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 750
}
}, - "inputs": {
- "inputName": "info_array_parser_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_parser_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}, - {
- "modelId": "entity_deduplicator",
- "modelName": "entity_deduplicator",
- "modelVersion": 1,
- "modelParameters": {
- "keys": [
- "date",
- "name",
- "amount"
]
}, - "canvas": {
- "position": {
- "x": 0,
- "y": 1000
}
}, - "inputs": {
- "inputName": "info_array_deduplicator_input",
- "inputShape": [
- 1
], - "inputType": "STRING"
}, - "outputs": {
- "inputName": "info_array_deduplicator_output",
- "inputShape": [
- 1
], - "inputType": "STRING"
}
}
], - "edges": [
- {
- "id": "edge-1",
- "dataHandle": "ocr_result",
- "source": "ocr_recognizer",
- "target": "layout_recognizer",
- "sourceHandle": "info_array_ocr_output",
- "targetHandle": "info_array_static_layout_input"
}, - {
- "id": "edge-2",
- "dataHandle": "recognizer_result",
- "source": "layout_recognizer",
- "target": "entity_parser",
- "sourceHandle": "info_array_static_layout_output",
- "targetHandle": "info_array_parser_input"
}, - {
- "id": "edge-3",
- "dataHandle": "parser_result",
- "source": "entity_parser",
- "target": "entity_deduplicator",
- "sourceHandle": "info_array_parser_output",
- "targetHandle": "info_array_deduplicator_input"
}
]
}
Execute pipeline synchronously
Execute a pipeline synchronously and return the corresponding observation. The timeout is 30 seconds. If the computation takes longer than that, a timeout error is returned.
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Request Body schema: application/json
Execution request for a pipeline
| executionId | string ID of the execution |
| tag | string Tag used to identify the resulting execution. Ignored if transient is true. |
| transient | boolean Whether to delete all execution data after pipeline completion |
| transform | string JSONata instruction to transform the result observation into a desired object. JSONata is a transformation language for JSON data. For more information see http://docs.jsonata.org/overview.html |
| tryImageConversion | boolean Default: false DEPRECATED: Tries to convert the provided content to an image (e.g. PDF) |
| trySimpleText | boolean Default: false DEPRECATED: Tries to extract readable text from input media (e.g. Word doc). A number of different file formats are supported. Internally, Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html |
| idempotent | boolean Default: false Whether to update the existing observation with the results from the pipeline run (always true if executionId is null) |
| media | Array of strings <base64> Array of base64-encoded media files. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| runtimeParameters | object Not active yet! |
| observation | object (Execution Observation) The structured content of a set of media files. |
| text | string Raw text that can be used as input in pipelines |
| hookIds | Array of strings Optional list of hook IDs to trigger for this pipeline execution. The referenced hooks will be notified when events occur on the resulting execution (status changes, observation updates, etc.). |
| entryPoint | string |
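The `media` field expects base64 strings, so raw file bytes have to be encoded before they are placed in the request. The following Python sketch builds such a request body; it assumes standard (not URL-safe) base64, and the helper name, tag value, and client-side timeout margin are hypothetical.

```python
import base64
from typing import Any, Optional

SERVER_TIMEOUT_SECONDS = 30  # stated in the endpoint description


def build_execution_request(media_files: list[bytes], tag: Optional[str] = None,
                            transient: bool = False) -> dict[str, Any]:
    """Build a synchronous execution request from raw media bytes."""
    body: dict[str, Any] = {
        "media": [base64.b64encode(data).decode("ascii") for data in media_files],
        "transient": transient,
    }
    # The tag is ignored for transient executions, so leave it out in that case.
    if tag is not None and not transient:
        body["tag"] = tag
    return body


# Hypothetical client-side timeout: slightly above the server limit, so the
# server's timeout error can arrive before the client gives up.
client_timeout_seconds = SERVER_TIMEOUT_SECONDS + 5

request_body = build_execution_request([b"%PDF-1.7 example bytes"], tag="receipts")
print(request_body["tag"], len(request_body["media"]), client_timeout_seconds)
```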
Responses
Request samples
- Payload
{- "executionId": "string",
- "tag": "string",
- "transient": true,
- "transform": "string",
- "tryImageConversion": false,
- "trySimpleText": false,
- "idempotent": false,
- "media": [
- "string"
], - "runtimeParameters": { },
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": {
- "addedHeaders": [ ],
- "deletedHeaderIds": [ ],
- "addedRows": [ ],
- "deletedRowIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}, - "text": "string",
- "hookIds": [
- "string"
], - "entryPoint": "string"
}
Response samples
- 200
- 400
- 401
- 404
{- "executionId": "string",
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": {
- "addedHeaders": [ ],
- "deletedHeaderIds": [ ],
- "addedRows": [ ],
- "deletedRowIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
}
Transform an observation
Execute a pipeline synchronously and return the transformed observation.
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Request Body schema: application/json
Execution request for a pipeline
| executionId | string ID of the execution |
| tag | string Tag used to identify the resulting execution. Ignored if transient is true. |
| transient | boolean Whether to delete all execution data after pipeline completion |
| transform | string JSONata instruction to transform the result observation into a desired object. JSONata is a transformation language for JSON data. It can be used to transform . For more information see http://docs.jsonata.org/overview.html |
| tryImageConversion | boolean Default: false DEPRECATED: Tries to convert the provided content to an image (e.g. PDF) |
| trySimpleText | boolean Default: false DEPRECATED: Tries to extract readable text from input media (e.g. Word doc). A number of different file formats is supported. Internally Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html |
| idempotent | boolean Default: false Whether to update the existing observation with the results from pipeline run (always true if executionId is null) |
| media | Array of strings <base64> Array of Base64-encoded media files. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| runtimeParameters | object Not active yet! |
| observation | object (Execution Observation) The structured content of a set of media files. |
| text | string Raw text that can be used as input in pipelines |
| hookIds | Array of strings Optional list of hook IDs to trigger for this pipeline execution. The referenced hooks will be notified when events occur on the resulting execution (status changes, observation updates, etc.). |
| entryPoint | string |
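As a rough illustration of the request body described above, the following Python sketch assembles a payload from raw file bytes. The helper name `build_execution_request` is hypothetical and not part of the API; only the field names (`media`, `tag`, `transient`, `hookIds`) come from the table. The endpoint URL and authorization header are not shown here, so sending the body is left to the caller.

```python
import base64
from typing import Any, Dict, Iterable, Optional


def build_execution_request(
    media: Iterable[bytes],
    tag: Optional[str] = None,
    transient: bool = False,
    hook_ids: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body using field names from the table above.

    Hypothetical helper for illustration; not part of the Cambrion API.
    """
    body: Dict[str, Any] = {
        # Each media file is sent as a Base64 string; the content type is
        # documented as being detected automatically.
        "media": [base64.b64encode(blob).decode("ascii") for blob in media],
        "transient": transient,
    }
    # The tag is documented as ignored when transient is true, so it is
    # only included for non-transient executions.
    if tag is not None and not transient:
        body["tag"] = tag
    if hook_ids is not None:
        body["hookIds"] = list(hook_ids)
    return body


request_body = build_execution_request([b"%PDF-1.7 example bytes"], tag="invoice-batch")
```

The resulting dictionary can be serialized with `json.dumps` and posted with any HTTP client.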
Responses
Request samples
- Payload
{- "executionId": "string",
- "tag": "string",
- "transient": true,
- "transform": "string",
- "tryImageConversion": false,
- "trySimpleText": false,
- "idempotent": false,
- "media": [
- "string"
], - "runtimeParameters": { },
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": {
- "addedHeaders": [ ],
- "deletedHeaderIds": [ ],
- "addedRows": [ ],
- "deletedRowIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}, - "text": "string",
- "hookIds": [
- "string"
], - "entryPoint": "string"
}
Response samples
- 400
- 401
- 404
{- "code": "string",
- "message": "string"
}
Transform an observation
Execute a pipeline synchronously and return the corresponding JSON object.
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Request Body schema: application/json
Execution request for a pipeline
| executionId | string ID of the execution |
| tag | string Tag used to identify the resulting execution. Ignored if transient is true. |
| transient | boolean Whether to delete all execution data after pipeline completion |
| transform | string JSONata instruction to transform the result observation into a desired object. JSONata is a transformation language for JSON data. For more information see http://docs.jsonata.org/overview.html |
| tryImageConversion | boolean Default: false DEPRECATED: Tries to convert the provided content to an image (e.g. PDF) |
| trySimpleText | boolean Default: false DEPRECATED: Tries to extract readable text from input media (e.g. Word doc). A number of different file formats is supported. Internally Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html |
| idempotent | boolean Default: false Whether to update the existing observation with the results from pipeline run (always true if executionId is null) |
| media | Array of strings <base64> Array of Base64-encoded media files. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| runtimeParameters | object Not active yet! |
| observation | object (Execution Observation) The structured content of a set of media files. |
| text | string Raw text that can be used as input in pipelines |
| hookIds | Array of strings Optional list of hook IDs to trigger for this pipeline execution. The referenced hooks will be notified when events occur on the resulting execution (status changes, observation updates, etc.). |
| entryPoint | string |
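To make the `transform` field more concrete, here is a small sketch of a request that carries a JSONata expression. The expression itself is only an assumption about a useful projection: it presumes the expression is evaluated against the result observation and that the observation exposes `mediaContents[].rawText`, as the samples in this reference suggest. The helper name `build_transform_request` is hypothetical.

```python
import json
from typing import Any, Dict

# Hypothetical JSONata expression: build an object that collects the raw
# text of every media content in the result observation.
TRANSFORM = '{"texts": mediaContents.rawText}'


def build_transform_request(text: str, transform: str = TRANSFORM) -> Dict[str, Any]:
    """Build a body that sends raw text and asks for a transformed result."""
    return {
        "text": text,            # raw text used as pipeline input
        "transient": True,       # execution data is deleted after completion
        "transform": transform,  # JSONata instruction applied to the result
    }


body = build_transform_request("Invoice 4711, total 120.50 EUR")
print(json.dumps(body, indent=2))
```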
Responses
Request samples
- Payload
{- "executionId": "string",
- "tag": "string",
- "transient": true,
- "transform": "string",
- "tryImageConversion": false,
- "trySimpleText": false,
- "idempotent": false,
- "media": [
- "string"
], - "runtimeParameters": { },
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": {
- "addedHeaders": [ ],
- "deletedHeaderIds": [ ],
- "addedRows": [ ],
- "deletedRowIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}, - "text": "string",
- "hookIds": [
- "string"
], - "entryPoint": "string"
}
Response samples
- 200
- 400
- 401
- 404
{ }
Execute pipeline asynchronously
Authorizations:
path Parameters
| pipelineId required | string ID of the pipeline to execute |
Request Body schema: application/json
Execution request for a pipeline
| executionId | string ID of the execution |
| tag | string Tag used to identify the resulting execution. Ignored if transient is true. |
| transient | boolean Whether to delete all execution data after pipeline completion |
| transform | string JSONata instruction to transform the result observation into a desired object. JSONata is a transformation language for JSON data. For more information see http://docs.jsonata.org/overview.html |
| tryImageConversion | boolean Default: false DEPRECATED: Tries to convert the provided content to an image (e.g. PDF) |
| trySimpleText | boolean Default: false DEPRECATED: Tries to extract readable text from input media (e.g. Word doc). A number of different file formats is supported. Internally Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html |
| idempotent | boolean Default: false Whether to update the existing observation with the results from pipeline run (always true if executionId is null) |
| media | Array of strings <base64> Array of Base64-encoded media files. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| runtimeParameters | object Not active yet! |
| observation | object (Execution Observation) The structured content of a set of media files. |
| text | string Raw text that can be used as input in pipelines |
| hookIds | Array of strings Optional list of hook IDs to trigger for this pipeline execution. The referenced hooks will be notified when events occur on the resulting execution (status changes, observation updates, etc.). |
| entryPoint | string |
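Because the asynchronous variant returns only an `executionId`, a client typically has to poll the execution until it finishes. The sketch below shows one possible polling loop. It is not an official client: `fetch_execution` is a caller-supplied function (for example, a wrapper around the "Gets execution" endpoint), and treating a non-empty `completedAt` as the completion signal is an assumption based on the Execution schema.

```python
import time
from typing import Any, Callable, Dict


def wait_for_execution(
    execution_id: str,
    fetch_execution: Callable[[str], Dict[str, Any]],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll an execution until it reports a completion time.

    Hypothetical helper; fetch_execution must return the Execution object
    as a dict for the given ID.
    """
    for _ in range(max_polls):
        execution = fetch_execution(execution_id)
        # Assumption: a non-empty completedAt marks a finished execution.
        if execution.get("completedAt"):
            return execution
        sleep(poll_seconds)
    raise TimeoutError(
        f"execution {execution_id} did not complete after {max_polls} polls"
    )
```

The `status` field is documented as carrying an error message on failure, so callers may want to inspect it on the returned object.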
Responses
Request samples
- Payload
{- "executionId": "string",
- "tag": "string",
- "transient": true,
- "transform": "string",
- "tryImageConversion": false,
- "trySimpleText": false,
- "idempotent": false,
- "media": [
- "string"
], - "runtimeParameters": { },
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "headers": [
- null
], - "rows": [
- null
], - "refinement": {
- "addedHeaders": [ ],
- "deletedHeaderIds": [ ],
- "addedRows": [ ],
- "deletedRowIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "entities": [
- {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- {
- "key": null,
- "entityValue": null,
- "keyValueSetValue": null,
- "tableValue": null,
- "tag": null,
- "refinement": null
}
], - "entity": {
- "id": "string",
- "block": {
- "text": null,
- "geometry": null
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": null,
- "textValue": null,
- "quantityValue": null,
- "numberValue": null,
- "unitValue": null,
- "dateValue": null,
- "boolValue": null,
- "textData": null,
- "quantityData": null,
- "numberData": null,
- "unitData": null,
- "dateData": null,
- "boolData": null,
- "field": null,
- "score": null,
- "sourceIndex": null
}, - "embedding": [
- null
], - "similarity": {
- "type": null,
- "cosineSimilarity": null,
- "amountDiff": null,
- "numberDiff": null,
- "same": null
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": null,
- "correctedValue": null,
- "correctedGeometry": null,
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "refinement": {
- "addedPairs": [
- null
], - "deletedPairIds": [
- null
], - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "keyValueSets": [
- {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": {
- "points": [ ]
}, - "boundingBox": {
- "width": null,
- "height": null,
- "left": null,
- "top": null
}
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}, - "text": "string",
- "hookIds": [
- "string"
], - "entryPoint": "string"
}
Response samples
- 200
{- "executionId": "string"
}
Response samples
- 200
[- {
- "extractionId": "string",
- "description": "string",
- "state": "string",
- "compact": false,
- "highPrecision": false,
- "size": 1300,
- "parallelProcessing": false,
- "intelligentBatching": false,
- "targetModel": "string",
- "generationInstruct": { },
- "hookIds": [
- "string"
]
}
]
Create extraction
Create a document extraction. This automatically creates a pipeline that corresponds to the instructions in the extraction.
Authorizations:
Request Body schema: application/json
Extraction request
| description | string The description of the extraction |
| compact | boolean Default: false Faster response but no confidences |
| highPrecision | boolean Default: false Used for higher precision but slower response |
| size | number Default: 1300 Image resolution in px. Higher leads to better precision but slower response. |
| parallelProcessing | boolean Default: false Pages will be processed in parallel. This leads to lower latency but context between pages will be lost. |
| intelligentBatching | boolean Default: false The AI will batch as many pages together as possible. This allows understanding content across pages. Will be ignored if parallelProcessing is true. |
| targetModel | string |
| generationInstruct | object The instruct JSON expression that is used to describe the extraction. |
| hookIds | Array of strings List of hook IDs to attach to executions created from this extraction. |
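The interaction between `parallelProcessing` and `intelligentBatching` can be made explicit on the client side. The following sketch builds an extraction request with the documented defaults and drops `intelligentBatching` when both flags are set, mirroring the documented server behaviour. The helper name is hypothetical, and the structure of `generationInstruct` is not described in this section, so it is passed through as an opaque dictionary.

```python
from typing import Any, Dict


def build_extraction_request(
    description: str,
    generation_instruct: Dict[str, Any],
    size: int = 1300,
    parallel_processing: bool = False,
    intelligent_batching: bool = False,
) -> Dict[str, Any]:
    """Build an extraction request body (hypothetical helper)."""
    if parallel_processing and intelligent_batching:
        # Documented behaviour: intelligentBatching is ignored when
        # parallelProcessing is true, so make that visible in the body.
        intelligent_batching = False
    return {
        "description": description,
        "size": size,  # image resolution in px, documented default 1300
        "parallelProcessing": parallel_processing,
        "intelligentBatching": intelligent_batching,
        # Schema not documented here; supplied unchanged by the caller.
        "generationInstruct": generation_instruct,
    }


extraction_body = build_extraction_request(
    "Invoice header fields", {}, parallel_processing=True, intelligent_batching=True
)
```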
Responses
Request samples
- Payload
{- "description": "string",
- "compact": false,
- "highPrecision": false,
- "size": 1300,
- "parallelProcessing": false,
- "intelligentBatching": false,
- "targetModel": "string",
- "generationInstruct": { },
- "hookIds": [
- "string"
]
}
Response samples
- 200
- 400
- 401
- 404
{- "pipelineId": "string"
}
Get specific extraction
Authorizations:
path Parameters
| extractionId required | string ID of an extraction |
Responses
Response samples
- 200
- 400
- 401
- 404
{- "extractionId": "string",
- "description": "string",
- "state": "string",
- "compact": false,
- "highPrecision": false,
- "size": 1300,
- "parallelProcessing": false,
- "intelligentBatching": false,
- "targetModel": "string",
- "generationInstruct": { },
- "hookIds": [
- "string"
]
}
Update extraction
Update an existing extraction and the corresponding pipeline.
Authorizations:
path Parameters
| extractionId required | string ID of an extraction |
Request Body schema: application/json
Extraction request
| description | string The description of the extraction |
| compact | boolean Default: false Faster response but no confidences |
| highPrecision | boolean Default: false Used for higher precision but slower response |
| size | number Default: 1300 Image resolution in px. Higher leads to better precision but slower response. |
| parallelProcessing | boolean Default: false Pages will be processed in parallel. This leads to lower latency but context between pages will be lost. |
| intelligentBatching | boolean Default: false The AI will batch as many pages together as possible. This allows understanding content across pages. Will be ignored if parallelProcessing is true. |
| targetModel | string |
| generationInstruct | object The instruct JSON expression that is used to describe the extraction. |
| hookIds | Array of strings List of hook IDs to attach to executions created from this extraction. |
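The extraction object returned by the read endpoints contains fields such as `extractionId` and `state` that do not appear in the update request schema. One way to update an extraction is therefore to fetch it, keep only the request fields, and apply the changes. The sketch below assumes the fetched extraction is available as a dictionary; `to_update_body` is a hypothetical helper, and whether the server tolerates extra fields is not stated in this reference.

```python
from typing import Any, Dict

# Field names taken from the extraction request schema above.
REQUEST_FIELDS = (
    "description", "compact", "highPrecision", "size", "parallelProcessing",
    "intelligentBatching", "targetModel", "generationInstruct", "hookIds",
)


def to_update_body(extraction: Dict[str, Any], **changes: Any) -> Dict[str, Any]:
    """Derive an update body from a fetched extraction plus field changes."""
    unknown = set(changes) - set(REQUEST_FIELDS)
    if unknown:
        raise ValueError(f"not part of the request schema: {sorted(unknown)}")
    # Drop response-only fields such as extractionId and state.
    body = {key: extraction[key] for key in REQUEST_FIELDS if key in extraction}
    body.update(changes)
    return body


fetched = {"extractionId": "ex-1", "state": "string", "description": "old", "size": 1300}
update_body = to_update_body(fetched, description="new", highPrecision=True)
```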
Responses
Request samples
- Payload
{- "description": "string",
- "compact": false,
- "highPrecision": false,
- "size": 1300,
- "parallelProcessing": false,
- "intelligentBatching": false,
- "targetModel": "string",
- "generationInstruct": { },
- "hookIds": [
- "string"
]
}
Response samples
- 200
- 400
- 401
- 404
{- "pipelineId": "string"
}
Suggest changes to an extraction
Authorizations:
path Parameters
| extractionId required | string ID of an extraction |
Request Body schema: application/json
Suggest a change to an existing extraction via natural language or examples
| media | Array of strings <base64> Array of Base64-encoded media files, used to derive a suggestion for a possible underlying extraction. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| documentContext | string |
| instruction | string |
| feedback | Array of objects (Feedback) |
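A suggestion request can combine a natural-language instruction with sample media and feedback notes. The sketch below is one possible way to assemble such a body; `build_suggestion_request` is a hypothetical helper, and it assumes that a Feedback entry may consist of a `note` alone, which this section does not confirm.

```python
import base64
from typing import Any, Dict, Iterable, Optional


def build_suggestion_request(
    instruction: str,
    document_context: Optional[str] = None,
    media: Iterable[bytes] = (),
    notes: Iterable[str] = (),
) -> Dict[str, Any]:
    """Build a suggestion body from an instruction and optional extras."""
    body: Dict[str, Any] = {"instruction": instruction}
    if document_context is not None:
        body["documentContext"] = document_context
    encoded = [base64.b64encode(blob).decode("ascii") for blob in media]
    if encoded:
        body["media"] = encoded
    # Assumption: a Feedback entry with only a note is acceptable.
    feedback = [{"note": note} for note in notes]
    if feedback:
        body["feedback"] = feedback
    return body


suggestion = build_suggestion_request(
    "Also extract the due date",
    document_context="Supplier invoices",
    notes=["Totals are sometimes read from the wrong column"],
)
```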
Responses
Request samples
- Payload
{- "media": [
- "string"
], - "documentContext": "string",
- "instruction": "string",
- "feedback": [
- {
- "note": "string",
- "example": {
- "media": [
- "string"
], - "example": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": null,
- "tables": [ ],
- "entities": [ ],
- "keyValueSet": null,
- "keyValueSets": [ ]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
}
}
]
}
Response samples
- 200
- 400
- 401
- 404
{- "extractionId": "string",
- "description": "string",
- "state": "string",
- "compact": false,
- "highPrecision": false,
- "size": 1300,
- "parallelProcessing": false,
- "intelligentBatching": false,
- "targetModel": "string",
- "generationInstruct": { },
- "hookIds": [
- "string"
]
}
Suggest changes to an extraction asynchronously
Authorizations:
path Parameters
| extractionId required | string ID of an extraction |
Request Body schema: application/json
Suggest a change to an existing extraction via natural language or examples
| media | Array of strings <base64> Array of Base64-encoded media files, used to derive a suggestion for a possible underlying extraction. The content type is detected automatically. PDF, DOCX, and PPTX files are rendered as images, which can then be processed within a pipeline. |
| documentContext | string |
| instruction | string |
| feedback | Array of objects (Feedback) |
Responses
Request samples
- Payload
{- "media": [
- "string"
], - "documentContext": "string",
- "instruction": "string",
- "feedback": [
- {
- "note": "string",
- "example": {
- "media": [
- "string"
], - "example": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": null,
- "tables": [ ],
- "entities": [ ],
- "keyValueSet": null,
- "keyValueSets": [ ]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
}
}
]
}
Response samples
- 200
- 400
- 401
- 404
{- "executionId": "string"
}
Improve an extraction with an example
Authorizations:
path Parameters
| extractionId required | string ID of an extraction |
Request Body schema: application/json
Add feedback via examples
| note | string |
| example | object (FeedbackExample) Feedback example |
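The nesting of this body is easy to get wrong, because the outer `example` object itself contains an `example` field holding the observation. The sketch below assembles that structure from a note, the source media, and a corrected observation. The helper name `build_feedback` is hypothetical; the nesting follows the request sample for this endpoint.

```python
import base64
from typing import Any, Dict, Iterable


def build_feedback(
    note: str,
    media: Iterable[bytes],
    corrected_observation: Dict[str, Any],
) -> Dict[str, Any]:
    """Wrap a corrected observation and its source media as feedback."""
    return {
        "note": note,
        "example": {  # FeedbackExample
            "media": [base64.b64encode(blob).decode("ascii") for blob in media],
            # The inner "example" holds the observation expected for the media.
            "example": corrected_observation,
        },
    }


feedback_body = build_feedback(
    "Total should include tax",
    [b"fake image bytes"],
    {"executionId": "execution-1", "mediaContents": []},
)
```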
Responses
Request samples
- Payload
{- "note": "string",
- "example": {
- "media": [
- "string"
], - "example": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": null,
- "entity": null,
- "tag": null,
- "headers": [ ],
- "rows": [ ],
- "refinement": null
}
], - "entities": [
- {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "keyValueSets": [
- {
- "id": null,
- "tag": null,
- "pairs": [ ],
- "entity": null,
- "refinement": null
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}
}
}
Response samples
- 400
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Get all indices
Get a list of all indices.
Authorizations:
query Parameters
| limit | integer Limits the number of indices on a page |
| offset | integer Specifies the page number of the indices to be displayed |
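As a minimal sketch, the two query parameters above could be turned into a query string as follows. The helper name page_query is hypothetical, and the endpoint path is deliberately left out because it is not shown in this section.

```python
from urllib.parse import urlencode


def page_query(limit=None, offset=None):
    """Build a query string from the documented limit/offset parameters."""
    params = {}
    if limit is not None:
        params["limit"] = int(limit)
    if offset is not None:
        params["offset"] = int(offset)
    return urlencode(params)


# Page 2 with 20 indices per page; whether page numbering starts at 0 or 1
# is not stated in this section.
print(page_query(limit=20, offset=2))
```

Omitting both arguments yields an empty string, so the request falls back to whatever defaults the server applies.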
Responses
Response samples
- 200
[
  {
    "indexId": "string",
    "semanticSearchFields": [
      "string"
    ]
  }
]
Create index
Create a new index with an optional schema.
Authorizations:
Request Body schema: application/json
Index creation request
| indexId | string Unique ID of the index. |
| semanticSearchFields | Array of strings A list of document fields whose text is semantically embedded. |
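A request body for this endpoint might be assembled as in the following sketch. The helper name index_creation_request and the field names "description" and "category" are illustrative assumptions; only indexId and semanticSearchFields come from the schema above.

```python
import json


def index_creation_request(index_id, semantic_search_fields=None):
    """Build the JSON body for index creation; the schema part is optional."""
    body = {"indexId": index_id}
    if semantic_search_fields:
        body["semanticSearchFields"] = list(semantic_search_fields)
    return body


# "description" and "category" are hypothetical document fields.
body = index_creation_request("Warehouse-Index", ["description", "category"])
print(json.dumps(body, indent=2))
```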
Responses
Request samples
- Payload
{
  "indexId": "string",
  "semanticSearchFields": [
    "string"
  ]
}
Response samples
- 400
- 401
{
  "code": "string",
  "message": "string"
}
Get index description
Get the description of an index, including the data model if present.
Authorizations:
path Parameters
| indexId required | string Example: Warehouse-Index ID of the index |
Responses
Response samples
- 200
- 400
- 401
- 404
{
  "indexId": "string",
  "semanticSearchFields": [
    "string"
  ]
}
Query an index with a search string
Query an index with a search string
Authorizations:
path Parameters
| indexId required | string Example: Warehouse-Index ID of the index |
Request Body schema: application/json
Query request
| fullText | object Fields used for full text search |
| semanticSearch | object A map that maps search strings to fields. |
| k | integer The number of results to return that are similar to the search string |
| searchPipelineId | string ID of the pipeline to be used for searching. If no pipeline is given, the default pipeline according to the number of search fields is selected. |
| pipelineParameters | object (PipelineExecutionObject) The execution is a stateful environment in which media (such as images or PDF files) can be stored and used as inputs for pipelines. The ID in the request body is optional (generated if empty) and must be unique. Nothing will be persisted if transient is true. To trigger the pipeline, either an execution ID containing valid media or base64-encoded media under the media property has to be provided. |
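The sketch below assembles a query body from the fields fullText, semanticSearch, k and searchPipelineId, leaving out anything that was not supplied. The helper name query_request is hypothetical. The orientation of the semanticSearch map (search string as key, field name as value) follows the literal wording of the description and is an assumption, as is the field name "description".

```python
import json


def query_request(full_text=None, semantic_search=None, k=None, search_pipeline_id=None):
    """Build a query body, dropping every field that was not supplied."""
    candidates = {
        "fullText": full_text,
        "semanticSearch": semantic_search,
        "k": k,
        "searchPipelineId": search_pipeline_id,
    }
    return {name: value for name, value in candidates.items() if value is not None}


# Ask for the five results most similar to "cordless drill".
# The key/value orientation of the map is an assumption (see above).
body = query_request(semantic_search={"cordless drill": "description"}, k=5)
print(json.dumps(body, indent=2))
```

Because searchPipelineId is omitted here, the default pipeline would be selected according to the number of search fields.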
Responses
Request samples
- Payload
{- "fullText": {
- "property1": "string",
- "property2": "string"
}, - "semanticSearch": {
- "property1": "string",
- "property2": "string"
}, - "k": 0,
- "searchPipelineId": "string",
- "pipelineParameters": {
- "executionId": "string",
- "tag": "string",
- "transient": true,
- "transform": "string",
- "tryImageConversion": false,
- "trySimpleText": false,
- "idempotent": false,
- "media": [
- "string"
], - "runtimeParameters": { },
- "observation": {
- "executionId": "execution-1",
- "mediaContents": [
- {
- "id": "media-1",
- "mediaId": "media-1",
- "documentPages": [
- {
- "page": 1,
- "document": {
- "id": "string",
- "tables": [
- {
- "id": null,
- "entity": null,
- "tag": null,
- "headers": [ ],
- "rows": [ ],
- "refinement": null
}
], - "entities": [
- {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}
], - "keyValueSet": {
- "id": "string",
- "tag": "string",
- "pairs": [
- null
], - "entity": {
- "id": null,
- "block": null,
- "confidence": null,
- "label": null,
- "type": null,
- "data": null,
- "embedding": [ ],
- "similarity": null,
- "layoutType": null,
- "refinement": null
}, - "refinement": {
- "addedPairs": [ ],
- "deletedPairIds": [ ],
- "comment": null,
- "timestamp": null,
- "source": null
}
}, - "keyValueSets": [
- {
- "id": null,
- "tag": null,
- "pairs": [ ],
- "entity": null,
- "refinement": null
}
]
}
}
], - "mediaHash": "string",
- "codes": [
- {
- "id": "string",
- "entity": {
- "id": "string",
- "block": {
- "text": "string",
- "geometry": {
- "polygon": null,
- "boundingBox": null
}
}, - "confidence": 0,
- "label": "string",
- "type": "STRING",
- "data": {
- "documentId": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24",
- "boolValue": true,
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24",
- "boolData": true,
- "field": "string",
- "score": 0,
- "sourceIndex": "string"
}, - "embedding": [
- 0
], - "similarity": {
- "type": "TEXT_SIM",
- "cosineSimilarity": 0,
- "amountDiff": 0,
- "numberDiff": 0,
- "same": true
}, - "layoutType": "WORD",
- "refinement": {
- "correctedText": "string",
- "correctedValue": "string",
- "correctedGeometry": {
- "polygon": null,
- "boundingBox": null
}, - "comment": "string",
- "timestamp": "2019-08-24T14:15:22Z",
- "source": "string"
}
}, - "tag": "string",
- "payload": "string",
- "type": "UPC_A"
}
], - "metaData": {
- "width": 0,
- "height": 0
}, - "label": {
- "index": 0,
- "name": "string",
- "confidence": 0
}, - "rawText": "string"
}
], - "documents": [
- {
- "document": { },
- "fields": [
- {
- "documentId": "string",
- "index": "string",
- "score": 0,
- "fieldName": "string",
- "textValue": "string",
- "quantityValue": 0,
- "numberValue": 0,
- "unitValue": "string",
- "dateValue": "2019-08-24T14:15:22Z",
- "textData": "string",
- "quantityData": 0,
- "numberData": 0,
- "unitData": "string",
- "dateData": "2019-08-24T14:15:22Z",
- "cosineSimilarity": 0,
- "quantityDiff": 0,
- "numberDiff": 0,
- "same": true,
- "entityId": "string"
}
], - "tag": "string",
- "score": 0
}
]
}, - "text": "string",
- "hookIds": [
- "string"
], - "entryPoint": "string"
}
}
Response samples
- 200
- 400
- 401
- 404
[
  {
    "indexId": "string",
    "source": { },
    "score": 0
  }
]
Get all documents
Authorizations:
path Parameters
| indexId required | string Example: Warehouse-Index ID of the index |
query Parameters
| limit | integer Limits the number of documents on a page |
| offset | integer Specifies the page number of the documents to be displayed |
Responses
Response samples
- 200
[
  {
    "indexId": "string",
    "source": { },
    "score": 0
  }
]
Create document
Create a JSON document in an index. If the index does not exist it will be created automatically.
Authorizations:
path Parameters
| indexId required | string Example: Warehouse-Index ID of the index |
Request Body schema: application/json
Document request
| indexId | string |
| source | object |
| score | number |
Responses
Request samples
- Payload
{
  "indexId": "string",
  "source": { },
  "score": 0
}
Response samples
- 400
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Create document
Create a JSON document in an index. If the index does not exist it will be created automatically.
Authorizations:
path Parameters
| indexId required | string Example: Warehouse-Index ID of the index |
Request Body schema: application/json
Document batch request
| indexId | string |
| source | object |
| score | number |
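The request sample for this endpoint is an array of document objects. A batch body could be built from plain dictionaries roughly as follows; the helper name document_batch and the sample fields ("sku", "description") are hypothetical.

```python
import json


def document_batch(index_id, sources):
    """Wrap each source dictionary in a document object for the given index."""
    return [{"indexId": index_id, "source": dict(source)} for source in sources]


# Two hypothetical warehouse items.
batch = document_batch(
    "Warehouse-Index",
    [
        {"sku": "A-100", "description": "Cordless drill"},
        {"sku": "A-101", "description": "Drill bit set"},
    ],
)
print(json.dumps(batch, indent=2))
```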
Responses
Request samples
- Payload
[
  {
    "indexId": "string",
    "source": { },
    "score": 0
  }
]
Response samples
- 400
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Create a hook
Create a hook that will be triggered when specific events occur on executions. Hooks can notify external services via webhooks when execution status changes, observations are updated, or when submit events are triggered.
Authorizations:
Request Body schema: application/json
Hook creation or update request
| hookId | string Unique identifier for the hook (auto-generated if not provided) |
| name | string Human-readable name for the hook |
| eventType required | string (Hook Event Type) Enum: "STATUS_CHANGE" "OBSERVATION_UPDATE" "SUBMIT" The type of event that triggers the hook |
| hookType required | string (Hook Type) Enum: "WEBHOOK" "KAFKA" "NATS" The delivery mechanism for the hook |
| endpoint required | string Target URL for webhook delivery |
| headers | object Custom headers to include in webhook requests (e.g., for authentication) |
| statusFilter | Array of strings For STATUS_CHANGE events: list of statuses that trigger the hook. If empty or not provided, the hook triggers on any status change. |
| payloadType required | string (Hook Payload Type) Enum: "RAW" "JSON" "TRANSFORM" "FORM_TRANSFORM" How to format the payload sent by the hook |
| transform | string JSONata expression to transform the observation before sending. Required when payloadType is TRANSFORM or FORM_TRANSFORM. For more information see http://docs.jsonata.org/overview.html |
| formPayloadKey | string Default: "payload" For FORM_TRANSFORM payloads: the form field name for the transformed JSON data. Defaults to "payload" if not specified. |
| formDocumentKey | string Default: "documents" For FORM_TRANSFORM payloads: the form field name prefix for media files. Multiple files will be named as {key}[0], {key}[1], etc. if there are multiple files. Defaults to "documents" if not specified. |
| enabled | boolean Default: true Whether the hook is active |
| createdAt | string <date-time> Timestamp when the hook was created (auto-set) |
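The constraints in the table above (required fields, enum values, and the rule that transform is required for TRANSFORM and FORM_TRANSFORM payloads) can be checked on the client before sending a request. The following is a sketch only: the function name hook_request_problems, the hook name, and the endpoint URL are hypothetical, and the server may apply further checks.

```python
EVENT_TYPES = {"STATUS_CHANGE", "OBSERVATION_UPDATE", "SUBMIT"}
HOOK_TYPES = {"WEBHOOK", "KAFKA", "NATS"}
PAYLOAD_TYPES = {"RAW", "JSON", "TRANSFORM", "FORM_TRANSFORM"}
REQUIRED = ("eventType", "hookType", "endpoint", "payloadType")


def hook_request_problems(hook):
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing required field: {name}" for name in REQUIRED if not hook.get(name)]
    enums = {"eventType": EVENT_TYPES, "hookType": HOOK_TYPES, "payloadType": PAYLOAD_TYPES}
    for name, allowed in enums.items():
        value = hook.get(name)
        if value and value not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    # transform is only mandatory for the two transforming payload types.
    if hook.get("payloadType") in ("TRANSFORM", "FORM_TRANSFORM") and not hook.get("transform"):
        problems.append("transform is required for TRANSFORM and FORM_TRANSFORM payloads")
    return problems


hook = {
    "name": "notify-on-finish",                         # hypothetical name
    "eventType": "STATUS_CHANGE",
    "hookType": "WEBHOOK",
    "endpoint": "https://example.com/hooks/cambrion",   # hypothetical URL
    "statusFilter": ["COMPLETED", "ERROR"],
    "payloadType": "TRANSFORM",
}
print(hook_request_problems(hook))
```

Adding a JSONata expression under transform clears the single reported problem.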
Responses
Request samples
- Payload
{
  "hookId": "string",
  "name": "string",
  "eventType": "STATUS_CHANGE",
  "hookType": "WEBHOOK",
  "headers": {
    "Authorization": "Bearer your-token",
    "X-Custom-Header": "custom-value"
  },
  "statusFilter": [
    "COMPLETED",
    "ERROR"
  ],
  "payloadType": "RAW",
  "transform": "string",
  "formPayloadKey": "payload",
  "formDocumentKey": "documents",
  "enabled": true,
  "createdAt": "2019-08-24T14:15:22Z"
}
Response samples
- 200
- 400
- 401
{
  "hookId": "string",
  "name": "string",
  "eventType": "STATUS_CHANGE",
  "hookType": "WEBHOOK",
  "headers": {
    "Authorization": "Bearer your-token",
    "X-Custom-Header": "custom-value"
  },
  "statusFilter": [
    "COMPLETED",
    "ERROR"
  ],
  "payloadType": "RAW",
  "transform": "string",
  "formPayloadKey": "payload",
  "formDocumentKey": "documents",
  "enabled": true,
  "createdAt": "2019-08-24T14:15:22Z"
}
Response samples
- 200
- 401
[
  {
    "hookId": "string",
    "name": "string",
    "eventType": "STATUS_CHANGE",
    "hookType": "WEBHOOK",
    "headers": {
      "Authorization": "Bearer your-token",
      "X-Custom-Header": "custom-value"
    },
    "statusFilter": [
      "COMPLETED",
      "ERROR"
    ],
    "payloadType": "RAW",
    "transform": "string",
    "formPayloadKey": "payload",
    "formDocumentKey": "documents",
    "enabled": true,
    "createdAt": "2019-08-24T14:15:22Z"
  }
]
Get a specific hook
Retrieve a specific hook by its ID.
Authorizations:
path Parameters
| hookId required | string ID of a hook |
Responses
Response samples
- 200
- 401
- 404
{
  "hookId": "string",
  "name": "string",
  "eventType": "STATUS_CHANGE",
  "hookType": "WEBHOOK",
  "headers": {
    "Authorization": "Bearer your-token",
    "X-Custom-Header": "custom-value"
  },
  "statusFilter": [
    "COMPLETED",
    "ERROR"
  ],
  "payloadType": "RAW",
  "transform": "string",
  "formPayloadKey": "payload",
  "formDocumentKey": "documents",
  "enabled": true,
  "createdAt": "2019-08-24T14:15:22Z"
}
Update a hook
Update an existing hook configuration.
Authorizations:
path Parameters
| hookId required | string ID of a hook |
Request Body schema: application/json
Hook creation or update request
| hookId | string Unique identifier for the hook (auto-generated if not provided) |
| name | string Human-readable name for the hook |
| eventType required | string (Hook Event Type) Enum: "STATUS_CHANGE" "OBSERVATION_UPDATE" "SUBMIT" The type of event that triggers the hook |
| hookType required | string (Hook Type) Enum: "WEBHOOK" "KAFKA" "NATS" The delivery mechanism for the hook |
| endpoint required | string Target URL for webhook delivery |
| headers | object Custom headers to include in webhook requests (e.g., for authentication) |
| statusFilter | Array of strings For STATUS_CHANGE events: list of statuses that trigger the hook. If empty or not provided, the hook triggers on any status change. |
| payloadType required | string (Hook Payload Type) Enum: "RAW" "JSON" "TRANSFORM" "FORM_TRANSFORM" How to format the payload sent by the hook |
| transform | string JSONata expression to transform the observation before sending. Required when payloadType is TRANSFORM or FORM_TRANSFORM. For more information see http://docs.jsonata.org/overview.html |
| formPayloadKey | string Default: "payload" For FORM_TRANSFORM payloads: the form field name for the transformed JSON data. Defaults to "payload" if not specified. |
| formDocumentKey | string Default: "documents" For FORM_TRANSFORM payloads: the form field name prefix for media files. Multiple files will be named as {key}[0], {key}[1], etc. if there are multiple files. Defaults to "documents" if not specified. |
| enabled | boolean Default: true Whether the hook is active |
| createdAt | string <date-time> Timestamp when the hook was created (auto-set) |
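Two of the rules in the table above lend themselves to a short illustration: the statusFilter semantics (an empty or missing filter matches any status change) and the form field naming for FORM_TRANSFORM payloads. This is a sketch of the documented behaviour, not the service's implementation. Both function names are hypothetical, "RUNNING" is a made-up status value, and the naming used for a single file (the bare key without an index) is an assumption.

```python
def status_matches(status_filter, status):
    """STATUS_CHANGE filter: an empty or missing filter matches every status."""
    return not status_filter or status in status_filter


def form_field_names(file_count, payload_key="payload", document_key="documents"):
    """Form field names for a FORM_TRANSFORM request carrying file_count files."""
    if file_count > 1:
        files = [f"{document_key}[{i}]" for i in range(file_count)]
    else:
        # Assumption: a single file uses the bare key without an index.
        files = [document_key] * file_count
    return [payload_key] + files


print(status_matches(["COMPLETED", "ERROR"], "RUNNING"))  # "RUNNING" is hypothetical
print(status_matches([], "RUNNING"))
print(form_field_names(2))
```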
Responses
Request samples
- Payload
{
  "hookId": "string",
  "name": "string",
  "eventType": "STATUS_CHANGE",
  "hookType": "WEBHOOK",
  "headers": {
    "Authorization": "Bearer your-token",
    "X-Custom-Header": "custom-value"
  },
  "statusFilter": [
    "COMPLETED",
    "ERROR"
  ],
  "payloadType": "RAW",
  "transform": "string",
  "formPayloadKey": "payload",
  "formDocumentKey": "documents",
  "enabled": true,
  "createdAt": "2019-08-24T14:15:22Z"
}
Response samples
- 200
- 400
- 401
- 404
{
  "hookId": "string",
  "name": "string",
  "eventType": "STATUS_CHANGE",
  "hookType": "WEBHOOK",
  "headers": {
    "Authorization": "Bearer your-token",
    "X-Custom-Header": "custom-value"
  },
  "statusFilter": [
    "COMPLETED",
    "ERROR"
  ],
  "payloadType": "RAW",
  "transform": "string",
  "formPayloadKey": "payload",
  "formDocumentKey": "documents",
  "enabled": true,
  "createdAt": "2019-08-24T14:15:22Z"
}
Register an uploaded model (currently internal only)
Authorizations:
Request Body schema: application/json
Model request
| name | string |
| description | object (ModelDescription) |
| config | object |
Responses
Request samples
- Payload
{
  "name": "receipt-pipeline",
  "description": {
    "name": "string",
    "description": "string",
    "version": 0,
    "resourceTag": "string"
  },
  "config": { }
}
Response samples
- 401
- 404
{
  "code": "string",
  "message": "string"
}
Copy a registered model
Authorizations:
path Parameters
| modelName required | string Name of the model |
Request Body schema: application/json
Model Copy request
| newName | string |
| newDescription | object (ModelDescription) |
| newConfig | object |
Responses
Request samples
- Payload
{
  "newName": "receipt-pipeline",
  "newDescription": {
    "name": "string",
    "description": "string",
    "version": 0,
    "resourceTag": "string"
  },
  "newConfig": { }
}
Response samples
- 401
- 404
{
  "code": "string",
  "message": "string"
}