Batch processing
Before running a pipeline you can populate an execution with media files (e.g. images, PDFs) which can subsequently be used in the pipeline. This enables simple and efficient batch processing.
An execution is a state store
When executing a pipeline we use the execution as a store to persist input documents (media objects) and to store the results (as observations) from one or more pipeline runs on those documents.
Attach executions to document workflows
Sometimes a document workflow contains multiple pipelines with non-overlapping functionality. When these pipelines need to run on the same document, executions can be used to organize that.
Creating an execution
You can add metadata fields to any execution when creating it. There are three levels of metadata that can be added via the API:
- Generate an ID and set it in the `executionId` field.
- Add a tag in the `tag` field. When querying executions, the tag can be used as a filter option.
- Attach a generic JSON object to the execution. This can be anything as long as it is compatible with JSON.
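To illustrate the three levels together, a request body might be assembled like this. The field names match the creation example below; generating the ID from a UUID is just one possible convention, and the `metaData` contents here are made up for illustration:

```python
import json
import uuid

# Build a payload covering all three metadata levels:
payload = {
    # 1. A caller-generated ID (here derived from a random UUID)
    "executionId": f"invoice-run-{uuid.uuid4()}",
    # 2. A tag that can later be used as a filter when querying executions
    "tag": "special-workflow",
    # 3. An arbitrary JSON-compatible object
    "metaData": {"department": "finance", "priority": 2},
}

body = json.dumps(payload)
```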
Here is an example of how to create an execution:
import json

import requests

url = "https://api.cambrion.io/v1/executions"
CAMBRION_API_KEY = "INSERT_API_KEY"

headers = {
    "Accept": "application/json",
    "Content-Type": "application/json",
    "X-API-Key": CAMBRION_API_KEY,
}

payload = json.dumps({
    "executionId": "my-process-12345",  # this ID can be used later
    "tag": "special-workflow",
    "metaData": {
        "good_is_danger": False,
        "user": "Lisa Meier",
    },
})

response = requests.post(url, headers=headers, data=payload)
response.raise_for_status()
Later, the execution can be accessed via its `executionId`.
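As a sketch, retrieving the execution by its ID might look like the following. The endpoint path (`/v1/executions/{executionId}`) is an assumption derived from the POST URL above, so verify it against the API reference; the snippet only prepares the request rather than sending it:

```python
import requests

CAMBRION_API_KEY = "INSERT_API_KEY"
execution_id = "my-process-12345"

# Hypothetical GET endpoint, inferred from the creation URL above.
req = requests.Request(
    "GET",
    f"https://api.cambrion.io/v1/executions/{execution_id}",
    headers={"Accept": "application/json", "X-API-Key": CAMBRION_API_KEY},
)

# Preparing the request shows the final URL without making a network call;
# calling requests.get(...) with the same arguments would send it.
prepared = req.prepare()
```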
Using executions as a document archive
Since you add documents to executions in the form of media files, it is natural to use executions as a way to organize and store documents.
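For example, the tag set at creation time could be used to pull up all executions belonging to one archive. Passing the tag as a `tag` query parameter on the list endpoint is an assumption here (the text only says the tag "can be used as a filter option"), so check the API reference for the actual parameter name:

```python
import requests

CAMBRION_API_KEY = "INSERT_API_KEY"

# Hypothetical: filter the execution list by the tag set at creation time.
req = requests.Request(
    "GET",
    "https://api.cambrion.io/v1/executions",
    params={"tag": "special-workflow"},
    headers={"Accept": "application/json", "X-API-Key": CAMBRION_API_KEY},
)

# Prepare (but do not send) the request to inspect the resulting URL,
# e.g. ".../v1/executions?tag=special-workflow".
prepared = req.prepare()
```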