@@ -176,6 +176,38 @@ In this quickstart guide, we will use the [open-test-data](https://community.ope
}
```
#### Batch Mode for Manifest-Based Ingestion
In the example above, `executionContext.manifest` is a dictionary representing a single manifest (`osdu:wks:Manifest:1.0.0`) object. In this mode, the Osdu_ingest DAG follows a single execution path to process that one manifest. The DAG also accepts a list of manifests, which enables batch mode: the manifests in the list are processed in parallel.
When dealing with large datasets, it is more efficient to split the data across multiple manifest files and pass them as an array to the workflow service. Passing an array triggers batch mode, and because each manifest is processed in parallel, ingestion can be significantly faster than submitting one large manifest.
Here’s an example of a batch mode input format for the workflow run API (simplified for illustration):
```json
{
  "executionContext": {
    "Payload": {
      "AppKey": "test-app",
      "data-partition-id": "{{data-partition-id}}"
    },
    "manifest": [
      {
        "kind": "{{authority}}:wks:Manifest:1.0.0",
        "MasterData": [...]
      },
      {
        "kind": "{{authority}}:wks:Manifest:1.0.0",
        "MasterData": [...]
      },
      ...
    ]
  }
}
```
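The request body above can also be assembled programmatically. The following is a minimal sketch, not part of the loading tooling: the helper names `split_into_manifests` and `build_batch_payload`, the `chunk_size` parameter, and the placeholder records are all hypothetical, and the sketch assumes only `MasterData` needs to be split. It only builds the JSON body; sending it to the workflow run API is left out.

```python
import json
from typing import Any

# Manifest kind suffix taken from the example above; the authority is a parameter.
MANIFEST_KIND_SUFFIX = "wks:Manifest:1.0.0"


def split_into_manifests(
    master_data: list[dict[str, Any]], chunk_size: int, authority: str = "osdu"
) -> list[dict[str, Any]]:
    """Split a list of MasterData records into several smaller manifests (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        {
            "kind": f"{authority}:{MANIFEST_KIND_SUFFIX}",
            "MasterData": master_data[start:start + chunk_size],
        }
        for start in range(0, len(master_data), chunk_size)
    ]


def build_batch_payload(
    manifests: list[dict[str, Any]], app_key: str, data_partition_id: str
) -> dict[str, Any]:
    """Wrap a list of manifests in the batch mode request body shown above (hypothetical helper)."""
    return {
        "executionContext": {
            "Payload": {
                "AppKey": app_key,
                "data-partition-id": data_partition_id,
            },
            # A list (rather than a single dictionary) is what selects batch mode.
            "manifest": manifests,
        }
    }


if __name__ == "__main__":
    # Placeholder records for illustration only; real records follow the OSDU schemas.
    records = [{"id": f"record-{n}"} for n in range(5)]
    manifests = split_into_manifests(records, chunk_size=2)
    payload = build_batch_payload(manifests, "test-app", "my-partition")
    print(json.dumps(payload, indent=2))
```

With five records and a `chunk_size` of 2, the sketch produces three manifests, the last of which holds a single record. A suitable chunk size depends on the deployment and is not specified here.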
## Tutorial
This section runs through the common tasks in data loading and ingestion. Refer to the links in each section to dive deeper.