Home issues
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues

Apache AirFlow for ingestion
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/40
2022-06-28T19:49:05Z
Stephen Whitley (Invited Expert)

# Using Apache Airflow to support Workflow Orchestration for Ingestion Workflows
## Status
- [ ] Initiated
- [ ] Proposed
- [X] Trialing
- [ ] Under review
- [X] Approved
- [ ] Retired
## Decision
We will use Apache Airflow for implementing and executing ingestion workflows. We will leverage the technology as an orchestration system. Airflow Operators can be either built in or developed as custom operators.
## Rationale
We need a cross-platform workflow orchestration system so that we can reuse both ingestion workflows and the operators (tasks) within those workflows. This will be an area of high reuse across OSDU members.
## Consequences
This is an open-source solution that will have to be managed within the OSDU Platform. For several providers, this technology will need to be configured and maintained as third-party technology, which will create operational complications.
## When to revisit
After R3, once we have successfully implemented some ingestion workflows, to assess value, flexibility, and costs.

M1 - Release 0.1

Ingestion Framework Guiding Principles
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/18
2023-03-09T18:15:53Z
Stephen Whitley (Invited Expert)
## Ingestion Framework Guiding Principles
These guiding principles are shared by all the SLB-authored user stories reflecting the capabilities of OpenDES.
* Simple, dedicated, and efficient APIs as ingestion entry points for each data format
* Store original high-fidelity data as is. Data in its original form must land first in the most appropriate store. For example, a DLIS file must land in the File DMS, a ZGY seismic survey must land in the Seismic DMS, and so on. Once the data, in its original form, lands in the appropriate DMS, Parsers/Scanners/Enrichment processes can be applied to it. The original data is stored as is, and parser/scanner/clean-up processes output derived entities.
* The framework should inherently enforce the data lifecycle: Original ---> Well Known Structure ---> Well Known Entity
* Extensibility of Parsers/Scanners/Enrichment processes through Configurations/Registrations
* Track the flow and lifecycle of data in the data platform

M1 - Release 0.1
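
The principles above can be illustrated with a minimal Python sketch. It is not the OSDU or OpenDES implementation. The names `Stage`, `Record`, `ORIGINAL_STORES`, `register`, `land_original`, and `advance` are all hypothetical, and the assumption that a derived record carries the same store label as its source is made only to keep the example small. The sketch shows original data landing in a store chosen by format, processors being added through registration, the lifecycle order being enforced, and lineage being tracked.

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Callable, Dict, Tuple


class Stage(IntEnum):
    """Lifecycle stages named in the guiding principles."""
    ORIGINAL = 1
    WELL_KNOWN_STRUCTURE = 2
    WELL_KNOWN_ENTITY = 3


@dataclass(frozen=True)  # frozen: a landed record is never modified in place
class Record:
    name: str
    data_format: str
    stage: Stage
    store: str
    lineage: Tuple[str, ...] = ()  # names of the records this one derives from


# Hypothetical routing table: format -> store for the original data.
ORIGINAL_STORES: Dict[str, str] = {"DLIS": "File DMS", "ZGY": "Seismic DMS"}

Processor = Callable[[Record], Record]
_PROCESSORS: Dict[Tuple[str, Stage], Processor] = {}


def register(data_format: str, source_stage: Stage):
    """Register a parser/scanner/enrichment step for a format and stage."""
    def decorator(fn: Processor) -> Processor:
        _PROCESSORS[(data_format, source_stage)] = fn
        return fn
    return decorator


def land_original(name: str, data_format: str) -> Record:
    """Land data as is in the store configured for its format."""
    store = ORIGINAL_STORES[data_format]  # KeyError if the format is unknown
    return Record(name, data_format, Stage.ORIGINAL, store)


def advance(record: Record) -> Record:
    """Run the registered processor and enforce a single-stage step."""
    processor = _PROCESSORS.get((record.data_format, record.stage))
    if processor is None:
        raise LookupError(
            f"no processor registered for {record.data_format} at {record.stage.name}"
        )
    derived = processor(record)
    if derived.stage != record.stage + 1:
        raise ValueError("a processor must advance exactly one lifecycle stage")
    # Track the flow: the derived record remembers where it came from.
    return replace(derived, lineage=record.lineage + (record.name,))


@register("DLIS", Stage.ORIGINAL)
def parse_dlis(record: Record) -> Record:
    # Placeholder parser: emits a derived record and leaves the original alone.
    return replace(
        record, name=record.name + ".parsed", stage=Stage.WELL_KNOWN_STRUCTURE
    )


original = land_original("well-1.dlis", "DLIS")
structured = advance(original)
```

Supporting a new format in this sketch means adding an entry to `ORIGINAL_STORES` and registering processors for it. `advance` itself does not change, which is the kind of extensibility through registration that the principles describe.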