[Ingestion] Declarative Workflows, reusable operations
- Under review
Context & Scope
Bringing data into the OSDU Data Platform and enriching it for future use is a core value of the platform.
There are two broad ingestion patterns: ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform).
ETL
ETL favors the use of (multiple) external tools and minimizes the need for the data platform to provide transformation services, since the transformation burden remains external to the system.
Cons
- Operations are difficult to reuse, since they are typically bound to specific tooling
- The knowledge that goes into the transforms remains outside the system
ELT
Pros
- Keeps the knowledge of how the transforms are made within the system
- Allows data platform users to share workflows and operations
- Because the data platform encourages continuous improvement of data already in the platform (enrichment), the same operations can be reused post-ingestion to enrich that data
Cons
- Requires significant development activity
Decision
The OSDU Data Platform will provide an ingestion framework that lets a data manager define and execute workflows that bring data of various types, from various sources, into the data platform. The framework and its supporting operations will be built up over time, allowing the current ETL requirement to shift towards the ELT(T) model.
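
To make the idea of declarative workflows built from reusable operations more concrete, here is a minimal sketch in Python. It is illustrative only and does not represent the OSDU ingestion framework or any of its APIs: the operation registry, the `operation` decorator, `run_workflow`, the workflow dictionaries, and the record fields (`depth`, `depth_unit`, `tags`) are all hypothetical names chosen for this example. The workflow is plain data (an ordered list of operation names), and the same operation appears in both an ingestion workflow and a post-ingestion enrichment workflow.

```python
from typing import Callable, Dict

# A record is modelled as a plain dictionary (an assumption for this sketch).
Record = Dict[str, object]
Operation = Callable[[Record], Record]

# Hypothetical registry of reusable operations, keyed by name.
OPERATIONS: Dict[str, Operation] = {}


def operation(name: str) -> Callable[[Operation], Operation]:
    """Register a function as a named, reusable operation."""
    def register(func: Operation) -> Operation:
        OPERATIONS[name] = func
        return func
    return register


@operation("normalize_units")
def normalize_units(record: Record) -> Record:
    # Convert a depth given in feet to metres; other records pass through.
    if record.get("depth_unit") == "ft":
        metres = round(float(record["depth"]) * 0.3048, 3)
        return {**record, "depth": metres, "depth_unit": "m"}
    return record


@operation("tag_ingested")
def tag_ingested(record: Record) -> Record:
    # Append a marker tag without mutating the input record.
    return {**record, "tags": [*record.get("tags", []), "ingested"]}


def run_workflow(workflow: Dict[str, object], record: Record) -> Record:
    """Apply each named step of a declarative workflow in order."""
    for step in workflow["steps"]:
        record = OPERATIONS[step](record)
    return record


# Workflows are data, not code: an ordered list of operation names.
INGEST_WORKFLOW = {"name": "ingest_well", "steps": ["normalize_units", "tag_ingested"]}
# The same operation is reused later to enrich data already in the platform.
ENRICH_WORKFLOW = {"name": "enrich_well", "steps": ["normalize_units"]}
```

Because a workflow only names its steps, it can be shared between users and stored alongside the data, and the knowledge of how each transform works stays with the registered operations rather than in an external tool.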