[Ingestion] Declarative Workflows, reusable operations

Declarative Workflows, reusable operations

Status

  • Initiated
  • Proposed
  • Trialing
  • Under review
  • Approved
  • Retired

Context & Scope

Bringing data into the OSDU Data Platform and enriching it for future use is a core value of the data platform.

There are two broad ingestion patterns: ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform).

ETL

Benefit

  • Favors external tooling (often several tools) and minimizes the platform's need to provide transformation services, since the transformation burden stays outside the system

Consequence

  • Difficult to reuse operations since these are typically bound to specific tooling
  • The knowledge that goes into the transforms remains outside the system

ELT

Benefit

  • Keeps the knowledge of how the transforms are made within the system
  • Allows data platform users to share workflows and operations
  • Because the data platform encourages continuous improvement of data already in the platform (enrichment), the same operations can be reused after ingestion to enrich that data

Consequence

  • Requires significant development effort to build the framework and its operations within the platform

Decision

The OSDU Data Platform will provide an ingestion framework that lets a data manager define and execute workflows for bringing data of various types, and from various sources, into the data platform. The framework and its supporting operations will be built up over time, allowing the current ETL requirement to shift towards the ELT(T) model.
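
To make the idea of declarative workflows built from reusable operations concrete, here is a minimal sketch. It is not the OSDU ingestion framework or any of its APIs. Every name in it (OPERATIONS, operation, run_workflow, the rename_field and convert_units operations, the record fields) is hypothetical and chosen for illustration only. It assumes a workflow is plain data listing named operations with parameters, and that the same operations can serve both an ingestion workflow and a later enrichment workflow.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types: a record is a plain dict, an operation transforms one record.
Record = Dict[str, Any]
Operation = Callable[[Record, Dict[str, Any]], Record]

# Hypothetical registry of reusable operations, keyed by name.
OPERATIONS: Dict[str, Operation] = {}


def operation(name: str) -> Callable[[Operation], Operation]:
    """Register a function as a named, reusable operation."""
    def register(fn: Operation) -> Operation:
        OPERATIONS[name] = fn
        return fn
    return register


@operation("rename_field")
def rename_field(record: Record, params: Dict[str, Any]) -> Record:
    # Copy the record so the caller's input is not mutated.
    out = dict(record)
    out[params["to"]] = out.pop(params["from"])
    return out


@operation("convert_units")
def convert_units(record: Record, params: Dict[str, Any]) -> Record:
    out = dict(record)
    out[params["field"]] = out[params["field"]] * params["factor"]
    return out


def run_workflow(workflow: Dict[str, Any], records: List[Record]) -> List[Record]:
    """Apply each step of a declarative workflow, in order, to every record."""
    results = []
    for record in records:
        for step in workflow["steps"]:
            op = OPERATIONS[step["operation"]]
            record = op(record, step.get("params", {}))
        results.append(record)
    return results


# A workflow is data, not code: a data manager could author it as JSON or YAML.
INGEST_WELL = {
    "name": "ingest_well",
    "steps": [
        {"operation": "rename_field", "params": {"from": "depth_ft", "to": "depth_m"}},
        {"operation": "convert_units", "params": {"field": "depth_m", "factor": 0.3048}},
    ],
}

# The same operations reused after ingestion to enrich stored records.
ENRICH_ELEVATION = {
    "name": "enrich_elevation",
    "steps": [
        {"operation": "rename_field", "params": {"from": "elevation_ft", "to": "elevation_m"}},
        {"operation": "convert_units", "params": {"field": "elevation_m", "factor": 0.3048}},
    ],
}

if __name__ == "__main__":
    ingested = run_workflow(INGEST_WELL, [{"well": "A-1", "depth_ft": 1000}])
    print(ingested)
    stored = [{"well": "A-1", "elevation_ft": 100}]
    print(run_workflow(ENRICH_ELEVATION, stored))
```

Keeping the workflow definition as data means the transformation knowledge lives inside the platform and can be shared, and new operations can be added to the registry over time without changing the workflows that already exist.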

Rationale

Consequences

When to revisit

Edited by Stephen Whitley (Invited Expert)