
Data Ingestion
These services ingest new data into the data ecosystem.
- I
The Ingestion Workflow service wraps Apache Airflow functions and carries out preliminary work with files before running the Airflow Directed Acyclic Graphs (DAGs) that perform the actual ingestion of OSDU data.
- M
The Manifest Ingestion DAG includes a Workflow Engine, an implementation of Apache Airflow, to orchestrate manifest/metadata ingestion via the Storage Service.
- M
[Experimental] A DAG for Manifest Ingestion that implements the same functionality as Basic Manifest Ingestion, but uses KubernetesPodOperator and doesn't require installing any dependencies into the Airflow environment.
- R
[Experimental] This project contains an Airflow DAG for Raster Ingestion, the outcome of an ML proof of concept (POC) by the EPAM team.
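
As a rough illustration of what "running an Airflow DAG" from a wrapper service can look like, the sketch below builds (but does not send) a request that would trigger one DAG run through the Airflow 2.x stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). This is not necessarily how the Ingestion Workflow service is implemented. The host name, the DAG ID `manifest_ingestion_example`, the run ID, and the `manifest_path` key are all hypothetical, and authentication is left out.

```python
import json
from urllib.request import Request


def build_dag_run_request(airflow_base_url, dag_id, run_id, conf):
    """Build (but do not send) a POST request that would trigger one DAG run.

    Assumes the Airflow 2.x stable REST API; other Airflow versions may
    expose a different path.
    """
    url = f"{airflow_base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # "conf" is passed to the DAG run as its run-time configuration.
    body = json.dumps({"dag_run_id": run_id, "conf": conf}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are placeholders chosen for illustration only.
request = build_dag_run_request(
    "http://airflow.example.internal:8080",
    "manifest_ingestion_example",
    "run-001",
    {"manifest_path": "manifests/example.json"},
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the sketch runnable without a live Airflow instance. A real caller would add credentials and pass the request to `urllib.request.urlopen` or an HTTP client of its choice.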