Commit 39d136e2 authored by Siarhei Khaletski (EPAM)

GONRG-756: Added Licence link

parent 4204be2b
1 merge request: !5 README.md updates (GONRG-756)
Pipeline #11752 failed
......@@ -21,6 +21,8 @@
* * [Workflow Status Operator](#workflow-status-operator)
* * [Stale Jobs Scheduler](#stale-jobs-scheduler)
* * [Workflow Finished Sensor operator](#workflow-finished-sensor-operator)
* [Licence](#licence)
## Introduction
......@@ -53,12 +55,14 @@ Environment dependencies might be installed by several ways:
2. Setting up an environment in the Cloud Composer Console.
3. Installing a local Python library. Put your dependencies into the *DAG_FOLDER/libs* directory. Airflow automatically adds *DAG_FOLDER* and *PLUGINS_FOLDER* to the Python module search path (`sys.path`).
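
The third option can be illustrated with a minimal sketch. It simulates a DAG folder in a temporary directory and shows that a module placed under `libs` becomes importable once the DAG folder is on Python's module search path. The module name `osdu_helpers` and its `greet` function are hypothetical and exist only for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Simulate DAG_FOLDER with a temporary directory (assumption: a real
# deployment would use the DAG folder configured for Airflow instead).
dag_folder = Path(tempfile.mkdtemp())
libs_dir = dag_folder / "libs"
libs_dir.mkdir()

# Make "libs" a package and add a hypothetical helper module to it.
(libs_dir / "__init__.py").write_text("")
(libs_dir / "osdu_helpers.py").write_text(
    "def greet(name):\n    return 'Hello, ' + name\n"
)

# Airflow puts the DAG folder on the module search path; emulate that here.
sys.path.insert(0, str(dag_folder))
importlib.invalidate_caches()

# A DAG file could now import the local library like this.
from libs.osdu_helpers import greet

message = greet("OSDU")
print(message)
```

Inside a real Airflow deployment the `sys.path` manipulation is not needed; only the files under *DAG_FOLDER/libs* and the import statement remain.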
## DAG Implementation Details
OSDU DAGs are cloud platform-agnostic by design. However, individual cloud platforms impose specific implementation
requirements, so the OSDU R2 Prototype provides a dedicated Python SDK that keeps the DAGs independent of any
particular cloud platform. This Python SDK is located in a separate [os-python-sdk] folder.
## Required Variables
### Internal Services
Some of the operators send requests to internal services. Hosts and endpoints are specified in Airflow Variables.
......@@ -79,6 +83,7 @@ Some of the operators send requests to internal services. Hosts and endpoints ar
| provider | Needed to properly initialize the OSDU SDK |
| entitlements_module_name | Needed to properly initialize the OSDU SDK |
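
As a rough illustration of how a DAG might collect these SDK-related variables before initializing the SDK, the sketch below reads `provider` and `entitlements_module_name` through a lookup callable and fails early if either is missing. The function name `load_sdk_settings` and the placeholder values are assumptions made for this example; they are not part of the repository.

```python
from typing import Callable, Dict, Optional

# Variable names taken from the table above.
SDK_VARIABLES = ("provider", "entitlements_module_name")


def load_sdk_settings(get_variable: Callable[[str], Optional[str]]) -> Dict[str, str]:
    """Collect the SDK-related variables, failing early if one is missing."""
    settings = {}
    for name in SDK_VARIABLES:
        value = get_variable(name)
        if not value:
            raise ValueError(f"Airflow Variable '{name}' is not set")
        settings[name] = value
    return settings


# Stand-in for Airflow's variable store so the sketch runs without Airflow.
# The values below are placeholders, not real configuration.
fake_variables = {
    "provider": "example-provider",
    "entitlements_module_name": "example_entitlements_module",
}
settings = load_sdk_settings(fake_variables.get)
print(settings)
```

Inside a DAG, the lookup callable could wrap Airflow's own store, for example `lambda name: Variable.get(name, default_var=None)` with `Variable` imported from `airflow.models`.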
## Testing
### Running Unit Tests
~~~
......@@ -88,7 +93,6 @@ tests/./set_airflow_env.sh
chmod +x tests/test_dags.sh && tests/./test_dags.sh
~~~
### Running E2E Tests
~~~
tests/./set_airflow_env.sh
......@@ -118,7 +122,6 @@ The Opaque Ingestion DAG flow:
status to **finished** in the database.
### Manifest Ingestion DAG
The Manifest Ingestion DAG ingests multiple files with their metadata provided in an OSDU manifest. The following
diagram demonstrates the workflow of the Manifest
Ingestion DAG.
......@@ -149,9 +152,9 @@ Upon an execution request:
6. Invoke the Workflow Status Operator with the **finished** job status.
* The Workflow Status Operator queries the Workflow service to set the new workflow status.
## Operators Description
### Workflow Status Operator
The Workflow Status Operator is an Airflow operator callable from each DAG. Its purpose is to receive the latest status
of a workflow job and then update the workflow record in the database. Each DAG in the system has to invoke the Workflow
Status Operator to update the workflow status.
......@@ -160,7 +163,6 @@ This operator isn't designed to directly update the status in the database, and
service's API endpoint. Once the operator sends a request to update status, it cedes control back to the DAG.
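
A minimal sketch of the kind of request such an operator might prepare is shown below. It only builds the HTTP request and does not send it. The endpoint path `/updateWorkflowStatus`, the payload field names, and the host are assumptions for illustration; the actual contract is defined by the Workflow service.

```python
import json
import urllib.request


def build_status_update_request(
    base_url: str, workflow_id: str, status: str
) -> urllib.request.Request:
    """Build (but do not send) a status update request for a workflow job."""
    # Hypothetical payload field names; check the Workflow service API.
    payload = json.dumps({"WorkflowID": workflow_id, "Status": status}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/updateWorkflowStatus",  # hypothetical endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and workflow ID, used only to show the call.
request = build_status_update_request("https://workflow.example.com", "wf-123", "finished")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) and handling the response are left out, since the operator hands control back to the DAG once the request is made.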
### Stale Jobs Scheduler
The Stale Jobs Scheduler is designed to query Apache Airflow to find any stale workflow jobs, that is, jobs that
failed during execution but whose status wasn't updated to **failed** in the database.
......@@ -175,13 +177,13 @@ The Stale Jobs Scheduler workflow:
3. If Airflow returns the failed status for a workflow job, query Firestore to set the workflow status to FAILED.
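
The core comparison behind this workflow can be sketched as a pure function: given the statuses recorded in the database and the states reported by Airflow, return the jobs that Airflow considers failed but the database does not. The function name, the dictionary-based inputs, and the lowercase status strings are assumptions for this example; the real scheduler talks to Airflow and Firestore directly.

```python
from typing import Dict, List


def find_stale_jobs(
    db_statuses: Dict[str, str], airflow_states: Dict[str, str]
) -> List[str]:
    """Return IDs of jobs that failed in Airflow but are not marked failed in the database."""
    stale = []
    for workflow_id, db_status in db_statuses.items():
        if db_status == "failed":
            continue  # The failure is already recorded; nothing to fix.
        if airflow_states.get(workflow_id) == "failed":
            stale.append(workflow_id)
    return stale


# Illustrative data only.
db_statuses = {"wf-a": "running", "wf-b": "failed", "wf-c": "submitted"}
airflow_states = {"wf-a": "failed", "wf-b": "failed", "wf-c": "success"}
stale_jobs = find_stale_jobs(db_statuses, airflow_states)
print(stale_jobs)
```

Each ID returned by such a function would then have its status set to failed in the database.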
### Workflow Finished Sensor Operator
The Workflow Finished Sensor operator is a special type of operator that monitors ingestion of a file during the "osdu"
ingestion workflow. Once a file is ingested, this operator notifies the DAG, which then starts ingestion of the next
file in the manifest.
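
The sequencing described above can be sketched without Airflow as a simple check-and-continue loop. The helper names, the `"finished"` status value, and the `max_checks` limit are assumptions for illustration; an actual Airflow sensor would implement the check in its `poke` method and wait between attempts.

```python
from typing import Callable, Dict, Iterable, List


def is_file_ingested(file_id: str, get_status: Callable[[str], str]) -> bool:
    """Sensor-style check: True once the file's ingestion has finished."""
    return get_status(file_id) == "finished"


def ingest_sequentially(
    file_ids: Iterable[str],
    start_ingestion: Callable[[str], None],
    get_status: Callable[[str], str],
    max_checks: int = 10,
) -> List[str]:
    """Start each file only after the previous one is reported as ingested."""
    done = []
    for file_id in file_ids:
        start_ingestion(file_id)
        for _ in range(max_checks):
            if is_file_ingested(file_id, get_status):
                done.append(file_id)
                break
        else:
            raise TimeoutError(f"File '{file_id}' was not ingested in time")
    return done


# In-memory stand-ins for the ingestion backend, for demonstration only.
statuses: Dict[str, str] = {}


def fake_start(file_id: str) -> None:
    statuses[file_id] = "finished"  # Pretend ingestion completes immediately.


ingested = ingest_sequentially(["file-1", "file-2"], fake_start, lambda f: statuses.get(f, "pending"))
print(ingested)
```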
[os-python-sdk]: ../os-python-sdk
## Licence
Copyright © Google LLC
Copyright © EPAM Systems
......