ADR: Implement Airflow facade endpoint
Context
OSDU Platform uses Apache Airflow for orchestration of various data ingestion and processing jobs.
Problem statement
Currently, the OSDU Airflow component does not support data isolation for multi-tenant deployments. The Airflow administrative UI is available to all users and makes it possible to observe the processing data of every existing tenant, which may cause data leaks and security issues.
Proposal of the solution
The proposal is to introduce a facade that replaces the Airflow admin UI and collects job execution information (namely the resulting XCom variables) through the Airflow REST API in a tenant-specific way. This requires a new endpoint in the Workflow service API that collects the details of a DAG run using the existing Airflow REST API v2.
The new API endpoint /v1/workflow/{workflow_name}/workflowRun/{runId}/lastInfo should implement the following business logic:
- Get the internal workflow run entity with getWorkflowRunByName and check that submittedBy matches the user identified in the request header; otherwise return 401 NOT_AUTHORIZED
- Get list of all task instances with /dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances where dag_id is workflow_name and dag_run_id is runId
- Select the task instance with the latest end_date
- With the task_id of the selected task instance, get the list of XCom entry keys with /dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances/{task_id}/xcomEntries
- Obtain the XCom values by their keys using /dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances/{task_id}/xcomEntries/{xcom_key}
- Return the details of the task instance selected in step 3, combined with the map of XCom values, in a single JSON response
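The steps above can be sketched roughly as follows. This is only an illustration in Python, not the Workflow service implementation. The function name last_info, the injected airflow_get callable (a hypothetical helper that performs a GET against the Airflow REST API and returns the decoded JSON body), the response field names task_instances, xcom_entries, key and value, and the shape of the returned JSON are all assumptions. How to handle a run with no finished task instance is not covered by this ADR; the sketch simply raises an error.

```python
from datetime import datetime
from typing import Any, Callable, Dict
from urllib.parse import quote

# Hypothetical type: takes an Airflow REST API path, returns the decoded JSON body.
AirflowGet = Callable[[str], Dict[str, Any]]


def _parse(ts: str) -> datetime:
    # Normalise a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def last_info(
    workflow_name: str,
    run_id: str,
    user: str,
    workflow_run: Dict[str, Any],  # entity assumed to come from getWorkflowRunByName
    airflow_get: AirflowGet,
) -> Dict[str, Any]:
    # Step 1: only the user who submitted the run may read its details.
    if workflow_run.get("submittedBy") != user:
        raise PermissionError("401 NOT_AUTHORIZED")

    # Step 2: list all task instances of the DAG run.
    base = f"/dags/{workflow_name}/dagRuns/{run_id}/taskInstances"
    tasks = airflow_get(base)["task_instances"]  # assumed field name

    # Step 3: pick the task instance with the latest end_date,
    # ignoring instances that have not finished yet (end_date is null).
    finished = [t for t in tasks if t.get("end_date")]
    if not finished:
        raise LookupError("no finished task instance in this run")
    last = max(finished, key=lambda t: _parse(t["end_date"]))

    # Step 4: list the XCom keys of the selected task instance.
    xcom_base = f"{base}/{last['task_id']}/xcomEntries"
    keys = [e["key"] for e in airflow_get(xcom_base)["xcom_entries"]]  # assumed field names

    # Step 5: fetch each XCom value by its key.
    xcom = {k: airflow_get(f"{xcom_base}/{quote(k, safe='')}")["value"] for k in keys}

    # Step 6: combine task instance details and XCom values in one response body.
    return {"task_instance": last, "xcom": xcom}
```

Passing the HTTP call in as airflow_get keeps the tenant check and the selection logic testable without a running Airflow instance; a real service would also have to map the raised errors to HTTP status codes.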