Enhanced API Specs for Workflow Service
This ADR is version 1 of the open specification for Workflow Service R3. The proposed changes are required for CSV Ingestor Horizon 1:
- osdu/platform/data-flow/ingestion/csv-parser/csv-parser#5 (closed)
- osdu/platform/data-flow/ingestion/csv-parser/csv-parser#4
Story related to utilization of the changes made by proposal:
At a high level, the proposal covers:
- More aligned entity definitions for Ingestion Workflow Service R2. The proposal is to define two entities, workflow and workflow run, and APIs defining CRUD operations against those entities.
- Introduction of one new API, POST /workflow, to register a DAG with standard operators.
This is the first set of APIs, and we agree that many more are needed to support future requirements such as composability and reusability of workflow components. Those requirements are out of scope for this ADR and will be covered by subsequent ADRs.
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
In OSDU R2, the Ingestion Workflow Service (https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow) handles management operations specific to the orchestration tool (Airflow).
The API specs are tightly coupled to Airflow's way of doing things. We need more generic entity definitions: workflow and workflow run. Also, the R2 version cannot register a DAG with standard operators.
Decision
Align on the first set of OpenAPI specs for Workflow Service.
Considering the requirements for the OSDU R3 Ingestion framework, we propose to introduce four APIs.
The functionality of three of the four proposed APIs already exists in OSDU R2. We intend to keep that functionality intact, realign those APIs with the Workflow and WorkflowRun entities, and fully decouple them from the Airflow framework. We propose to continue supporting the OSDU R2 APIs. The fourth, new API lets a workflow provider register a workflow (DAG) with standard Airflow operators.
The new version supports the following workflow operations:
CRUD operations for Workflow:
- Creation/registration of a workflow. (new)
CRUD operations for Workflow Run:
- Triggering a workflow. (similar functionality exists in R2) The R2 API has a very limited scope: for each combination of user, data, and workflow type, it identifies a suitable DAG and then calls Airflow. The proposed API is more generic: the workflowID to be triggered is passed explicitly.
- Querying details of a workflow run. (similar functionality exists in R2)
- Updating attributes of a workflow run. (similar functionality exists in R2)
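The four operations listed above can be sketched as a toy in-memory service. This is only an illustration under stated assumptions: the class, method, and field names (`InMemoryWorkflowService`, `dag_content`, `status`, and so on) are hypothetical and are not taken from the OpenAPI specs, and no orchestrator is actually called.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Workflow:
    # Hypothetical fields; the ADR does not define the schema.
    workflow_id: str
    name: str
    dag_content: str  # DAG definition built from standard operators


@dataclass
class WorkflowRun:
    # Hypothetical fields; the ADR does not define the schema.
    run_id: str
    workflow_id: str
    status: str = "submitted"
    params: Dict[str, str] = field(default_factory=dict)


class InMemoryWorkflowService:
    """Toy stand-in for the service; it stores entities in dictionaries."""

    def __init__(self) -> None:
        self._workflows: Dict[str, Workflow] = {}
        self._runs: Dict[str, WorkflowRun] = {}

    def register_workflow(self, name: str, dag_content: str) -> Workflow:
        # Corresponds to the new "create workflow" operation.
        workflow = Workflow(str(uuid.uuid4()), name, dag_content)
        self._workflows[workflow.workflow_id] = workflow
        return workflow

    def trigger_workflow(
        self, workflow_id: str, params: Optional[Dict[str, str]] = None
    ) -> WorkflowRun:
        # The caller names the workflow explicitly; nothing is inferred
        # from user, data, or workflow type.
        if workflow_id not in self._workflows:
            raise KeyError(f"unknown workflow: {workflow_id}")
        run = WorkflowRun(str(uuid.uuid4()), workflow_id, params=dict(params or {}))
        self._runs[run.run_id] = run
        return run

    def get_workflow_run(self, run_id: str) -> WorkflowRun:
        # Querying details of a workflow run.
        return self._runs[run_id]

    def update_workflow_run(self, run_id: str, status: str) -> WorkflowRun:
        # Updating an attribute (here only the status) of a workflow run.
        run = self._runs[run_id]
        run.status = status
        return run
```

A caller would register a workflow once, then trigger it by the returned `workflow_id` and poll or update the resulting run by its `run_id`.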
Rationale
Workflow Service is a wrapper around the orchestrator. We need to define entities that are not specific to any one orchestrator. We also add one API (create workflow) to support CSV Ingestor Horizon 1.
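One way to picture the wrapper idea is a small interface that the service depends on, with the orchestrator hidden behind it. The sketch below is an assumption for illustration only: `Orchestrator`, `RecordingOrchestrator`, and `trigger` are hypothetical names, and the recording class is a test double, not a real Airflow client.

```python
from typing import Dict, List, Protocol, Tuple


class Orchestrator(Protocol):
    """Hypothetical interface the service would depend on."""

    def submit(self, workflow_name: str, run_id: str, params: Dict[str, str]) -> None:
        ...


class RecordingOrchestrator:
    """Test double that records submissions instead of calling a real engine."""

    def __init__(self, engine: str) -> None:
        self.engine = engine
        self.submitted: List[Tuple[str, str]] = []

    def submit(self, workflow_name: str, run_id: str, params: Dict[str, str]) -> None:
        self.submitted.append((workflow_name, run_id))


def trigger(orchestrator: Orchestrator, workflow_name: str, run_id: str) -> str:
    # Service-side code sees only the interface, never the engine behind it.
    orchestrator.submit(workflow_name, run_id, {})
    return run_id
```

Because `trigger` accepts anything that satisfies the interface, swapping the engine would not change the entity definitions or the calling code.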
Consequences
High-level alignment on the workflow and workflowRun entities enables us to incrementally add many more future requirements for workflows and workflow runs. Also, the newly proposed API lets a workflow provider register a workflow (DAG) with standard Airflow operators.
Tradeoff Analysis - Input to decision
The newly proposed version will fully incorporate the following:
- Ability to register a workflow.
- Streamlined entity definitions.
This version of the service will have no negative impact compared to the current version of Workflow Service in terms of:
- Performance
- Scalability
- Reliability
Decision criteria and tradeoffs
- Usability of the APIs
- Extensibility of the APIs
- Cost of Implementation