Azure Packaged Dag Support

harshit aggarwal requested to merge haaggarw/enable_packaged_dag into master

The changes made in this MR are in line with this approved ADR

As per the ADR, we plan to maintain the following structure for DAG repos:

├── osdu_dag
│   ├── __init__.py
│   ├── custom_lib
│   │   ├── __init__.py
│   │   └── utils.py
│   └── operators
│       ├── __init__.py
│       └── customOperator1.py
└── test_dag.py

In the CSV parser repo, this would look something like:

└── airflowdags
    ├── osdu_csv_parser
    │   ├── __init__.py
    │   └── xyz.py
    └── csv_ingestion_all_steps.py

Note: the osdu_dag folder corresponds to the osdu_csv_parser folder, and test_dag.py corresponds to the actual DAG file, csv_ingestion_all_steps.py, in the CSV parser repo.

The contents of the airflowdags folder will be packaged (zipped) and uploaded to Airflow.
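The packaging step could be sketched roughly as follows. This is only an illustration, not the pipeline code from this MR: the function name package_dags and the output file name are hypothetical. The one thing it tries to show is that archive paths are written relative to the airflowdags folder, so the DAG file ends up at the root of the zip.

```python
import zipfile
from pathlib import Path
from typing import List


def package_dags(source_dir: str, zip_path: str) -> List[str]:
    """Zip every file under source_dir, using paths relative to source_dir.

    Hypothetical helper for illustration; the real pipeline may differ.
    Returns the archive member names that were written.
    """
    source = Path(source_dir)
    target = Path(zip_path).resolve()
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            relative = path.relative_to(source)
            # Skip directories, bytecode caches, and the archive itself
            # (in case it is created inside source_dir).
            if not path.is_file():
                continue
            if "__pycache__" in relative.parts or path.resolve() == target:
                continue
            arcname = relative.as_posix()
            archive.write(path, arcname)
            written.append(arcname)
    return written
```

For example, package_dags("airflowdags", "csv_parser_dags.zip") would place csv_ingestion_all_steps.py at the top level of the archive and the osdu_csv_parser files under osdu_csv_parser/.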

Here csv_ingestion_all_steps.py is the actual DAG file (it has to be kept at the root of the folder), while osdu_csv_parser will hold any supporting Python files if required; for now it contains only the __init__.py file.
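To illustrate why this layout works, here is a small standard-library sketch, not Airflow itself. It builds a zip with the same shape and imports the root-level module from it, which in turn imports from the sibling package. The helper dag_id and the constant DAG_ID are made up for the demo, and the assumption is that Airflow resolves imports inside a packaged DAG in a comparable way.

```python
import importlib
import sys
import zipfile
from pathlib import Path


def build_demo_zip(zip_path: Path) -> None:
    """Write a tiny archive that mirrors the proposed layout (demo content only)."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("osdu_csv_parser/__init__.py", "")
        # Hypothetical helper living inside the package folder.
        archive.writestr(
            "osdu_csv_parser/xyz.py",
            "def dag_id():\n    return 'csv_ingestion'\n",
        )
        # Root-level file standing in for the DAG file.
        archive.writestr(
            "csv_ingestion_all_steps.py",
            "from osdu_csv_parser.xyz import dag_id\n\nDAG_ID = dag_id()\n",
        )


def load_root_module(zip_path: Path, module_name: str):
    """Import a top-level module from the zip by putting the zip on sys.path."""
    entry = str(zip_path)
    sys.path.insert(0, entry)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(entry)
```

Because the zip itself is the import root, a module at its top level can use a plain `from osdu_csv_parser.xyz import ...` without any path manipulation inside the DAG file.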

The existing folders dags (airflowdags/dags) and plugins (airflow/plugins) are kept as is, because changing them might cause pipeline failures for other CSPs.

We plan to keep the duplicated DAG file for now. Once this MR is merged, we request that the CSPs start adhering to the proposed structure. Once the new structure is fully supported by all CSPs, we can delete the dags and plugins folders to remove the duplicated files.

cc: @kibattul

Edited by harshit aggarwal