Ability to replace surrogate-key ids before storing a resource to Storage
The Airflow DAG will be able to replace a resource “Id” value given in surrogate-key format with a system-generated “Id” during ingestion.
Some details on the logic:
Master and Reference data: the “id” field is replaced in the corresponding schema
Work Product (WP) ingestion:
The Dataset should be stored and its system-generated id obtained. The DAG should then replace:
The Dataset id in the “Dataset” array in the Manifest schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/Manifest.1.0.0.json)
The id values in the “Datasets” array in the Work Product Component (WPC) schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/GenericWorkProductComponent.1.0.0.json)
The WPC should be stored and its system-generated id obtained. The DAG should then replace:
The “Id” value in the GenericWorkProductComponent schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/GenericWorkProductComponent.1.0.0.json)
The WPC id in the “Components” array in the GenericWorkProduct schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/GenericWorkProduct.1.0.0.json)
The Artefact should be stored, and its “Id” value should be replaced in the “ResourceId” property in the “Artefacts” array in the GenericWorkProductComponent schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/GenericWorkProductComponent.1.0.0.json)
The WP should be stored, and its “Id” value should be replaced in the “Id” property in the GenericWorkProduct schema (https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Authoring/manifest/GenericWorkProduct.1.0.0.json)
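
The replacement order described above could be sketched roughly as follows. This is only an illustration under several assumptions, not the DAG's actual implementation: the manifest is modelled as a simplified dict with hypothetical top-level keys "Datasets", "WorkProductComponents" and "WorkProduct"; `store` is a hypothetical callable standing in for the Storage call and returning the system-generated id; and resources referenced from “Artefacts” are assumed to be stored together with the Datasets, before the WPC that refers to them.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the Storage call: takes a record, returns its system-generated id.
StoreFn = Callable[[Dict[str, Any]], str]


def replace_known_keys(value: Any, id_map: Dict[str, str]) -> Any:
    """Recursively replace every string that equals an already-mapped surrogate key."""
    if isinstance(value, dict):
        return {k: replace_known_keys(v, id_map) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_known_keys(v, id_map) for v in value]
    if isinstance(value, str):
        return id_map.get(value, value)
    return value


def store_record(record: Dict[str, Any], id_map: Dict[str, str], store: StoreFn) -> Dict[str, Any]:
    """Resolve references to earlier records, store the record, remember its new id."""
    resolved = replace_known_keys(record, id_map)
    surrogate_id = resolved["Id"]
    system_id = store(resolved)
    id_map[surrogate_id] = system_id  # later records can now resolve this surrogate key
    return {**resolved, "Id": system_id}


def ingest_work_product(manifest: Dict[str, Any], store: StoreFn) -> Dict[str, Any]:
    """Store Datasets first, then WPCs, then the WP, so each reference is resolvable."""
    id_map: Dict[str, str] = {}
    datasets = [store_record(d, id_map, store) for d in manifest["Datasets"]]
    wpcs = [store_record(c, id_map, store) for c in manifest["WorkProductComponents"]]
    wp = store_record(manifest["WorkProduct"], id_map, store)
    return {"Datasets": datasets, "WorkProductComponents": wpcs, "WorkProduct": wp}
```

Because each stored record adds its surrogate key to `id_map` before the next level is processed, the “Datasets”, “Artefacts” / “ResourceId” and “Components” references are all rewritten by the same recursive pass, without per-field handling. A real DAG would likely need per-schema handling and error cases (for example, a reference to a surrogate key that was never stored), which this sketch leaves out.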