Ingestion workflow - provide an option to enable or disable validation checks (referenced information)
Starting with a recent version of "Ingestion Workflow", we see that integrity checks are enabled. This is very useful.
However, please give the end user (Data Loader) an option to set "integrity checks" to true or false, since that user may not have the skills required to tweak the Python code inside the DAG. The default can be left as "True".
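To make the request concrete, here is a minimal sketch of how such a switch could be read from a per-run configuration dictionary (for example the Airflow `dag_run.conf` passed when triggering the DAG). The key name `validate_referential_integrity` and the helper `integrity_checks_enabled` are hypothetical and are not taken from the current DAG code; the only behaviour carried over from the request is that the default stays "True".

```python
from typing import Any, Mapping, Optional

# Hypothetical key name (an assumption for illustration, not the DAG's real setting).
INTEGRITY_FLAG_KEY = "validate_referential_integrity"

_TRUE_VALUES = {"true", "1", "yes", "on"}
_FALSE_VALUES = {"false", "0", "no", "off"}


def integrity_checks_enabled(run_conf: Optional[Mapping[str, Any]],
                             default: bool = True) -> bool:
    """Return whether integrity checks should run for this workflow run.

    Accepts real booleans as well as common string spellings, so a Data
    Loader can pass "false" in a JSON payload without editing Python code.
    """
    if not run_conf or INTEGRITY_FLAG_KEY not in run_conf:
        return default  # option not supplied: keep checks enabled
    value = run_conf[INTEGRITY_FLAG_KEY]
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in _TRUE_VALUES:
            return True
        if text in _FALSE_VALUES:
            return False
    raise ValueError(
        f"Unsupported value for {INTEGRITY_FLAG_KEY!r}: {value!r}"
    )
```

Under this assumption, a task could call `integrity_checks_enabled(run_conf)` and skip the integrity step when it returns False, while any run that does not mention the option behaves exactly as it does today.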
An excerpt from the Airflow log (GCP environment) is shown below for quick reference:
[2021-03-18 17:45:08,504] {base_task_runner.py:113} INFO - Job 27967: Subtask provide_manifest_integrity_task [2021-03-18 17:45:08,503]
{validate_referential_integrity.py:156} DEBUG - Extracted reference ids:
['osdu:reference-data--AliasNameType:WELL_NAME',
'osdu:reference-data--VerticalMeasurementPath:DEPTH_DATUM_ELEV',
'osdu:reference-data--ResourceSecurityClassification:Public',
'osdu:reference-data--FacilityEventType:SPUD_DATE',
'osdu:reference-data--FacilityType:WELLBLABLA',
'osdu:master-data--Organisation:HESS']
In this example, all checks failed because the environment lacked the standard Reference values at the time of this run. Otherwise, I would expect only one reference check to fail (FacilityType = "WELLBLABLA" instead of "WELL").
[2021-03-18 17:45:44,405] {base_task_runner.py:113} INFO - Job 27967: Subtask provide_manifest_integrity_task [2021-03-18 17:45:44,405]
{validate_referential_integrity.py:177} WARNING - The next ids are absent in the system:
['osdu:reference-data--FacilityType:WELLBLABLA',
'osdu:reference-data--FacilityEventType:SPUD_DATE',
'osdu:reference-data--ResourceSecurityClassification:Public',
'osdu:reference-data--VerticalMeasurementPath:DEPTH_DATUM_ELEV',
'osdu:master-data--Organisation:HESS',
'osdu:reference-data--AliasNameType:WELL_NAME']
[2021-03-18 17:45:44,413] {base_task_runner.py:113} INFO - Job 27967: Subtask provide_manifest_integrity_task [2021-03-18 17:45:44,411]
{validate_referential_integrity.py:231} WARNING - Resource with kind odesprod:wks:master-data--Well:1.0.0 was rejected
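For readers unfamiliar with the behaviour visible in the log, the following is a rough sketch of the pattern it suggests: collect the ids a record refers to, compare them with the ids known to the system, and report the absent ones. This is not the code in validate_referential_integrity.py; the id pattern, the function names and the sample record are assumptions made for illustration only.

```python
import re
from typing import Any, Iterable, Set

# Assumed id shape: "<partition>:<group>--<Type>:<value>", as seen in the log.
ID_PATTERN = re.compile(
    r"^[\w.\-]+:(reference-data|master-data)--[\w.\-]+:[\w.\-]+$"
)


def extract_reference_ids(node: Any) -> Set[str]:
    """Recursively collect every string in a nested record that looks like an id."""
    if isinstance(node, str):
        return {node} if ID_PATTERN.match(node) else set()
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, (list, tuple)):
        children = node
    else:
        return set()
    found: Set[str] = set()
    for child in children:
        found |= extract_reference_ids(child)
    return found


def find_absent_ids(record: Any, known_ids: Iterable[str]) -> Set[str]:
    """Return the referenced ids that are not present in known_ids."""
    return extract_reference_ids(record) - set(known_ids)


# Illustrative record (field layout is an assumption, values come from the log).
sample_well = {
    "kind": "odesprod:wks:master-data--Well:1.0.0",
    "data": {
        "FacilityTypeID": "osdu:reference-data--FacilityType:WELLBLABLA",
        "FacilityEvents": [
            {"FacilityEventTypeID": "osdu:reference-data--FacilityEventType:SPUD_DATE"}
        ],
    },
}
```

With a populated environment, where only the misspelled FacilityType is unknown, `find_absent_ids(sample_well, known_ids)` would contain a single id, which matches the expectation stated above that only one reference check should fail.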