[GCP Airflow] No Error Logging Recorded for Missing Reference Record During Manifest Ingestion
Manifest ingestion does not write any ERROR entries to the Airflow log when a record in the JSON manifest references reference or master data that does not exist.
Steps to reproduce:
a) Using the manifest ingestion DAG, load a master-data Wellbore record. Note that data.WellID is set to a Well that does not exist in the current database: "WellID": "{{data-partition-id}}:master-data--Well:TEST_ERROR:"
BODY:
{
  "executionContext": {
    "Payload": {
      "AppKey": "test-app",
      "data-partition-id": "{{data-partition-id}}"
    },
    "manifest": {
      "kind": "{{data-partition-id}}:wks:Manifest:1.0.0",
      "MasterData": [
        {
          "id": "{{data-partition-id}}:master-data--Wellbore:Test_NN_2021_09_24_01",
          "kind": "{{data-partition-id}}:wks:master-data--Wellbore:1.0.0",
          "acl": {
            "owners": [
              "data.default.owners@{{data-partition-id}}.osdu-gcp.go3-nrg.projects.epam.com"
            ],
            "viewers": [
              "data.default.viewers@{{data-partition-id}}.osdu-gcp.go3-nrg.projects.epam.com"
            ]
          },
          "legal": {
            "legaltags": [
              "{{data-partition-id}}-demo-legaltag"
            ],
            "otherRelevantDataCountries": [
              "US"
            ]
          },
          "data": {
            "WellID": "{{data-partition-id}}:master-data--Well:TEST_ERROR:",
            "FacilityName": "TEST_NN_1_ALIAS",
            "SequenceNumber": 1,
            "Source": "TEST_NN_1_ALIAS_SOURCE",
            "NameAliases": [
              {
                "AliasName": "TEST_NN_1_ALIAS"
              }
            ]
          }
        }
      ]
    }
  }
}
b) Run the manifest ingestion DAG:
POST https://{{WORKFLOW_HOST}}/workflow/Osdu_ingest/workflowRun
Response:
{
  "workflowId": "ef82cba0-0e45-4df3-91bf-4df1553102d3",
  "runId": "5a786c6f-103e-44d3-b192-d34e3026b722",
  "startTimeStamp": 1632812342734,
  "status": "submitted",
  "submittedBy": "preshipping_test_user@osdu-gcp.go3-nrg.projects.epam.com"
}
c) Observe the Airflow log. No stage of the log contains an ERROR entry, even though the DAG run ultimately fails and no new record is stored. The only trace is DEBUG logging, which indicates that some kind of record check took place:
[2021-09-28 06:59:54,321] {search_record_ids.py:78} DEBUG - Search query "odesprod:master-data--Well:TEST_ERROR"
[2021-09-28 06:59:54,365] {connectionpool.py:939} DEBUG - Starting new HTTPS connection (1): preship-asm.osdu-gcp.go3-nrg.projects.epam.com:443
[2021-09-28 06:59:56,781] {connectionpool.py:433} DEBUG - https://preship-asm.osdu-gcp.go3-nrg.projects.epam.com:443 "POST /api/search/v2/query HTTP/1.1" 200 None
[2021-09-28 06:59:56,785] {search_record_ids.py:183} DEBUG - {"results":[],"aggregations":[],"totalCount":0}
[2021-09-28 06:59:56,785] {search_record_ids.py:188} DEBUG - Got total count 0
[2021-09-28 06:59:56,786] {search_record_ids.py:169} DEBUG - response ids: []
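To confirm the absence of ERROR entries without reading the whole log by eye, the log lines can be tallied by level. The following is only a minimal sketch: the function name count_levels and the regular expression are illustrative assumptions based on the line format shown above, not part of Airflow or the ingestion DAG.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt above:
# [timestamp] {file.py:line} LEVEL - message
LOG_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\] \{(?P<src>[^}]+)\} (?P<level>[A-Z]+) - (?P<msg>.*)"
)


def count_levels(log_text):
    """Return a Counter mapping log level -> number of matching lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("level")] += 1
    return counts


# Two lines copied from the log excerpt in this report.
sample = """\
[2021-09-28 06:59:56,785] {search_record_ids.py:188} DEBUG - Got total count 0
[2021-09-28 06:59:56,786] {search_record_ids.py:169} DEBUG - response ids: []
"""

levels = count_levels(sample)
print("DEBUG:", levels["DEBUG"], "ERROR:", levels["ERROR"])
```

Running this over the full task log for the failing run should show a non-zero DEBUG count and an ERROR count of zero, which is the behaviour reported here.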
EXPECTATION:
If the record is not stored because a referenced record does not exist in the database, ERROR-level logging should appear in the Airflow log.
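As an illustration of the expected behaviour, the check could look roughly like the sketch below. This is not the actual code in search_record_ids.py: the logger name, the function report_missing_references, and the message text are all hypothetical. It only shows the idea of emitting one ERROR entry per referenced ID that the search did not return.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("manifest_ingestion_sketch")


def report_missing_references(referenced_ids, found_ids, log=logger):
    """Log one ERROR per referenced ID that is absent from found_ids.

    Returns the list of missing IDs so the caller can decide how to proceed.
    """
    found = set(found_ids)
    missing = [rid for rid in referenced_ids if rid not in found]
    for rid in missing:
        log.error("Referenced record not found: %s", rid)
    return missing


logging.basicConfig(level=logging.DEBUG)

# Values mirror the log excerpt in this report: the search returned no IDs.
missing = report_missing_references(
    ["odesprod:master-data--Well:TEST_ERROR"],
    [],
)
```

With a check of this kind, the failing run would leave at least one ERROR line naming the missing Well ID, instead of only the DEBUG line "response ids: []".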
(TESTED ON R3M8 Preship GCP environment on 27 September 2021)