Azure M25: EDS Ingestion Failure: Exception thrown while validating ExternalReferenceValueMapping
The eds_ingest workflow was triggered with the following payload:
{
  "executionContext": {
    "connectedSourceDataJobId": "opendes:master-data--ConnectedSourceDataJob:arpit_singh_testing_Katalyst_master"
  }
}
This is an OSDU wrapper for the Katalyst iGlass source.
dag_run_id=56d48fb1-ca3d-4dd9-85f0-203371ae3f67
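This run ingests a single master-data record. The ingestion crashes with an unhandled pydantic ValidationError raised while validating ExternalReferenceValueMapping: data.SimpleMap.ReferenceValueID is None where a string is expected. The task then fails with a secondary TypeError in the scheduler DAG, because the returned status is None.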
This record should not require any reference value mapping, because all of the reference-data records it references already exist in the Azure M25 environment.
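One possible way to tolerate this case, sketched under assumptions: treat a mapping record whose ReferenceValueID is missing or None as "no mapping" and keep the source reference value unchanged. The nested shape data -> SimpleMap -> ReferenceValueID is inferred from the validation error path. The function names below are hypothetical and are not part of osdu_airflow.

```python
from typing import Any, Optional


def usable_reference_value_id(mapping_record: Any) -> Optional[str]:
    """Return data.SimpleMap.ReferenceValueID if it is a non-empty string, else None.

    Hypothetical helper: the record shape is assumed from the pydantic
    error location "data.SimpleMap.ReferenceValueID".
    """
    if not isinstance(mapping_record, dict):
        return None
    data = mapping_record.get("data")
    if not isinstance(data, dict):
        return None
    simple_map = data.get("SimpleMap")
    if not isinstance(simple_map, dict):
        return None
    value = simple_map.get("ReferenceValueID")
    # A None or empty value means the mapping cannot be applied.
    return value if isinstance(value, str) and value else None


def resolve_reference(source_ref_value: str, mapping_record: Any) -> str:
    """Use the mapped ID when one is usable, otherwise keep the source value."""
    mapped = usable_reference_value_id(mapping_record)
    return mapped if mapped is not None else source_ref_value


if __name__ == "__main__":
    # Placeholder IDs, not real records from the environment.
    source = "partition:reference-data--Example:SourceValue:"
    broken = {"data": {"SimpleMap": {"ReferenceValueID": None}}}
    print(resolve_reference(source, broken))
```

With a guard of this kind, a mapping record that carries a null ReferenceValueID would be skipped instead of aborting the whole ingestion. Whether skipping is the right behaviour, as opposed to failing with a clearer message, is a decision for the EDS maintainers.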
Relevant stack trace from the logs:
[2025-04-11, 18:32:31 UTC] {fetch_reference_mapping.py:56} INFO - In Fetch ExternalReferenceValueMapping --End
[2025-04-11, 18:32:31 UTC] {reference_mapping.py:47} INFO - Reference Data Mapping - start
[2025-04-11, 18:32:31 UTC] {reference_mapping.py:108} INFO - Reference Data Mapping Search - Start
[2025-04-11, 18:32:31 UTC] {airflow_logger.py:76} INFO - search_response_text_type <class 'dict'>
[2025-04-11, 18:32:31 UTC] {fetch_reference_mapping.py:56} INFO - In Fetch ExternalReferenceValueMapping --End
[2025-04-11, 18:32:31 UTC] {src_dags_fetch_and_ingest.py:216} ERROR - Unexpected error in main file: 1 validation error for ExternalReferenceValueMapping
data.SimpleMap.ReferenceValueID
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.7/v/string_type
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - Traceback (most recent call last):
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/src_dags_fetch_and_ingest.py", line 115, in fetch_and_ingest
manifest: ManifestRequest = data_processor.apply(
^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/data_processor.py", line 53, in apply
processed_data_records = self.clean_records.data_processing(records, csre, csdj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/implementation/clean_records.py", line 100, in data_processing
clean_record = self._iterate_record(
^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/implementation/clean_records.py", line 172, in _iterate_record
processed_data_record[key] = self._iterate_record(
^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/implementation/clean_records.py", line 208, in _iterate_record
replaced_data_partition_value = self.reference_mapping.identify_and_mapping_required_reference_values(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/implementation/reference_mapping.py", line 84, in identify_and_mapping_required_reference_values
replaced_operator_ref_id: str = self._reference_mapping_search_ref_values(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_processor/implementation/reference_mapping.py", line 114, in _reference_mapping_search_ref_values
self.data_fetcher.get_reference_mapping(csre_id, source_ref_value)
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/osdu_airflow/eds/eds_ingest/data_fetcher/data_fetcher.py", line 233, in get_reference_mapping
ExternalReferenceValueMapping(**reference_mapping)
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - File "/home/airflow/.local/lib/python3.11/site-packages/pydantic/main.py", line 176, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
[2025-04-11, 18:32:31 UTC] {logging_mixin.py:190} WARNING - pydantic_core._pydantic_core.ValidationError: 1 validation error for ExternalReferenceValueMapping
data.SimpleMap.ReferenceValueID
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.7/v/string_type
[2025-04-11, 18:32:31 UTC] {taskinstance.py:3313} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 768, in _execute_task
result = _execute_callable(context=context, **execute_callable_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 734, in _execute_callable
return ExecutionCallableRunner(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/operator_helpers.py", line 252, in run
return self.func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 424, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 238, in execute
return_value = self.execute_callable()
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 256, in execute_callable
return runner.run(*self.op_args, **self.op_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/operator_helpers.py", line 252, in run
return self.func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/airflow/dags/eds_ingestion_dags.zip/src_dags_fetch_ingest_scheduler_dag.py", line 27, in _ingest
if Constant.MESSAGE in status:
^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable
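The final TypeError looks like a secondary symptom: the check `Constant.MESSAGE in status` fails because status is None, presumably because the earlier error was logged and nothing was returned. A minimal sketch of a None-safe check follows. The helper name and the "message" key are assumptions for illustration, since the actual value of Constant.MESSAGE is not shown in the logs.

```python
from typing import Any

# Hypothetical stand-in for Constant.MESSAGE; the real value is not in the logs.
MESSAGE_KEY = "message"


def status_has_message(status: Any, key: str = MESSAGE_KEY) -> bool:
    """Return True only if status is a container that holds the key.

    A None status (or any object that does not support the `in` operator)
    is treated as "no message" instead of raising TypeError.
    """
    if status is None:
        return False
    try:
        return key in status
    except TypeError:
        # status does not support membership tests (for example, an int).
        return False


if __name__ == "__main__":
    print(status_has_message(None))
    print(status_has_message({"message": "ingestion failed"}))
```

A guard like this would only keep the original ValidationError from being masked by the TypeError. It does not address the underlying validation failure.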