Azure M18 - Osdu_ingest Airflow DAG Unable to init whitelist reference patterns
The issue was found while testing M18 on Azure, but it is probably not specific to Azure M18.
I have noticed that the Osdu_ingest DAG generates errors during the "provide_manifest_integrity_task" step. The error does not seem to be caused by an invalid manifest but rather by an issue in the DAG's Python code.
Here is the error message:
[2023-06-13, 19:11:23 UTC] {manifest_analyzer.py:108} ERROR - Unable to init whitelist reference patterns: ['(?P<key>\\"dataset--ConnectedSource.Generic\\":)\\s?\\[?\\s*\\"(?P<value>[\\s\\w\\dataset--ConnectedSource.Generic:-]*:)\\"\\s*\\]?']
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/osdu_ingestion/libs/manifest_analyzer.py", line 103, in _compile_whitelist_ref_patterns
    return [
  File "/home/airflow/.local/lib/python3.8/site-packages/osdu_ingestion/libs/manifest_analyzer.py", line 104, in <listcomp>
    re.compile(r"{}".format(pattern), re.I + re.M)
  File "/usr/local/lib/python3.8/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/local/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/local/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/local/lib/python3.8/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/local/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/local/lib/python3.8/sre_parse.py", line 834, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "/usr/local/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/local/lib/python3.8/sre_parse.py", line 598, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range t-- at position 79
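The failure can be reproduced outside Airflow with the pattern from the log. The sketch below is only a minimal reproduction, not the project's code. It suggests that the name "dataset--ConnectedSource.Generic" ends up unescaped inside the character class, so `\d` is read as a digit class and `t--` as a range from `t` to `-`, which is invalid. The second half shows one possible way to avoid this with `re.escape`. The names `safe_pattern` and `sample` are hypothetical, and the rebuilt pattern is an assumption about the intended behavior, not the actual fix.

```python
import re

# Pattern copied from the log above. The log prints it with repr(), so every
# backslash appears doubled there; the raw strings below hold the real pattern.
pattern = (
    r'(?P<key>\"dataset--ConnectedSource.Generic\":)\s?\[?\s*'
    r'\"(?P<value>[\s\w\dataset--ConnectedSource.Generic:-]*:)\"\s*\]?'
)

try:
    re.compile(pattern, re.I | re.M)
    error_message = None
except re.error as exc:
    # Inside [...], "\d" is parsed as the digit class and "t--" as a range
    # from "t" down to "-", which is rejected as a bad character range.
    error_message = str(exc)

print(error_message)

# Hypothetical workaround (an assumption, not the project's fix): escape the
# name before placing it in the pattern so "-" and "." are treated literally.
name = "dataset--ConnectedSource.Generic"
escaped = re.escape(name)
safe_pattern = (
    r'(?P<key>"' + escaped + r'":)\s?\[?\s*'
    r'"(?P<value>[\s\w' + escaped + r':-]*:)"\s*\]?'
)
compiled = re.compile(safe_pattern, re.I | re.M)

# Hypothetical manifest fragment, used only to exercise the pattern.
sample = '"dataset--ConnectedSource.Generic": "osdu:dataset--ConnectedSource.Generic:abc:"'
match = compiled.search(sample)
print(match.group("value") if match else None)
```

If this reading is right, the same error would occur on any platform for this pattern, which is consistent with the issue not being Azure specific.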
Here is an example of a job with this error.
Edited by Fabien Bosquet