# WITSML Parser issues
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/issues
2023-04-04T10:49:00Z

## Refactor DAG related code
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/issues/64
2023-04-04T10:49:00Z, Yan Sushchynski (EPAM)

### Introduction
There is DAG-related code that is executed in the container during a DAG run. The code is [here](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/energistics/src/witsml_parser/main.py) and [here](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/energistics/src/witsml_parser/energistics/libs/create_energistics_manifest.py). This code is messy and outdated and needs refactoring.
### What should be done?
1. Update the code to make it work with the most recent `osdu-*` Python libs. The dependencies are listed here: https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/build/requirements.txt
2. Remove the deprecated functionality for processing files by `preload_file_path` [here](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/energistics/src/witsml_parser/energistics/libs/create_energistics_manifest.py#L314).
3. Add a static-analysis step to the CI/CD pipeline.
4. Add the ability to pass the user's access/ID token to the DAG.
5. General refactoring, because the code is messy now (many `if` branches and too many lines of code in a single function).

M17 - Release 0.20
Vadzim Kulyba, harshit aggarwal, Walter D, etienne peysson, Marc Burnie [AWS]

## WITSML Parser - SchemaFormatType needs to be updated
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/issues/62
2023-05-02T20:56:59Z, Chad Leong

The reference data for Energistics SchemaFormatType has been updated in the data definition https://community.opengroup.org/osdu/data/data-definitions/-/blob/master/ReferenceValues/Manifests/reference-data/OPEN/SchemaFormatType.1.0.0.json to reflect the different WITSML versions.
Problem:
The WITSML parser creates a manifest after parsing using the hardcoded value:
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/energistics/src/witsml_parser/energistics/libs/energistics_parsers/parser.py#L446
It needs to be updated to reflect the changes in the data definition.
`osdu:reference-data--SchemaFormatType:EnergisticsWITSML`
to
`osdu:reference-data--SchemaFormatType:Energistics.WITSML.v1.4`

## WITSML Parser - Well Trajectory Failure
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/issues/61
2022-08-31T11:19:18Z, Vadzim Kulyba

```
[2022-08-29, 20:47:17 UTC] {validate_schema.py:322} ERROR - Schema validation error. Data field.
[2022-08-29, 20:47:17 UTC] {validate_schema.py:323} ERROR - Manifest kind: opendes:wks:work-product-component--WellboreTrajectory:1.1.0
[2022-08-29, 20:47:17 UTC] {validate_schema.py:324} ERROR - Error: 'Azi' does not match '^[\\w\\-\\.]+:reference-data\\-\\-TrajectoryStationPropertyType:[\\w\\-\\.\\:\\%]+:[0-9]*$'
Failed validating 'pattern' in schema['properties']['data']['allOf'][3]['properties']['AvailableTrajectoryStationProperties']['items']['properties']['TrajectoryStationPropertyTypeID']:
{'description': 'The reference to a trajectory station property type - '
'of if interpreted as channels, the curve or channel '
'name type, identifying e.g. MD, Inclination, Azimuth. '
'This is a relationship to a '
'reference-data--TrajectoryStationPropertyType record '
'id.',
'example': 'partition-id:reference-data--TrajectoryStationPropertyType:AzimuthTN:',
'pattern': '^[\\w\\-\\.]+:reference-data\\-\\-TrajectoryStationPropertyType:[\\w\\-\\.\\:\\%]+:[0-9]*$',
'title': 'Trajectory Station Property Type ID',
'type': 'string',
'x-osdu-relationship': [{'EntityType': 'TrajectoryStationPropertyType',
'GroupType': 'reference-data'}]}
On instance['data']['AvailableTrajectoryStationProperties'][0]['TrajectoryStationPropertyTypeID']:
'Azi'
```
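The validation failure in the log can be reproduced with the schema pattern alone. The following is a minimal sketch, not the parser's or the validator's actual code: the pattern is copied from the log (with the doubled backslashes collapsed into a raw string), while `is_valid_property_type_id` and `to_property_type_id` are hypothetical helper names introduced only for illustration. The partition `opendes` is taken from the manifest kind in the log, and `AzimuthTN` comes from the schema's own example, not from a confirmed mapping for `Azi`.

```python
import re

# Pattern copied from the schema shown in the error log.
PROPERTY_TYPE_ID = re.compile(
    r"^[\w\-\.]+:reference-data\-\-TrajectoryStationPropertyType:[\w\-\.\:\%]+:[0-9]*$"
)


def is_valid_property_type_id(value: str) -> bool:
    """Return True if value satisfies the TrajectoryStationPropertyTypeID pattern."""
    return PROPERTY_TYPE_ID.match(value) is not None


def to_property_type_id(partition: str, code: str) -> str:
    """Hypothetical helper: wrap a bare code in the record-id layout the schema expects."""
    return f"{partition}:reference-data--TrajectoryStationPropertyType:{code}:"


# The bare tag name from the failing manifest is rejected.
print(is_valid_property_type_id("Azi"))  # False

# The schema's own example value is accepted.
print(is_valid_property_type_id(
    "partition-id:reference-data--TrajectoryStationPropertyType:AzimuthTN:"
))  # True

# A fully qualified id built from a partition and a reference code also passes.
print(is_valid_property_type_id(to_property_type_id("opendes", "AzimuthTN")))  # True
```

A check like this could run in a unit test for the trajectory parser so that a bare tag name is caught before the manifest reaches the schema validation task.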
This error log is from the Azure DEMO `validate_manifest_schema_task`, but it is a common-code issue, because it also reproduces on GCP (cc @Yan_Sushchynski).
I think the main issue is inside the parser, on this line:
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/blob/master/energistics/src/witsml_parser/energistics/libs/energistics_parsers/witsml_2_0/trajectory_parser.py#L117
Because `tagname` does not match this schema pattern (cc @epeysson).

M14 - Release 0.17

## WITSML Parser is failing with Tubular data
https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics/witsml-parser/-/issues/58
2022-06-30T20:14:04Z, Debasis Chatterjee

Tested in Azure R3M11 Preship environment.
I have experienced a failure.
Data file.
[Tubular__witsml-DC.xml](/uploads/9f7618f3cf1825573343cc7a39e6a2bd/Tubular__witsml-DC.xml)
Log
[M11_Azure_WITSML-Tubular-Debasis.txt](/uploads/11ae3e249b9ddd2bc3b1b62ae902904c/M11_Azure_WITSML-Tubular-Debasis.txt)
cc - @todaiks for information

etienne peysson