# Issue #48: Parsers/converters code organization proposal
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/48
2023-07-05 · Siarhei Khaletski (EPAM)

## Rationale
A number of different parsers have now been contributed to OSDU. These parsers/converters were implemented using different technologies and programming languages (C++, Java, Python, etc.).
This variety can cause difficulties when onboarding such parsers: requirements, code organization, and runtime environment setup all differ.
## Objective
Approve or develop a unified approach to representing parsers/converters and using them as operators in Airflow DAGs.
## Proposal
Using containerized DAG steps as much as possible, i.e. the KubernetesPodOperator, was mentioned as one of the best practices for Manifest-based Ingestion pipelines.
This means that a pipeline step can be implemented in an entirely different technology, with the executable part of the step running inside a Docker container.
The proposal is to deliver, together with the parser code, a properly configured base Dockerfile. This Dockerfile will contain only the dependencies required to run the parser, with the ability to extend or configure the executable invocation (parameters, environment variables, etc.).
Each CSP, if needed, should develop its own Dockerfile with additional requirements or environment variable setup.
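As an illustration, a containerized parser step could be wired into a DAG roughly as follows. This is a configuration sketch only, not a definitive implementation: the DAG id, image name, namespace, arguments, and environment variables are all hypothetical, and the exact provider import path depends on the installed Airflow/provider version.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

with DAG(
    dag_id="example_parser_pipeline",  # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
) as dag:
    # The parser runs entirely inside its own container image,
    # so it can be written in C++, Java, Python, etc.
    run_parser = KubernetesPodOperator(
        task_id="run_parser",
        name="run-parser",
        namespace="airflow",  # assumption: Airflow's own namespace
        image="example.registry/witsml-parser:latest",  # hypothetical image
        arguments=["--input", "{{ dag_run.conf['file_url'] }}"],
        env_vars={"OSDU_DATA_PARTITION": "opendes"},  # hypothetical env var
        get_logs=True,
        is_delete_operator_pod=True,
    )
```

Because the step is just a container invocation, swapping the parser technology only means swapping the image; the DAG wiring stays the same.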
![Parsers_dependencies](/uploads/d24ee18309b2471c94bab6023422807d/Parsers_dependencies.png)
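The base Dockerfile and a CSP-specific extension might look roughly like the following sketch. File names, package names, and registry paths are hypothetical; only the layering pattern (base image delivered with the parser, extended per CSP) comes from the proposal.

```dockerfile
# Base Dockerfile shipped alongside the parser code (hypothetical layout).
# Contains only the dependencies required to run the parser.
FROM python:3.9-slim
WORKDIR /parser
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Invocation can be extended/configured via parameters and env vars
ENTRYPOINT ["python", "main.py"]
```

```dockerfile
# CSP-specific Dockerfile extending the base image with extra
# requirements or environment variable setup (all names hypothetical).
FROM example.registry/parser-base:latest
RUN pip install --no-cache-dir example-csp-sdk
ENV CSP_SPECIFIC_SETTING=value
```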
## Implementation
An example implementation of this proposal: [WITSML parser](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics-osdu-integration/-/tree/master/build)
### Note
For lightweight (local) DAG dependencies, the [Packaged DAGs](https://community.opengroup.org/osdu/platform/data-flow/home/-/issues/47) approach can be used.

# Issue #46: Unit of Measure normalization
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/46
2022-06-29 · Debasis Chatterjee

An Operator with data from all over the world may have a mix of unit systems: for example, the KB elevation of one well is in feet while the KB elevation of another well is in meters. Is "on the fly" unit conversion in scope? Also, I do not believe there is a provision to hold two sets of values for such fields (a well's KB elevation, well log depth interval values). Hence I am curious: what is the recommended approach for this?
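For illustration, "on the fly" normalization of a single field could look like the sketch below. The field and unit names are assumptions, and this deliberately uses a bare conversion factor rather than OSDU's actual unit/frame-of-reference services.

```python
# Hypothetical on-the-fly unit normalization for a single numeric field.
FT_TO_M = 0.3048  # exact international foot-to-meter factor


def normalize_elevation(value: float, unit: str) -> float:
    """Return the elevation in meters, converting from feet if needed."""
    if unit == "ft":
        return value * FT_TO_M
    if unit == "m":
        return value
    raise ValueError(f"unsupported unit: {unit}")


# Example: a KB elevation recorded in feet vs. one already in meters
print(normalize_elevation(100.0, "ft"))  # ≈ 30.48
print(normalize_elevation(25.0, "m"))
```

Note this answers only the conversion mechanics; whether to store both the original and normalized values is the open question raised above.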
Linked to issue #36

# Issue #49: ADR: Using Postman collection in Platform Validation project for Testing Dags in pipelines
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/49
2022-06-14 · harshit aggarwal

# Decision Title
Using Postman collection in Platform Validation project for Testing Dags in pipelines
## Status
- [x] Proposed
- [x] Trialing
- [x] Under review
- [x] Approved
- [ ] Retired
## Context & Scope
Currently there are no integration tests for DAGs as part of the automated pipelines to validate a DAG deployment in an end-to-end manner. There are tests for DAGs such as the [CSV Parser](https://community.opengroup.org/osdu/platform/data-flow/ingestion/csv-parser/csv-parser/-/tree/master/testing), but they are closer to unit tests and do not exercise the flow through components like the Ingestion Workflow Service and Airflow. Hence, even after the pipelines execute successfully, there is no assurance that a DAG is in a functional state. An end-to-end test for a DAG should comprise the following steps:
* Set up (Creating/Uploading any prerequisite data required for the DAG execution)
* Creating a Workflow via Workflow Service
* Triggering the DAG via Workflow Service
* Searching for the records created via Search Service
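The steps above could be scripted roughly as follows. This is a sketch: the base URL is hypothetical, the endpoint paths are assumptions modeled on the public OSDU Workflow and Search service APIs, and authentication headers are omitted. It only builds request descriptions; a real test would send them with an HTTP client.

```python
# Sketch of the end-to-end flow as plain request descriptions
# (endpoint paths are assumptions based on OSDU Workflow/Search APIs).

def trigger_workflow_request(base_url: str, workflow_name: str,
                             run_payload: dict) -> dict:
    """Build the request that triggers a DAG run via the Workflow Service."""
    return {
        "method": "POST",
        "url": f"{base_url}/api/workflow/v1/workflow/{workflow_name}/workflowRun",
        "json": run_payload,
    }


def search_records_request(base_url: str, kind: str) -> dict:
    """Build the request that searches for the ingested records afterwards."""
    return {
        "method": "POST",
        "url": f"{base_url}/api/search/v2/query",
        "json": {"kind": kind, "limit": 10},
    }


req = trigger_workflow_request(
    "https://osdu.example.com", "csv_parser", {"runId": "run-1"}
)
print(req["url"])
```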
## Proposal
The proposal is to leverage the existing [Postman collections](https://community.opengroup.org/osdu/platform/testing/-/tree/master/Postman%20Collection) in the Platform Validation project for DAG validation. These collections contain an exhaustive set of requests covering the entire flow described above. Reusing them avoids the effort of writing conventional integration tests (as in the OSDU services) for every DAG. The collections can be executed as a containerized task using the [Newman utility](https://learning.postman.com/docs/running-collections/using-newman-cli/command-line-integration-with-newman/) in the Test stage of the pipelines.
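A containerized Test-stage job running such a collection might look roughly like this GitLab CI fragment. The job name, collection and environment file paths, and the `OSDU_BASE_URL` variable are hypothetical; `postman/newman` is the public Newman Docker image.

```yaml
# Sketch of a Test-stage pipeline job (paths and variables are hypothetical)
dag_e2e_test:
  stage: test
  image: postman/newman:alpine
  script:
    - >
      newman run collections/dag_validation.postman_collection.json
      -e collections/ci.postman_environment.json
      --env-var base_url=$OSDU_BASE_URL
```

Passing values via `--env-var` rather than hardcoding them in the collection matches the change anticipated in the Consequences section below.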
## Consequences
The existing Postman collections might require a few minor changes to be fully reusable in the automated pipelines.
Note: from observations so far, removing any hardcoded variables and using environment variables instead is expected to be the only required change.
## Rationale
Enabling a framework for end-to-end tests will make the system more robust, and potential bugs will be caught at contribution time.

# Issue #32: Validate and Porting Ingestion Initial Framework to multiple Clouds
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/32
2021-01-28 · Stephen Whitley (Invited Expert) · assignee: Kateryna Kurach (EPAM)

# Issue #31: Multiple CSPs working with Ingestion Projects
https://community.opengroup.org/osdu/platform/data-flow/ingestion/home/-/issues/31
2020-09-24 · Stephen Whitley (Invited Expert)