Manifest Ingestion DAG merge requests
https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests
2020-09-28T14:00:10Z

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/1
Publish manifest-based ingestion workflow (GONRG-568)
2020-09-28T14:00:10Z
Dmitriy Rudko

MR Contains:
- #5: Implement custom status-tracking Airflow operator
- #6: Implement Integration tests for Ingestion DAGs
**MR highlights:**
1. Defines the base project structure
2. Introduces a separation between DAGs and reusable [Airflow components](https://airflow.apache.org/docs/stable/howto/custom-operator.html):
   - Operators
   - Hooks
   - Sensors
3. Introduces a baseline for integration tests
4. Introduces a baseline for unit tests

M1 - Release 0.1
ethiraj krishnamanaidu, JoeBrandt Beal, Daniel Scholl, Alan Braz

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/15
Cherry-pick of "GONRG-1799: Added Azure requirements" to release/0.5
2021-02-11T20:43:41Z
David Diederich d.diederich@opengroup.org

This merges a fix into the release branch, from !14, which will correct the failing unit_tests job.

David Diederich d.diederich@opengroup.org

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/17
Cherry-pick 'ibm-changes' to release/0.5
2021-02-12T22:20:54Z
David Diederich d.diederich@opengroup.org

This merges !16 into the release/0.5 branch to be included as a new patch.

David Diederich d.diederich@opengroup.org

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/18
Fix bugs in FileHandler and GoogleCloudStorageClient
2021-02-18T16:53:53Z
Yan Sushchynski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## What is the current behavior?
- Fixed a bug related to variable names in `_get_file_from_preload_path` of the `FileHandler` class.
- Changed `blob.download_as_bytes` to `blob.download_as_string` in the `_get_file_as_bytes_from_bucket` method of `GoogleCloudStorageClient`, because the `Blob` class does not support `download_as_bytes` in Composer's version (==1.13.2) of `google.cloud.storage`.

Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/31
GONRG-2002: added variables for PRE_PROD deployment
2021-03-17T15:50:24Z
Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [X] GCP
- [ ] IBM
## Updates description?
Changes for pre-prod deployment for GCP.

Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/58
GONRG-2327: Added possibility to specify custom SA-file for credentials logic
2021-06-24T13:33:57Z
Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Added the ability to specify a custom SA file path for GCP credentials.

M7 - Release 0.10
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/62
GONRG-2726: Move libs to SDK
2021-07-20T15:27:50Z
Yan Sushchynski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Move the `libs` and `providers` folders to the Python SDK. The SDK must be installed via `pip` into Airflow's environment.
The code in these folders is now imported as:
`import osdu_api.libs`
`import osdu_api.providers`

M7 - Release 0.10
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/66
GONRG-2696: Manifest integrity batch search
2021-08-23T15:57:34Z
Yan Sushchynski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Add the ability to get the list of entities already skipped by previous tasks, so they can be used in the Manifest Integrity Check.

M8 - Release 0.11
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/67
Enable Support for Packaged DAGs
2021-08-26T11:42:24Z
harshit aggarwal

## Type of change
- [ ] Bug Fix
- [X] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
This MR makes changes to support packaged DAGs for manifest ingestion. The [ADR](https://community.opengroup.org/osdu/platform/data-flow/home/-/issues/47) for this change has been approved.
**New folder structure**
```
├── osdu_manifest
│   ├── __init__.py
│   ├── libs
│   │   ├── __init__.py
│   │   └── utils.py
│   ├── operators
│   │   ├── __init__.py
│   │   └── customOperator1.py
│   ├── hooks
│   │   └── __init__.py
│   └── configs
│       └── __init__.py
└── osdu-ingest-r3.py
```
The changes in this MR include:
- Restructuring the folders
- Fixing any import statements
- Minor changes to run existing tests
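
As a rough illustration of what the restructuring enables, the sketch below zips the folder structure shown above into a single archive with the DAG file at the archive root, which is the layout Airflow's packaged DAG feature expects. It then imports a module from the archive to show that the package-style imports resolve. The archive name `osdu_manifest_dag.zip`, the `greet` helper, and all file contents are hypothetical placeholders, not code from this MR.

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Paths mirror the folder structure above; the file contents are
# hypothetical placeholders, not the real project sources.
LAYOUT = {
    "osdu_manifest/__init__.py": "",
    "osdu_manifest/libs/__init__.py": "",
    "osdu_manifest/libs/utils.py": "def greet():\n    return 'hello from utils'\n",
    "osdu_manifest/operators/__init__.py": "",
    "osdu_manifest/operators/customOperator1.py": "",
    "osdu_manifest/hooks/__init__.py": "",
    "osdu_manifest/configs/__init__.py": "",
    # The DAG definition file sits at the root of the archive.
    "osdu-ingest-r3.py": "from osdu_manifest.libs.utils import greet\n",
}


def build_packaged_dag(archive_path: Path) -> Path:
    """Write every file in LAYOUT into one zip archive and return its path."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for relative_path, source in LAYOUT.items():
            archive.writestr(relative_path, source)
    return archive_path


workdir = Path(tempfile.mkdtemp())
package = build_packaged_dag(workdir / "osdu_manifest_dag.zip")

# Python can import packages directly from a zip file on sys.path, so
# absolute imports such as `osdu_manifest.libs.utils` resolve in the archive.
sys.path.insert(0, str(package))
from osdu_manifest.libs.utils import greet

print(greet())
```

In a real deployment the archive would be copied into Airflow's DAGs folder rather than added to `sys.path` by hand. The manual import here only demonstrates why the import statements had to be rewritten relative to the `osdu_manifest` package.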
**Related Issue - https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/issues/86**
**ADR - https://community.opengroup.org/osdu/platform/data-flow/home/-/issues/47**

M8 - Release 0.11
harshit aggarwal

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/75
GONRG-3107: Change backward compatibility
2021-09-10T15:14:21Z
Yan Sushchynski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [x] AWS
- [x] Azure
- [x] GCP
- [x] IBM
## Updates description?
The DAG now depends on [osdu-airflow-lib](https://community.opengroup.org/osdu/platform/data-flow/ingestion/osdu-airflow-lib).
Before deploying the DAG, install the package:
```shell
pip install 'osdu-airflow' --extra-index-url=https://community.opengroup.org/api/v4/projects/668/packages/pypi/simple
```

M9 - Release 0.12
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/76
GONRG-3109: Move common logic to osdu-airflow-lib
2021-09-13T08:38:35Z
Aleksandr Spivakov (EPAM)

## Type of change
- [ ] Bug Fix
- [X] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [X] AWS
- [X] Azure
- [X] GCP
- [X] IBM
## Updates description?
Move common logic to osdu-airflow-lib
Closes #89

M9 - Release 0.12
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/79
GONRG-3300: Add E2E tests [GCP]
2021-10-11T08:57:49Z
Yan Sushchynski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Add E2E tests for Ingestion DAGs.

M9 - Release 0.12
Siarhei Khaletski (EPAM), Aleksandr Spivakov (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/78
GONRG-3452: Move Ingestion Logic from SDK
2021-10-12T15:12:21Z
Yan Sushchynski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [x] AWS
- [x] Azure
- [x] GCP
- [x] IBM
## Updates description?
Move the ingestion logic from the Python SDK into a separate package.

M9 - Release 0.12
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/81
Regular expressions for the release branch are incorrect
2021-10-26T04:47:36Z
David Diederich d.diederich@opengroup.org

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- No
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
This fixes the `.gitlab-ci.yml` branch expressions. Without this, they are ignored, leading to invalid YAML errors any time a branch named `release/*` is created.

M9 - Release 0.12
David Diederich d.diederich@opengroup.org

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/82
GONRG-3605: Update deps version to 0.12.0
2021-10-27T12:45:52Z
Yan Sushchynski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Update the `requirements.txt` file for the Airflow environment to the release versions.

M9 - Release 0.12
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/83
Merge branch 'GONRG-3605_Update_dags_deps_for_release' into 'master'
2021-10-27T13:52:46Z
Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Describe your code changes in details for reviewers (links on Gitlab issues, etc.)

M9 - Release 0.12
Siarhei Khaletski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/84
GONRG-3709 fix E2E tests
2021-11-03T08:46:09Z
Aleksandr Spivakov (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Fix E2E tests for GCP:
- add support for tags in CI steps
- update the required service URLs for the Community test environment

M10 - Release 0.13
Siarhei Khaletski (EPAM), Yan Sushchynski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/85
GONRG-3733 remove airflow v1 deployment steps
2021-11-11T09:57:52Z
Aleksandr Spivakov (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Describe your code changes in details for reviewers (links on Gitlab issues, etc.)

M10 - Release 0.13
Siarhei Khaletski (EPAM), Yan Sushchynski (EPAM)

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/86
Remove unused dags
2021-11-15T05:31:18Z
harshit aggarwal

harshit aggarwal

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/88
GONRG-3779: Common pipeline for dag
2021-12-22T18:57:02Z
Yan Sushchynski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Create common pipelines for GCP.

M10 - Release 0.13
Vladislav Shishko (EPAM)