Manifest Ingestion DAG merge requests
https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests
Updated: 2023-08-18T11:14:58Z

Title: ibm-changes
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/16
Updated: 2023-08-18T11:14:58Z
Author: Shrikant Garg

@wladmirf @jingdongsun @ethiraj

Milestone: M3 - Release 0.5
Assignee: Anuj Gupta

Title: Fixed issue in environment variable
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/19
Updated: 2023-08-18T11:14:57Z
Author: Kishore Battula

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [X] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Fixed an issue where the wrong environment variable was read.

Milestone: M4 - Release 0.7

Title: Ibm changes
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/20
Updated: 2023-08-18T11:14:55Z
Author: Shrikant Garg

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- No
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [x] IBM
## Updates description?
Describe your code changes in detail for reviewers (links to GitLab issues, etc.)
Bug fix in IBM code.

Milestone: M4 - Release 0.7
Assignee: Anuj Gupta

Title: GONRG-1749 add manual button in pipeline for stage: deploy
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/21
Updated: 2023-08-18T11:14:53Z
Author: Mykola Zamkovyi (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
Added a manual button to the pipeline for the deploy stage.

Milestone: M4 - Release 0.7

Title: Azure credentials ut
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/22
Updated: 2023-08-18T11:14:51Z
Author: Kishore Battula

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [X] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Added unit tests for azure_credentials.py

Milestone: M4 - Release 0.7

Title: Ingestion updates
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/24
Updated: 2023-08-18T11:14:49Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [x] AWS
- [x] Azure
- [x] GCP
- [x] IBM
## Updates description?
This MR comes with a batch of updates:
Features:
- Validate entire manifest entity (GONRG-1783)
- Surrogate keys replacement (GONRG-1652)
- Auth logic uses Python SDK implementation (GONRG-1689)
- New operator for manifest integrity (GONRG-1700)
- New operator for schema validation (GONRG-1770)
- Logic for ensuring Datasets, WPCs and WP referential integrity (GONRG-1653)
- Implementation of the batch uploading (GONRG-1650)
- Added FileSource validation for `Datasets` (GONRG-1651)
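
As an illustration of the batch uploading item above (GONRG-1650), here is a minimal sketch of splitting manifest records into fixed-size batches before they are sent to a storage service. The helper name `split_into_batches`, the record shape, and the batch size are hypothetical; the code actually introduced by this MR is not reproduced here.

```python
from typing import Iterator, List, Sequence


def split_into_batches(records: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `records`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# Hypothetical records; real manifests carry full OSDU entities.
records = [{"id": f"record-{i}"} for i in range(5)]
batches = list(split_into_batches(records, batch_size=2))
```

Each batch could then be sent in a single request instead of one request per record, which is the usual motivation for batching uploads.
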
Structure updates:
- Removed obsolete dags (GONRG-1567)
- README.md has been updated (GONRG-1591)
- Fix `id` composing (GONRG-1700)
- Cleaned-up and renamed airflow variables (GONRG-1719)
Bugfixes:
- Handle file variable fix
- Fix download_as_bytes not supported in storage==1.13.2
- Removed `:` (colon) symbol from the end of reference ids (GONRG-1911)

Milestone: M4 - Release 0.7
Assignee: Siarhei Khaletski (EPAM)

Title: Fix for IDs refs. Updated Docker image link
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/26
Updated: 2023-08-18T11:14:47Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
- Changed Docker image link
- Added validation of reference IDs with versions (GONRG-1910)

Milestone: M4 - Release 0.7
Assignee: Siarhei Khaletski (EPAM)

Title: GONRG-1866: add condition statement to import vendors dependencies
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/27
Updated: 2023-08-18T11:14:45Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [x] AWS
- [x] Azure
- [x] GCP
- [x] IBM
## Updates description?
- Added dynamic module loading, so there is no longer any need to install the requirements of every vendor.

Milestone: M4 - Release 0.7
Assignee: Siarhei Khaletski (EPAM)

Title: Ingestion updates m4 (tested)
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/28
Updated: 2023-08-18T11:14:44Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Latest updates/fixes for M4 tag release.
Manifest ingestion was manually tested against the testing Postman collection.

Milestone: M4 - Release 0.7
Assignee: Siarhei Khaletski (EPAM)

Title: Added readme for azure
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/29
Updated: 2023-08-18T11:14:42Z
Author: Kishore Battula

## Type of change
- [ ] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [X] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Updated documentation on how to deploy manifest ingestion DAGs into Airflow.

Milestone: M5 - Release 0.8

Title: Refactor referential integrity (GONRG-1932)
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/30
Updated: 2023-08-18T11:14:40Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [ ] Bug Fix
- [ ] Feature
- [x] Refactoring
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Refactored the logic that ensures manifest integrity.

Milestone: M5 - Release 0.8
Assignee: Siarhei Khaletski (EPAM)

Title: Fix for FileSource; fix for pipelines with invalid manifest
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/32
Updated: 2023-08-18T11:14:39Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
## Updates description?
- Fix for issue with FileSource as a string with spaces
- Fix for the pipeline failing on a generally invalid manifest

Milestone: M5 - Release 0.8
Assignee: Siarhei Khaletski (EPAM)

Title: Added BaseTokenRefresher
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/34
Updated: 2023-08-18T11:14:37Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Added BaseTokenRefresher

Milestone: M5 - Release 0.8
Assignee: Siarhei Khaletski (EPAM)

Title: GONRG-2008: Documentation has been updated
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/36
Updated: 2023-08-18T11:14:36Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [ ] Bug Fix
- [ ] Feature
- [x] Documentation
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Documentation has been updated

Milestone: M5 - Release 0.8
Assignee: Siarhei Khaletski (EPAM)

Title: GONRG-2185: Single manifest validation hidden under the flag
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/37
Updated: 2023-08-18T11:14:34Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Single manifest validation hidden under the flag

Milestone: M5 - Release 0.8
Assignee: Siarhei Khaletski (EPAM)

Title: Aws impl, id version bug fix
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/40
Updated: 2023-08-18T11:14:32Z
Author: Spencer Sutton (suttonsp@amazon.com)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [x] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
AWS implementation code.
Also includes a bug fix that addresses this issue: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/issues/55

Milestone: M5 - Release 0.8
Assignee: Spencer Sutton (suttonsp@amazon.com)

Title: Ingestion updates
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/44
Updated: 2023-08-18T11:14:31Z
Author: Yan Sushchynski (EPAM)

## Type of change
- [x] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
This MR comes with a few updates:
Features:
* Add report about skipped and processed ids to XComs (Issue: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/issues/35). (GONRG-1934)
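
The report mentioned in the feature above could take roughly the following shape. This is a minimal sketch under stated assumptions: the function name `build_ingestion_report`, the dictionary keys, and the sample ids are hypothetical and are not taken from the MR.

```python
from typing import Dict, Iterable, List


def build_ingestion_report(all_ids: Iterable[str], saved_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Split `all_ids` into processed and skipped lists, preserving input order."""
    saved = set(saved_ids)
    report: Dict[str, List[str]] = {"processed_ids": [], "skipped_ids": []}
    for record_id in all_ids:
        key = "processed_ids" if record_id in saved else "skipped_ids"
        report[key].append(record_id)
    return report


report = build_ingestion_report(
    all_ids=["osdu:dataset--File.Generic:feb01", "osdu:dataset--File.Generic:feb02"],
    saved_ids=["osdu:dataset--File.Generic:feb02"],
)
# In an Airflow task, a dict like this could be published with
# `context["ti"].xcom_push(key="ingestion_report", value=report)`;
# the key name "ingestion_report" is hypothetical.
```

Keeping the report a plain dict of lists means it stays JSON-serializable, which is what XCom storage generally expects.
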
Bugfixes:
* Fix the issue where an integer part of an id was treated as a version, which prevented WP manifest ingestion. (Issue: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/issues/55). (GONRG-2144)
* Fix the issue where references to ingested entities with real ids get an extra ":" (e.g. `"Datasets": ["osdu:dataset--File.Generic:feb02::"]` instead of `"Datasets": ["osdu:dataset--File.Generic:feb02:"]`) (Issue: https://gitlab.opengroup.org/osdu/subcommittees/ea/projects/pre-shipping/home/-/issues/142). (GONRG-2147)

Milestone: M5 - Release 0.8
Assignees: Siarhei Khaletski (EPAM), Rostislav Dublin (EPAM)

Title: GONRG-2142: Add lru cache
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/46
Updated: 2023-08-18T11:14:29Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [ ] Bug Fix
- [x] Feature
## Does this introduce a change in the core logic?
- [No]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Added an LRU cache for the `get_schema` method.

Milestone: M6 - Release 0.9
Assignee: Siarhei Khaletski (EPAM)

Title: GONRG-2170: Added cursor for search service requests
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/47
Updated: 2023-08-18T11:14:27Z
Author: Siarhei Khaletski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
- Fixes an issue with the limit on Search query requests (closes https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/35)

Milestone: M6 - Release 0.9
Assignee: Siarhei Khaletski (EPAM)

Title: GONRG-2292: Find references with no version in Search
URL: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/merge_requests/48
Updated: 2023-08-18T11:14:25Z
Author: Yan Sushchynski (EPAM)

## Type of change
- [x] Bug Fix
- [ ] Feature
## Does this introduce a change in the core logic?
- [Yes]
## Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [ ] GCP
- [ ] IBM
## Updates description?
Fix the issue:
- An attempt to ingest an entity fails if it contains references with **no specific version** to data already ingested on OSDU.

Milestone: M6 - Release 0.9
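
To make the "no specific version" case concrete, the sketch below separates a reference into its record id and optional version. It assumes the reference ends with `:<version>` or with a bare `:` when no version is given, as in the `osdu:dataset--File.Generic:feb02:` example from an earlier MR in this list. The helper name `split_reference` and the version number shown are hypothetical.

```python
from typing import Optional, Tuple


def split_reference(reference: str) -> Tuple[str, Optional[str]]:
    """Split '<record id>:<version>' into its parts; an empty version becomes None."""
    record_id, _, version = reference.rpartition(":")
    return record_id, (version or None)


# A trailing colon means the reference does not pin a specific version.
unversioned = split_reference("osdu:dataset--File.Generic:feb02:")
# The version value below is made up for illustration.
versioned = split_reference("osdu:dataset--File.Generic:feb02:1613000000")
```

A lookup for an unversioned reference would then search by record id alone, rather than requiring an exact id-and-version match.
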