OSDU Software issues
https://community.opengroup.org/groups/osdu/-/issues

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/315
**[Feature] Airflow2 stage with private endpoints** (Arturo Hernandez [EPAM], 2023-08-02)

# Airflow2 stage
---
By default, Airflow2 is deployed at the service resources stage, and a single Airflow instance is configured for the whole OSDU deployment.
It looks like a single Airflow2 is not enough for service resources in a multi-partition environment; therefore, Airflow2 is deployed externally per data partition, in a separate network and subnet (brand-new Airflow2 resources are created).
To secure and improve performance when using an external Airflow2, we need to set up private endpoints for those resources, including a private endpoint for the partition Airflow2 application gateway from the main AKS cluster.
Airflow2 interacts mostly with the storage accounts, so I expect private endpoints will be needed for those as well.
## Airflow2 independent stage
---
I have a strong opinion that Airflow2 should be segregated from the partition resources. If a new external Airflow is needed, it should be created as a separate stage (like data-partition, service, central): an "airflow" resources stage that provides a configured Airflow out of the box.
## ADF Replacement
---
To achieve convergence between ADME and community, we might want to start thinking about Azure Data Factory, which is already available for [terraform - AzureRM](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/data_factory.html). We could start this migration smoothly at the data partition level and then at the service resources level.
## Action items
---
@lucynliu @nursheikh I would like to start this discussion here in the forum; it would be nice to start convergence between community and ADME. I have the feeling we should get rid of the per-partition Airflow2 resources (including the AKS cluster for Airflow) and, as a first stage, consider using ADF per partition as an optional feature, then move forward with ADF at the service resources level (I don't know whether that is really convenient yet).
We should also include the optional feature of private endpoints from AKS to ADF/AKS-Airflow in any case.
cc. @lucynliu @vleskiv

https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/284
**SPIKE: Data - Investigate options for incremental updates** (Noel Okanya, 2023-08-02)
Cache updates should take into account what records have been loaded since the last update, and then attempt to add only those new records instead of doing a complete destroy-rebuild of the Ignite cache.
Important details:
- What property on a Search API record dictates the last time the record was modified or added?
- Need to enforce that new records abide by the schema currently set for the Ignite cache. New records that break the Ignite schema will not be added. The schema cannot be modified once it is in place.
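
A minimal sketch of what an incremental fetch could look like, assuming the Search API exposes a last-modified timestamp on indexed records (confirming the exact property is the point of the first question above); the host, token, and the `modifyTime` name are placeholders:

```python
import requests

SEARCH_URL = "https://<osdu-host>/api/search/v2/query"  # placeholder host

def fetch_records_since(token: str, partition: str, last_refresh_iso: str) -> list:
    """Fetch only records modified since the last cache refresh."""
    headers = {
        "Authorization": f"Bearer {token}",
        "data-partition-id": partition,
    }
    body = {
        "kind": "*:*:*:*",
        # 'modifyTime' is an assumed property name -- see the open question above.
        "query": f"modifyTime:[{last_refresh_iso} TO *]",
        "limit": 1000,
    }
    resp = requests.post(SEARCH_URL, json=body, headers=headers, timeout=60)
    resp.raise_for_status()
    return resp.json().get("results", [])
```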

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow/-/issues/154
**Workflow Run API - requires datapartitionId in body as well as header** (Surabhi Seth, 2023-10-26)
API: Workflow Service API > Workflow Run /workflow/{workflow_name}/workflowRun
This service takes data-partition-id as part of the headers as well as the payload body: { "executionContext": { "id": "string", **"dataPartitionId": "string"** }, "runId": "string" }
![MicrosoftTeams-image__5_](/uploads/5e8d61cdc1316019ab905597094525b9/MicrosoftTeams-image__5_.png)

Issue: Requesting dataPartitionId in the payload body is redundant and inconsistent with the implementation of all other OSDU APIs (where data-partition-id is taken from the header).
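For illustration, a sketch of the current contract in Python (the `/api/workflow/v1` base path is an assumption; field names come from the issue above; host and token are placeholders):

```python
import requests

def trigger_workflow_run(host: str, token: str, workflow_name: str, partition_id: str):
    """Trigger a workflow run; note the partition ID is currently sent twice."""
    url = f"{host}/api/workflow/v1/workflow/{workflow_name}/workflowRun"
    headers = {
        "Authorization": f"Bearer {token}",
        "data-partition-id": partition_id,    # standard OSDU header
    }
    body = {
        "executionContext": {
            "id": "string",
            "dataPartitionId": partition_id,  # redundant copy in the body
        },
        "runId": "run-001",
    }
    return requests.post(url, json=body, headers=headers, timeout=60)
```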
Ref: https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow/-/blob/master/docs/api/openapi.workflow.yaml?plain=0

https://community.opengroup.org/osdu/platform/pre-shipping/-/issues/558
**IBM M19 Policy engine incorrect settings** (Dadong Zhou, 2023-08-29)

Kamlesh reported the errors when calling the Policy Evaluate API:
```
{
"result": {
"records": [
{
"errors": [
{
"code": 404,
"id": "opendes:master-data--Well:test1111111111",
"message": "Entitlements response 404 Error 404 - Not Found ",
"reason": "Unauthorized"
},
{
"code": 404,
"id": "opendes:master-data--Well:test1111111111",
"message": "Legal response 404 Error 404 - Not Found ",
"reason": "Error from compliance service"
},
{
"code": "403",
"id": "opendes:master-data--Well:test1111111111",
"message": "The user is not authorized to perform this action",
"reason": "Access denied"
}
],
"id": "opendes:master-data--Well:test1111111111"
}
]
}
}
```
I confirmed the error in the IBM M19 environment.
From the error messages, it seems the two environment variables ENTITLEMENTS_BASE_URL and LEGAL_BASE_URL in the Policy engine are not set correctly. Please check and correct.
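
As a quick diagnostic, a sketch (not part of the original report) that prints both settings and probes each base URL, assuming the service roots answer a plain GET:

```python
import os
import requests

# Print both Policy engine settings and probe each base URL.
for var in ("ENTITLEMENTS_BASE_URL", "LEGAL_BASE_URL"):
    base = os.environ.get(var)
    print(f"{var} = {base}")
    if base:
        # A blanket 404 here would match the errors reported above.
        resp = requests.get(base, timeout=10)
        print(f"  -> HTTP {resp.status_code}")
```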
Thanks.
cc @todaiks

https://community.opengroup.org/osdu/platform/system/reference/crs-conversion-service/-/issues/76
**convertTrajectory API returns NaN value for the input request** (KIRAN ALLAMSETY, 2024-03-13)

The convertTrajectory API internally calls the convert API; an error thrown from the convert API is not handled, so the WGS84 coordinates for x and y are set to "NaN" in the convertTrajectory response.
Request:

```json
{
"azimuthReference": "GN",
"interpolate": false,
"referencePoint": {
"x": 400000,
"y": 6500000,
"z": 100
},
"unitZ": "osdu:reference-data--UnitOfMeasure:ft:",
"unitMD": "osdu:reference-data--UnitOfMeasure:m:",
"inputStations": [
{
"md": 0,
"inclination": 0,
"azimuth": 20
},
{
"md": 100,
"inclination": 10,
"azimuth": 40
}
],
"trajectoryCRS": "osdu:reference-data--CoordinateReferenceSystem:Projected:EPSG::32066:",
"inputKind": "MD_Incl_Azim",
"method": "AzimuthalEquidistant"
}
```

Response:

```json
{
"trajectoryCRS": "{\"authCode\":{\"auth\":\"EPSG\",\"code\":\"32066\"},\"name\":\"NAD_1927_BLM_Zone_16N\",\"type\":\"LBC\",\"ver\":\"PE_10_9_1\",\"wkt\":\"PROJCS[\\\"NAD_1927_BLM_Zone_16N\\\",GEOGCS[\\\"GCS_North_American_1927\\\",DATUM[\\\"D_North_American_1927\\\",SPHEROID[\\\"Clarke_1866\\\",6378206.4,294.9786982]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433]],PROJECTION[\\\"Transverse_Mercator\\\"],PARAMETER[\\\"False_Easting\\\",1640416.666666667],PARAMETER[\\\"False_Northing\\\",0.0],PARAMETER[\\\"Central_Meridian\\\",-87.0],PARAMETER[\\\"Scale_Factor\\\",0.9996],PARAMETER[\\\"Latitude_Of_Origin\\\",0.0],UNIT[\\\"Foot_US\\\",0.3048006096012192],AUTHORITY[\\\"EPSG\\\",32066]]\"}",
"unitXY": "{\"abcd\":{\"a\":0.0,\"b\":1200.0,\"c\":3937.0,\"d\":0.0},\"symbol\":\"ft[US]\",\"baseMeasurement\":{\"ancestry\":\"L\",\"type\":\"UM\"},\"type\":\"UAD\"}",
"unitZ": "{\"abcd\":{\"a\":0.0,\"b\":0.3048,\"c\":1.0,\"d\":0.0},\"symbol\":\"ft\",\"baseMeasurement\":{\"ancestry\":\"L\",\"type\":\"UM\"},\"type\":\"UAD\"}",
"unitDls": "{\"scaleOffset\":{\"scale\":5.72614583987641E-4,\"offset\":0.0},\"symbol\":\"deg/100ft\",\"baseMeasurement\":{\"ancestry\":\"Rotation_Per_Length\",\"type\":\"UM\"},\"type\":\"USO\"}",
"stations": [
{
"md": 0.0,
"inclination": 0.0,
"azimuthTN": 18.903042055778428,
"azimuthGN": 20.0,
"dxTN": 0.0,
"dyTN": 0.0,
"point": {
"x": 121920.24384049278,
"y": 1981203.9624078341,
"z": 99.99999999999999
},
"wgs84Longitude": "NaN",
"wgs84Latitude": "NaN",
"dls": 0.0,
"original": true,
"dz": 0.0
},
{
"md": 100.0,
"inclination": 10.0,
"azimuthTN": 38.90304205577843,
"azimuthGN": 40.0,
"dxTN": 17.934591286244686,
"dyTN": 22.224168064134773,
"point": {
"x": 121925.84664895518,
"y": 1981210.6395745277,
"z": -226.42085631407443
},
"wgs84Longitude": "NaN",
"wgs84Latitude": "NaN",
"dls": 3.048000000000004,
"original": true,
"dz": 326.4208563140744
}
],
"localCRS": "{\"name\":\"Azimuthal Equidistant\",\"type\":\"LBC\",\"ver\":\"PE_10_9_1\",\"wkt\":\"PROJCS[\\\"Azimuthal Equidistant Lng=-90.56722112;Lat=17.88719244\\\",GEOGCS[\\\"GCS_North_American_1927\\\",DATUM[\\\"D_North_American_1927\\\",SPHEROID[\\\"Clarke_1866\\\",6378206.4,294.9786982]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433]],PROJECTION[\\\"Modified Azimuthal_Equidistant\\\"],PARAMETER[\\\"False_Easting\\\",0.0],PARAMETER[\\\"False_Northing\\\",0.0],PARAMETER[\\\"Central_Meridian\\\",-90.56722111685697],PARAMETER[\\\"Latitude_Of_Origin\\\",17.887192439357598],UNIT[\\\"Foot_US\\\",0.3048006096012192]]\"}",
"method": "AzimuthalEquidistant",
"operationsApplied": [
"derived TN from GN azimuth by grid convergence 358.903042",
"unitMD Factor value: 1.0 is used for computation of MD",
"computed deflections via minimum curvature method",
"computation method: AzimuthalEquidistant",
"conversion from 'Azimuthal Equidistant' to 'GCS_North_American_1927'",
"conversion from 'GCS_North_American_1927' to 'NAD_1927_BLM_Zone_16N'"
],
"scaleConvergenceList": [
{
"scalefactor": 1.001368,
"convergence": -1.09696,
"point": {
"x": 121920.24384049278,
"y": 1981203.9624078341,
"z": 99.99999999999999
}
},
{
"scalefactor": 1.001368,
"convergence": -1.09695,
"point": {
"x": 121925.84664895518,
"y": 1981210.6395745277,
"z": -226.42085631407443
}
}
],
"unitMD": "{\"abcd\":{\"a\":0.0,\"b\":1.0,\"c\":1.0,\"d\":0.0},\"symbol\":\"m\",\"baseMeasurement\":{\"ancestry\":\"L\",\"type\":\"UM\"},\"type\":\"UAD\"}",
"inputKind": "MD_Incl_Azim"
}
```
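
A client-side guard (a sketch; until the service handles the convert error itself, callers can at least fail loudly instead of propagating "NaN"):

```python
import json

def check_stations(response_text: str) -> None:
    """Raise if any trajectory station came back without WGS84 coordinates."""
    payload = json.loads(response_text)
    bad_mds = [
        station["md"]
        for station in payload.get("stations", [])
        if "NaN" in (str(station.get("wgs84Longitude")),
                     str(station.get("wgs84Latitude")))
    ]
    if bad_mds:
        raise ValueError(f"WGS84 conversion failed at MD values: {bad_mds}")
```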

https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/283
**Postman - Update automated tests** (Levi Remington, 2023-07-27)

The GCZ Postman Collection features a set of automated tests for light validation on features returned by GCZ Feature Layers. These tests are not applicable to every endpoint within GCZ services, but they still apply - and fail.
The Collection should be updated to run tests more selectively, so that failure may reliably indicate a problem.
Acceptance Criteria:
- Postman Collection can be run in its entirety without failure

https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/282
**Transformer - Testing Support - Add Partial Record Loading capability for shorter testing durations** (Levi Remington, 2023-10-11)

For the automated JUnit tests, we require a configurable limiter on the getAllRecords function. The goal is to artificially limit the number of records plugged into lengthy workflows so they can be tested 100% without taking hours to complete. This is useful for advanced datatypes which require a great deal of processing.
This is an important task: the more complex workflows are added without automated testing support, the lower our test coverage numbers become.
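
A sketch of the intended pattern (in Python for brevity; the actual Transformer code is Java, and the parameter name is an assumption):

```python
from itertools import islice
from typing import Iterable, Iterator, Optional

def get_all_records(source: Iterable[dict],
                    max_records: Optional[int] = None) -> Iterator[dict]:
    """Yield records, optionally stopping after max_records for faster tests."""
    records = iter(source)
    return records if max_records is None else islice(records, max_records)

# Test scenarios would pass e.g. max_records=100 to keep runs short.
```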
Acceptance criteria:
- getAllRecords function extended with optional parameter for imposing a hard-limit on the max number of features that can be ingested in total.
- Updated function leveraged in all applicable test scenarios.

https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/281
**Transformer - Enhance Wellbore Marker Ingestion with dynamic RefPoint** (Levi Remington, 2023-07-27)

The upcoming wellbore marker set ingestion workflow will support dynamic creation of Marker Points by interpolating measured depth along a Trajectory. However, **the current assumption is that a Wellbore Trajectory CSV will possess a SurfaceX and SurfaceY column** to dictate the Surface Location in the same CRS as the Trajectory CSV.
The goal of this issue is to expand support to CSVs which do not possess these columns. The workflow would be:
If CSV does not possess SurfaceX and SurfaceY...
1. Locate the original Well or Wellbore record through a series of ID references
2. Make a query to the Storage API to get the AsIngestedCoordinates of the surface well.
3. Detect the CRS of the coordinates and convert to the WKID of the Trajectory (if necessary)
4. Plug the results into the -x and -y inputs of the interpolation script (see the sketch after this list)
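
A sketch of steps 1-4, assuming the Storage API record follows the OSDU AbstractSpatialLocation shape (the property path and GeoJSON layout are assumptions; the CRS conversion is left as a stub):

```python
import requests

def surface_xy(host: str, headers: dict, wellbore_id: str) -> tuple:
    """Fetch a wellbore's as-ingested surface location from the Storage API."""
    url = f"{host}/api/storage/v2/records/{wellbore_id}"
    record = requests.get(url, headers=headers, timeout=60).json()
    # Assumed GeoJSON-like shape under AsIngestedCoordinates.
    geometry = (record["data"]["SpatialLocation"]
                      ["AsIngestedCoordinates"]["features"][0]["geometry"])
    x, y = geometry["coordinates"][:2]
    # Step 3 (detect the CRS and convert to the trajectory's WKID) would go here.
    return x, y
```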
Acceptance Criteria:
- Wellbore Markers produced and displayed on a map from a MarkerSet record where the related Trajectory CSV did not contain SurfaceX or SurfaceY columns

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/104
**'path' parameter should be optional not required** (Zachary Keirn, 2023-08-29)

The API docs all state that 'path' is a required field. It looks like it is actually optional, and I am told by Mark Yan that the service inserts a "/" if it is not provided. Current collections for testing this all set the 'path' parameter to an empty string in a pre-request script, but they could be made clearer by just not entering this parameter at all. ALSO, I would like to see an example of when setting the path is needed and how it should be set in that circumstance. From reading the API doc, it seems to suggest that you would enter the path to the SEGY file, but if you do that you get the following error, which seems to insert slashes at the start and end of the provided 'path' parameter:

> The 'path' parameter /sd://osdu/testtenant2/ST0202R08_PS_PSDM_RAW_PP_TIME.MIG_RAW.POST_STACK.3D.JS-017534.segy/ is in a wrong format. It should match the regex expression ^[/A-Za-z0-9_.-]*$.

In this case, 'path' was set in the params as "sd://osdu/testtenant2/ST0202R08_PS_PSDM_RAW_PP_TIME.MIG_RAW.POST_STACK.3D.JS-017534.segy".
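
The error message implies the service wraps the supplied value in slashes and validates it against the quoted regex; a quick check (a sketch) reproduces why a full sd:// URI fails:

```python
import re

# Regex quoted in the error message above.
PATH_RE = re.compile(r"^[/A-Za-z0-9_.-]*$")

def is_valid_sdms_path(path: str) -> bool:
    """True if the 'path' value would pass the service-side format check."""
    return bool(PATH_RE.match(path))

assert is_valid_sdms_path("/")                 # the default the service inserts
assert is_valid_sdms_path("/folder/subfolder")
# A full sd:// URI fails because of the ':' characters (and the added slashes).
assert not is_valid_sdms_path("sd://osdu/testtenant2/file.segy")
```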

https://community.opengroup.org/osdu/platform/system/register/-/issues/46
**Use Secret service for storing and fetching subscriber secrets.** (Rustam Lotsmanenko (EPAM), 2023-11-08)

https://community.opengroup.org/osdu/platform/security-and-compliance/secret/-/issues/6
**Secret service as a Centralized Solution for Managing Secrets.** (Rustam Lotsmanenko (EPAM), 2024-01-12)

# Centralized Solution for Managing Secrets
Date: 2023-09-29
## Status
Proposed
## Context
**Key points**
- There are OSDU Services that do not utilize the Secret Service as a place to keep secrets, making them scattered, and increasing potential attack surface.
## Decision
**Secret Service V2 API:**
**Key points**
- Use a single centralized service for secrets management
- Register and Partition service refactoring assumed to use Secret service
- Secret service should be improved to comply with all the requirements
![Secrets_management](/uploads/f11372746e43a65f77c7e8f3c8343d23/Secrets_management.png)
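
For illustration, what a consumer such as Register would look like against a single centralized store (a sketch; the endpoint path and response shape are assumptions, since the V2 API is only proposed here):

```python
import requests

def get_subscriber_secret(host: str, headers: dict, secret_name: str) -> str:
    """Fetch a secret from the centralized Secret service."""
    url = f"{host}/api/secret/v1/secrets/{secret_name}"   # assumed path
    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()
    return resp.json()["value"]                           # assumed response shape
```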
## Consequences
**Pros**
- Single, universal, and centralized secret storage - no need to have different implementations for each service (Register, EDS)
- Extendable, cloud- and storage-agnostic approach
- Secure RBAC model

https://community.opengroup.org/osdu/platform/system/storage/-/issues/179
**Storage batch API returns 404 for unauthorized records** (An Ngo, 2024-03-07)

**Use-case:** Reindex Kind API is called.
Noted in the logs there were 404s returned.
On a Record Fetch of some of the impacted records, 403s were returned.
Investigation shows the batch record fetch returned 404s instead.
Issue identified from this workflow:
- Storage batch API reports unauthorized records (403) as not found (404)
### ADR: Storage batch API reports unauthorized records (403) as not found (404)
#### Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
#### Context & Scope
The current behavior of the Storage batch API: if a record is not authorized, it is put in the _notFound_ field of the response body along with the genuinely not-found records. The response body in this case looks like this:
```
{
"records": [],
"notFound": [
"opendes:facet:unauthorizedrecord1",
"opendes:facet:unauthorizedrecord2",
//other not found records...
],
"conversionStatuses": []
}
```
#### Solution
To fix this behavior of the Storage batch API, we can introduce a new field in the response body. The proposed solution is to add a new field (_unauthorized_) so we can distinguish between unauthorized and genuinely not-found records. Sample response body:
```
{
"records": [],
"notFound": [
//not found records...
],
"unauthorized": [
"opendes:facet:unauthorizedrecord1",
"opendes:facet:unauthorizedrecord2"
],
"conversionStatuses": []
}
```
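
How a caller would consume the proposed contract while remaining compatible with servers that still fold 403s into _notFound_ (a sketch):

```python
def split_batch_response(body: dict) -> tuple:
    """Separate fetched, missing, and unauthorized record IDs."""
    records = body.get("records", [])
    not_found = body.get("notFound", [])
    # New field; absent on servers that predate this ADR.
    unauthorized = body.get("unauthorized", [])
    return records, not_found, unauthorized
```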
#### Consequences
This solution is a breaking change, as it implies changing the API contract. It will include a change in the core library, a change in Storage, and then a change in the Indexer service to handle the batch API response.

https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/issues/104
**Azure Monitor - Policy service logs not found in Azure App Insights** (Kelly Zhou, 2024-01-03)

Hi,
We found that we can't find any Policy service logs in Azure App Insights. Is that by design, or are we missing some configuration? We wonder whether the monitoring of the Policy service in Azure deployments is going well and how the OSDU community manages it. Any response will be much appreciated.
@Srinivasan_Narayanan @nursheikh
Thank you!

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/issues/149
**Allow filtering within nested objects within array** (Mykhailo Buriak, 2024-02-06)

**AS IS**
- The API doesn't support filtering of objects within nested arrays
- In the example below, the API allows filtering only on InitialWaterSaturation and EffectivePermeabilityToOil with the operators <=, <, >, >=, BUT doesn't support filtering on values within the Permeability array
<details><summary>Click to expand</summary>

```json
"InitialWaterSaturation": {
"Value": 1,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:fraction"
},
"EffectivePermeabilityToOil": {
"Value": 456.25,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:millidarcies"
},
"Permeability": [
{
"PermeabilityType": "opendes:reference-data--PermeabilityMeasurementType:Air",
"Value": 387,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:millidarcies"
},
{
"PermeabilityType": "opendes:reference-data--PermeabilityMeasurementType:Specific",
"Value": 534,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:millidarcies"
}
```

</details>
**AS TO BE**
- The system should support filtering of nested values within arrays in all existing content schemas
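
The desired semantics, sketched in Python (field names mirror the example above; the actual RAFS DDMS filter syntax is still to be defined):

```python
import operator

def any_nested_match(row: dict, array_field: str, key: str, op, threshold) -> bool:
    """True if ANY object in a nested array satisfies the predicate."""
    return any(op(item.get(key), threshold) for item in row.get(array_field, []))

row = {"Permeability": [{"Value": 387}, {"Value": 534}]}
assert any_nested_match(row, "Permeability", "Value", operator.gt, 500)      # 534 > 500
assert not any_nested_match(row, "Permeability", "Value", operator.gt, 600)
```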

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/issues/145
**Review the `SamplesAnalysisID` values into bulk data** (Siarhei Khaletski (EPAM), 2023-07-24)

Check if the SamplesAnalysisID values are required in the bulk data.
The question for the team: do we still need to preserve master-data (etc.) IDs within the bulk data?
**Note**: the initial idea was to provide a bi-directional relation between catalog and bulk data, which might be useful for vendors' apps.
The record example:
```json
{
"columns": [
"SamplesAnalysisID",
"SamplesID",
"BrineSaturation",
"ResistivityIndex",
"AdjustedResistivityIndex",
"SaturationExponent",
"CorrectedSaturationExponent",
"FormationFactorAtNetOverburdenPressure",
"AdjustedFormationFactorAtNetOverburnedPressure",
"BrineSalinity",
"QuantityOfOilInTheVolume"
],
"index": [
0
],
"data": [
[
"opendes:work-product-component--SamplesAnalysis:formationresistivityindexes-test:",
"osdu:master-data-Samples:dd76cf6c-226f-5636-ad1b-1ca0f8249cc8:",
{
"Value": 0.797,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:%25:"
},
1.73,
1.77,
2.42,
2.53,
6.87,
7.27,
{
"Value": 22000,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:ppm:"
},
{
"Value": 0.113,
"UnitOfMeasure": "opendes:reference-data--UnitOfMeasure:mL%2FmL:"
}
]
]
}
```

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/102
**When deleting a subproject, blobs are deleted individually before the container is removed** (Maggie Salak, 2023-07-20)

Steps to reproduce:
* Call the endpoint delete a subproject (DELETE /subproject/tenant/{tenantid}/subproject/{subprojectid})
* The blob container linked to the subproject should be deleted. In the current implementation all blobs inside the container are first deleted individually, before the container itself is removed. See the relevant [code section](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/blob/master/app/sdms/src/cloud/providers/azure/seistore.ts#L74).
Suggestions:
* Remove the per-blob deletion from the implementation and only delete the entire container.
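
With the Azure SDK this is a single call (sketched in Python for illustration; the actual service code is TypeScript, and the connection string and container name are placeholders):

```python
from azure.storage.blob import BlobServiceClient

# Deleting the container also removes every blob inside it, so the
# per-blob deletion loop is unnecessary.
service = BlobServiceClient.from_connection_string("<connection-string>")
service.delete_container("<subproject-container>")
```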

https://community.opengroup.org/osdu/platform/system/file/-/issues/89
**Unable to download the file** (Sachin Jaiswal, 2023-08-08)

**Problem**
File download fails if the file name contains a special character like ","
**Steps to reproduce**
- Generate a signed URL to upload a file (tesing,copy).
- Upload the file
- Create file metadata using the same filename (tesing,copy)
- Generate a signed URL to download the file
- Copy and paste the signed URL in a browser
**Error notice in browser** - `<storage-url> sent an invalid response. ERR_RESPONSE_HEADERS_MULTIPLE_CONTENT_DISPOSITION`
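
Why the comma breaks the download (an illustration; HTTP treats an unquoted comma as a value separator, so the browser sees multiple Content-Disposition values):

```python
# Unquoted: the header value is ambiguous at the comma.
bad = 'attachment; filename=tesing,copy'
# Quoted: a single parameter, which is what the proposed fixes below produce.
good = 'attachment; filename="tesing,copy"'
```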
**Proposed Solutions**
- File service - wrap the file name in quotes before passing it to OS Core Lib Azure, **OR**
- OS Core Lib Azure - make the change below in the [BlobStore class](https://community.opengroup.org/osdu/platform/system/lib/cloud/azure/os-core-lib-azure/-/blob/master/src/main/java/org/opengroup/osdu/azure/blobstorage/BlobStore.java#:~:text=581-%2cblobServiceSasSignatureValues.setContentDisposition%28%22attachment%3b%20filename%3D%20%22%20%2B%20fileName%29%3b%2c-582) of the OS Core Lib Azure repo:
  `blobServiceSasSignatureValues.setContentDisposition("attachment; filename=\"" + fileName + "\"");`

https://community.opengroup.org/osdu/platform/system/storage/-/issues/178
**ADR: CosmosDb saturation/throttling when records reach too many versions** (Alok Joshi, 2024-03-25)

## Status
- [X] Proposed
- [ ] Trialing
- [ ] Under review
- [x] Approved
- [ ] Retired
## Context & Scope
***ISSUE***: Storage service stability issues due to too many versions of records.
***User behavior that causes this issue***: ...## Status
- [X] Proposed
- [ ] Trialing
- [ ] Under review
- [x] Approved
- [ ] Retired
## Context & Scope
***ISSUE***: Storage service stability issues due to too many versions of records.
***User behavior that causes this issue***: Creating a lot of versions for the same record ID. When multiple applications/teams do this long enough, we have too many versions for many records. There are no checks in place to prevent this scenario. We eventually hit infrastructure limits (i.e. CosmosDb document max size 2MB) but observe service instability much before.
***Why is this a problem***: Record versions are stored as part of record metadata. This is part of the `gcsVersionPaths` array. Each version is a string that represents the full path to the version's blob location. Record metadata is stored in CosmosDb. While CosmosDb has a hard size limit (2MB) for each document, this size is already too big when RU usage is considered. If we have hundreds or thousands of such records being updated, the total RU consumed is very high, incurring huge costs. This scenario poorly impacts service latency and availability. While not ideal, it is quite possible for applications to create versions of the same record for their workflows.
![image](/uploads/3f53fa471e7566a04d69ea539712db76/image.png)
For reference, here are some preliminary observations on the number of versions, document size, and RU consumed to perform an UPSERT on a ***single*** document (note that the number of versions is not an ***absolute*** indicator of how much RU an UPSERT will consume, because it is the size of the document that matters, and each version string can have a different length; many more versions fit if each version string is short. However, as we stand today, it is the only metadata property that is causing documents to be big).

| Versions (approx.) | RU per UPSERT (approx.) | Document size (approx.) |
|---|---|---|
| 1,500 | 300 | 243 KB |
| 1,500 | 370 | 300 KB |
| 3,800 | 1,250 | 750 KB |
| 5,300 | 1,253 | 880 KB |
| 9,850 | 2,502 | 1.3 MB |
It is quite easy to have a few hundred or thousand records cripple the system once the records reach a certain number of versions.
***CLARIFICATION***: The issue we observed is more specific to the Azure use case. Infrastructure limitations (i.e. cost to access a large document, hard limit on the size of the document) may vary per CSP (i.e. 2MB for CosmosDb, 1MB for GCP datastore). Other CSPs may see this issue once the number of versions reaches a certain number.
## Tradeoff Analysis
It is clear we want to limit the number of record versions. We see 2 ways to achieve this.
1. ***Set a hard limit*** on the number of versions on each record (say 1000) (preferred approach).
- Pros: Easy to implement, no behind-the-scenes magic.
- Cons: Breaking change for the existing workflows, when their records already have more than 1000 versions. Needs advance notice of breaking change and time for teams to update the workflows.
We can roll this out by first introducing a `deleteVersion` API in Storage, giving users time to delete older versions themselves before the breaking change is introduced, so they don't break immediately.
2. ***Only keep the 1000 most recent versions***. For new records, this means actively deleting the oldest version once we reach 1000 versions. For existing records with more than 1000 versions, this means cleaning up all older versions (see the sketch after this list).
- Pros: Older versions are cleaned up for users automatically.
   - Cons: Still a breaking change, as older versions would get deleted automatically. Involves behind-the-scenes cleanup of older versions; for records that currently have more than 1000 versions, this includes all remaining older versions. There can be failure scenarios with cleanup, and performance implications.
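
A sketch of the trimming in option 2, assuming `gcsVersionPaths` is ordered oldest-first (an assumption; the evicted paths would still need their blobs cleaned up):

```python
MAX_VERSIONS = 1000  # proposed limit from the tradeoff above

def trim_versions(gcs_version_paths: list) -> tuple:
    """Split version paths into the ones to keep and the ones to evict."""
    if len(gcs_version_paths) <= MAX_VERSIONS:
        return gcs_version_paths, []
    keep = gcs_version_paths[-MAX_VERSIONS:]      # newest MAX_VERSIONS entries
    evicted = gcs_version_paths[:-MAX_VERSIONS]   # oldest entries, to be cleaned up
    return keep, evicted
```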
## Consequences
Storage will introduce a limit on the number of versions a record can have. Depending on the solution we choose, the API will either fail after n versions (hard limit) OR older versions will get deleted automatically.

https://community.opengroup.org/osdu/platform/data-flow/ingestion/osdu-airflow-lib/-/issues/6
**Need longer waiting time within EDS.** (Bruce Jin, 2023-09-22)

In EDS ingestion, it will wait 60 seconds for the manifest ingestion DAG to ramp up. But sometimes 60s is not enough for the ingestion DAG to update a task_status if the data being processed is very big. As a result the EDS_ingest DAG will fail, even though the manifest_ingestion actually succeeded.
Highly recommend extending the waiting time, or making it adjustable.
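
One way to make it adjustable (a sketch; the DAG/task ids and the Variable name are placeholders, not the actual osdu-airflow-lib code):

```python
from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.sensors.external_task import ExternalTaskSensor

with DAG("eds_ingest_sketch", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    # Read the wait from an Airflow Variable instead of hard-coding 60 seconds.
    wait_for_ingestion = ExternalTaskSensor(
        task_id="wait_for_manifest_ingestion",
        external_dag_id="Osdu_ingest",  # assumed manifest ingestion DAG id
        poke_interval=30,
        timeout=int(Variable.get("eds_wait_seconds", default_var=600)),
        mode="reschedule",  # frees the worker slot between pokes
    )
```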

https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/276
**Event Tracking/Timelines** (Noel Okanya, 2024-02-13)

As a GCZ Product Owner, I want to prepare for the below events, so that we can provide the stakeholders with GCZ updates:
1. EAGE Digital in March 24, 2024 (no OSDU topic)
2. ERGIS event in April 24th - 25th, 2024 - Esri
3. OSDU F2F in Europe April 24th week (No action from the team)