Storage issues
https://community.opengroup.org/osdu/platform/system/storage/-/issues
2023-05-30T08:55:39Z

https://community.opengroup.org/osdu/platform/system/storage/-/issues/157
Storage Improperly local cached ORDC information from Legal service
2023-05-30T08:55:39Z
Kelly Zhou

Currently, Storage caches the first valid-ORDC result it receives from the Legal service, regardless of which data partition the user is ingesting the record into. That result can be wrong for other partitions, because we support whitelisting countries for specific data partitions.
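
A minimal sketch of a cache scoped by data partition, so that the valid ORDC list fetched for one partition is never reused for another. The class name `PartitionedOrdcCache` and the loader function standing in for the Legal service call are hypothetical and are not taken from the Storage code base; expiry of cached entries is left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical partition-scoped cache for valid ORDC values; not Storage's actual class. */
class PartitionedOrdcCache {
    // One entry per data partition, so a whitelist for one partition never leaks into another.
    private final Map<String, Set<String>> validOrdcByPartition = new ConcurrentHashMap<>();

    /**
     * Returns the valid ORDC set for the partition. The loader (a stand-in for the
     * Legal service call) is invoked only on the first request for each partition.
     */
    Set<String> get(String dataPartitionId, Function<String, Set<String>> legalServiceLoader) {
        return validOrdcByPartition.computeIfAbsent(dataPartitionId, legalServiceLoader);
    }
}
```

The only design point here is that the data partition id is part of the cache key; how the Legal service is called and how long entries live are separate decisions.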
To fix the caching issue, the local cache for ORDC information needs to include the data partition id.
M16 - Release 0.19

https://community.opengroup.org/osdu/platform/system/storage/-/issues/155
GCP failing with core-common v0.18.0-rc4
2023-01-02T11:18:05Z
Mina Otgonbold

osdu-gcp-anthos-test integration tests are consistently failing when the core-common version is upgraded to v0.18.0-rc4.
Currently, gcp consumes version 0.17.0 of core-common, which contains vulnerable libraries. The storage MR "Update Storage to be Collaboration Context Aware" needs to consume a new version of core-common that exposes the collaboration context, so this failure blocks that MR from being merged. As a quick fix for the gcp test failure, we created a core-common build based on version 0.17.0 that includes the collaboration context. The pipeline passes with this build, which indicates that the gcp test failure comes from the core-common upgrade from 0.17.0 to 0.18.0-rc4.
References
* [Associated storage MR](https://community.opengroup.org/osdu/platform/system/storage/-/merge_requests/546)
* [Core-common MR](https://community.opengroup.org/osdu/platform/system/lib/core/os-core-common/-/merge_requests/183)
* [ADR for the storage and core-common MRs](https://community.opengroup.org/osdu/platform/system/storage/-/issues/149)

Yauhen Shaliou [EPAM/GCP]

https://community.opengroup.org/osdu/platform/system/storage/-/issues/153
Indexer fetch records requests should not be checked via OPA/Policy (Or any other service, that sends internal requests)
2023-03-06T10:20:12Z
Rustam Lotsmanenko (EPAM) rustam_lotsmanenko@epam.com

**Problem:**
Currently, the Storage service will evaluate policies for service requests of the Indexer service, which doesn't make sense since the indexer should be able to fetch any record ingested to the platform.
Indexer fetch requests go through the common request authentication flow when OPA integration is enabled:
https://community.opengroup.org/osdu/platform/system/storage/-/blob/master/storage-core/src/main/java/org/opengroup/osdu/storage/opa/service/OPAServiceImpl.java#L104
~~~
http://localhost:8181/v1/data/osdu/partition/osdu/dataauthz/records
{
"input": {
"operation": "view",
"token": "indexer-service-token",
"datapartitionid": "osdu",
"records": [{
"id": "osdu:master-data--Well:999907686759",
"kind": "osdu:wks:master-data--Well:1.0.0",
"legal": {
"legaltags": ["osdu-demo-legaltag"],
"otherRelevantDataCountries": ["US"],
"status": "compliant"
},
"acls": {
"viewers": ["data.default.viewers@osdu.osdu-gcp.go3-nrg.projects.epam.com"],
"owners": ["data.default.owners@osdu.osdu-gcp.go3-nrg.projects.epam.com"]
}
}
]
}
}
~~~
And it is possible that Indexer will not be authorized to fetch records:
~~~
HttpResponse(headers = {
null = [HTTP/1.1 200 OK],
Content-Length = [305],
Date = [Tue, 29 Nov 2022 10:58:31 GMT],
Content-Type = [application/json]
}, body = {
"result": [{
"errors": [{
"code": 401,
"id": "osdu:master-data--Well:999907686759",
"message": "Legal response 401 {\"code\":401,\"reason\":\"Unauthorized\",\"message\":\"The user is not authorized to perform this action\"}",
"reason": "Error from compliance service"
}
],
"id": "osdu:master-data--Well:999907686759"
}
]
}, contentType = application/json, responseCode = 200, exception = null, request = http://localhost:8181/v1/data/osdu/partition/osdu/dataauthz/records, httpMethod=POST, latency=812)
~~~
The Indexer then receives an empty response:
~~~
{
"records": [],
"notFound": [
"osdu:master-data--Well:999907686759"
],
"conversionStatuses": []
}
~~~
This leaves the records unindexed and unsearchable. The scenario is easy to trigger, for example when a record uses ACLs that the service token is not a member of.
**Solution:**
We need to bypass OPA/Policy checks for internal service requests.
M16 - Release 0.19
Rustam Lotsmanenko (EPAM) rustam_lotsmanenko@epam.com
Riabokon Stanislav(EPAM)[GCP]

https://community.opengroup.org/osdu/platform/system/storage/-/issues/152
Upgrade azure-storage SDK
2022-11-28T14:39:21Z
Nur Sheikh

The Storage service uses azure-storage SDK 8.6.5 from the com.microsoft.azure package, which is old and has little support. It is advisable to move to the latest SDK from the com.azure package.

https://community.opengroup.org/osdu/platform/system/storage/-/issues/151
Storage service fails due to opa enabled value being true.
2022-12-20T04:08:07Z
Nikhil Singh[MicroSoft]

2022-11-16 10:51:53.832 ERROR storage-6446654dcd-5m7cm --- [-nio-80-exec-52] o.o.o.a.l.Slf4JLogger correlation-id=fd7c531b-f76c-4467-a502-8860097b79a9 data-partition-id=opendes api-method=PUT operation-name={PUT [/records], consumes [application/json], produces [application/json]} user-id=8b2a56ba-edf5-47ce-94b6-42c336ec8172 app-id=678fadf8-e5a8-46cd-a75d-4d6cc95d9bc9:storage.app error getting data authorization result {correlation-id=fd7c531b-f76c-4467-a502-8860097b79a9, data-partition-id=opendes}
org.opengroup.osdu.core.common.model.http.AppException: error getting data authorization result
    at org.opengroup.osdu.storage.opa.service.OPAServiceImpl.evaluateDataAuthorizationPolicy(OPAServiceImpl.java:125) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.opa.service.OPAServiceImpl.validateUserAccessToRecords(OPAServiceImpl.java:86) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.service.IngestionServiceImpl.validateUserAccessAndCompliancePolicyConstraints(IngestionServiceImpl.java:415) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.service.IngestionServiceImpl.getRecordsForProcessing(IngestionServiceImpl.java:176) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.service.IngestionServiceImpl.createUpdateRecords(IngestionServiceImpl.java:98) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.provider.azure.service.IngestionServiceAzureImpl.createUpdateRecords(IngestionServiceAzureImpl.java:27) ~[classes!/:?]
    at org.opengroup.osdu.storage.api.RecordApi.createOrUpdateRecords(RecordApi.java:80) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.api.RecordApi$$FastClassBySpringCGLIB$$495e8f0c.invoke(<generated>) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    ... suppressed 11 lines
    at org.opengroup.osdu.storage.api.RecordApi$$EnhancerBySpringCGLIB$$a32ffde7.createOrUpdateRecords(<generated>) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    at org.opengroup.osdu.storage.api.RecordApi$$FastClassBySpringCGLIB$$495e8f0c.invoke(<generated>) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    ... suppressed 9 lines
    at org.opengroup.osdu.storage.api.RecordApi$$EnhancerBySpringCGLIB$$1ec1cefc.createOrUpdateRecords(<generated>) ~[storage-core-0.15.1-SNAPSHOT.jar!/:?]
    ... suppressed 2 lines
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
    ... suppressed 18 lines
    at org.opengroup.osdu.storage.util.StorageFilter.doFilter(StorageFilter.java:86) [storage-core-0.15.1-SNAPSHOT.jar!/:?]
    ... suppressed 2 lines
    at org.opengroup.osdu.azure.filters.TransactionLogFilter.doFilter(TransactionLogFilter.java:74) [core-lib-azure-0.17.0-rc14.jar!/:?]
    ... suppressed 34 lines
    at org.opengroup.osdu.azure.filters.Slf4jMDCFilter.doFilter(Slf4jMDCFilter.java:69) [core-lib-azure-0.17.0-rc14.jar!/:?]
    ... suppressed 18 lines
    at com.microsoft.applicationinsights.web.internal.WebRequestTrackingFilter.doFilter(WebRequestTrackingFilter.java:142) [applicationinsights-web-2.6.4.jar!/:?]
    ... suppressed 18 lines
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_332]
Nikhil Singh[MicroSoft]

https://community.opengroup.org/osdu/platform/system/storage/-/issues/149
ADR: Namespacing storage records
2024-03-19T02:18:17Z
ashley kelham

# Background
The OSDU is agreeing on a new EA-level ADR for 'collaborations'. It addresses a wide-ranging problem. You can see info at the EA level [here](https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/48).
At its heart is the idea that data must be separated between the system of record and system of engagement. Today the OSDU only supports the system of record. All data therefore by default resides in the system of record and the APIs we use read, write and delete from the system of record.
In this ADR we are looking at how we can separate data in Storage service into separate namespaces. These namespaces can in the future be linked to a specific collaboration, which will form the system of engagement.
The system of engagement is meant to be interacted with by any application wanting to add/update data into the OSDU. Therefore we should have some understanding of what application is making the requests into the system of engagement.
We are starting with storage service as all other changes needed for the system of engagement data separation will be driven by this change.
![image](/uploads/b269adeef9f11aa773480f96a4b7c7d7/image.png)
As shown, the system of engagement can have many namespaces, one for each collaboration.
A single storage record can reside in any number of namespaces. A namespace can also have 0 or many Records.
A storage record consists of 2 parts, the metadata and the data.
```
{
id: "opendes:mastered-wellbore:12345678",
kind: "osdu:wks:mastered-wellbore:1.0.0",
...
...
data: {
...
...
}
}
```
Everything inside the 'data' JSON object shown above is classed as the data and everything else is the 'metadata'.
These are stored separately by the storage service in a one-to-many relationship. Every time a Record's data is updated, a new version of that data is created that points to a single metadata instance.
The reference is held directly in the metadata. We can think of the referencing of the data blocks to the metadata like this:
Diagram 1
![image](/uploads/ecdb68f32ab861835cca78533ed0716f/image.png)
The latest data version referenced is the 'head' and is returned by default when no version is specified when using the Storage APIs.
If I retrieve an older version of the 'data', I am still returned the same single version of the metadata.
With collaboration there is the possibility that many 'heads' exist at the same time, one per collaboration. There can be many collaborations and each collaboration can hold many entities.
Each collaboration should be treated independently. Therefore any change to a Record in the context of a collaboration should be reflected only in that context and not affect any others.
# Out of scope
For this ADR we are looking only at how we separate data in Storage service between the System of Record (what exists today in OSDU) and System of engagement (collaborations).
We are **not** deciding on
- How DDMS will separate the data
- How Consumption services like search separate the data
- How data will transfer between the system of Record and system of engagement in Storage
- How collaborations will act on this or control this behavior or even what a collaboration entity looks like
- Any other service that might need to act on a collaboration context e.g. ingestion
# Solution
The suggestion is to create a different instance of the Storage metadata specific to the collaboration context. It is stored using a compound key of the record id + the collaboration id.
This collaboration id forms the namespace for a record, and combining the 2 means we have a unique metadata instance per collaboration.
Therefore if a Record is not assigned to a collaboration the namespace is the same as it is today (empty) and the id remains unchanged. This maintains current system behavior for existing data in the system of record.
>Note: The Record ID is never changed between namespaces and should be persisted and returned to the user the same as it is today no matter the context provided. The id of the document/row used in the database should **append** the namespace value so that multiple metadata instances can coexist for the same Record ID. This means the data model of the metadata needs to have a separate record id and row/document id value.
References to the data are held in each metadata instance, which allows the same data to be referenced by multiple namespaces while also letting unique versions of a Record ID exist in individual namespaces. A reference is also quick and cheap to add to or remove from a namespace.
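
The compound key described in the note above can be sketched as follows. This is only an illustration under stated assumptions: the class name `RecordMetadataKey` and the `|` separator are hypothetical, since the ADR only says that the namespace value is appended to form the row/document id.

```java
/** Hypothetical metadata key; the names and the separator are illustrative only. */
final class RecordMetadataKey {
    private final String recordId;        // returned to users unchanged in every context
    private final String collaborationId; // the namespace; empty means system of record

    RecordMetadataKey(String recordId, String collaborationId) {
        this.recordId = recordId;
        this.collaborationId = collaborationId == null ? "" : collaborationId;
    }

    /** The Record ID exposed through the API; it never includes the namespace. */
    String recordId() {
        return recordId;
    }

    /** Row/document id: the record id with the namespace appended when one is present. */
    String documentId() {
        // Assumption: '|' as separator. The ADR does not prescribe one.
        return collaborationId.isEmpty() ? recordId : recordId + "|" + collaborationId;
    }
}
```

With an empty namespace the document id equals the record id, which matches the requirement that existing system-of-record data keeps its current ids.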
Diagram 2
![image](/uploads/6df9c0249d22cf3cbdd34e3d9b1f096f/image.png)
>Note that multiple collaborations could be active at the same time and the 'data' versions do not have to be linear between them. For example, changes from different collaborations could overlap one another. This is because the version is already defined as an epoch timestamp and so is ordered by when it was created.
Diagram 3
![image](/uploads/d69b9d0fd9ffdfe6af3913c35bdc7b84/image.png)
### Behavior of retrieval APIs
If we take diagram 3 as the current state of a Record we can look at how different API requests to it should be handled with and without a collaboration context.
#### Getting latest in collaboration 1
```
curl -X 'GET' \
'<osdu>/api/storage/v2/records/<id>' \
--header 'x-collaboration: id=collaboration 1,application=<app-name>;'
```
Expected Result: V7 returned
#### Retrieving version 4 when no collaboration provided
```
curl -X 'GET' \
'<osdu>/api/storage/v2/records/<id>/versions/<version4>'
```
Expected Result: Error, version 4 does not exist
#### Retrieving version 4 when collaboration 2 provided
```
curl -X 'GET' \
'<osdu>/api/storage/v2/records/<id>/versions/<version4>' \
--header 'x-collaboration: id=collaboration 2,application=<app-name>;'
```
Expected Result: Error, version 4 does not exist
## Collaboration context header
The **x-collaboration** header is an optional HTTP header that holds directives instructing the Storage service to handle the request in the context of the provided collaboration instance rather than in the context of the system of record. We are designing it using directives so that it is more extensible over time and can incorporate other elements potentially needed by the collaboration feature set.
**NB: In the fullness of time many services will be impacted by the collaboration EA requirements. They could/should re-use this same header to support acting on a specific collaboration context for consistency and usability.**
### Syntax
Collaboration directives follow the validation rules below:
- Directives are case-insensitive but lowercase is recommended
- Multiple directives are comma-separated
### Request Directives
| Directive | Description |
| ----------- | ----------- |
| id | Mandatory. The ID of the collaboration to handle the request against. |
| application | Mandatory. The name of the application sending the request. |
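
The syntax rules and mandatory directives above could be checked with a small parser along these lines. This is a sketch, not the core-common implementation: the class name `CollaborationHeaderParser` is hypothetical, and tolerating the trailing `;` is an assumption based on the examples in this ADR.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for the x-collaboration header value. */
final class CollaborationHeaderParser {
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        String trimmed = headerValue.trim();
        // Assumption: tolerate the trailing ';' that appears in the ADR's examples.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        // Multiple directives are comma-separated.
        for (String part : trimmed.split(",")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Malformed directive: " + part);
            }
            // Directive names are case-insensitive, so normalize them to lowercase.
            String name = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            directives.put(name, part.substring(eq + 1).trim());
        }
        // Both 'id' and 'application' are mandatory.
        for (String mandatory : new String[] {"id", "application"}) {
            String value = directives.get(mandatory);
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("Missing mandatory directive: " + mandatory);
            }
        }
        return directives;
    }
}
```

Unknown directives are kept in the returned map rather than rejected, which leaves room for the extensibility the header is designed for.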
### Examples
#### Retrieve a specific version of a Record that exists in a collaboration
```
curl -X 'GET' \
'<osdu>/api/storage/v2/records/<record-id>/versions/<version>' \
--header 'data-partition-id: opendes' \
--header 'authorization: Bearer <JWT>' \
--header 'Content-Type: application/json' \
--header 'x-collaboration: id=<collaboration-id>,application=<app-name>;'
```
#### Retrieve a specific version of a Record that exists in the system of record
We do not send a collaboration context here because the request targets data in the system of record. This is the same request users make today.
```
curl -X 'GET' \
'<osdu>/api/storage/v2/records/<record-id>/versions/<version>' \
--header 'data-partition-id: opendes' \
--header 'authorization: Bearer <JWT>' \
--header 'Content-Type: application/json'
```
Note that the given record id and version must exist in both the system of record and the collaboration for both API requests to return successfully.
### Record changed on namespace
To guarantee that the current system behavior is not changed, we will create a new record changed topic that is triggered only when a record is edited in some way in the context of a collaboration.
This means the existing record changed topic remains unchanged and is triggered only when changes are made in the system of record, as it is today.
Downstream listeners can then bind to the new Record changed on namespace topic over time, as and when they want to support the namespace concept.
The new message will also include the extra context information about the namespace. The message will be the same as the current record change message except that it will include the new header:
```
x-collaboration: id=<id>,application=<app-name>;
...
```
On top of this the new topic should be exposed through the Notification service so it can be registered to by external consumers as needed.
# Consequences
The storage service should support a new 'collaboration' header. Anytime a collaboration id is provided in this header the storage service should act only in that context. This means all storage APIs need to act within the given collaboration context, for creation, update, retrieval and deletion of records.
If no header is provided the Storage service should function the same as it does today and no change in behavior should be observed.
In the shared code section we will generate a new 'collaboration context' class that is passed into the CSP-specific data layer. This class will hold the collaboration id and application name. Each CSP should use this combined with the record id for the primary key of the metadata's data model. In this way the collaboration id forms the namespace of the record id, so multiple metadata instances can exist simultaneously.
We need a new 'Record changed collaboration' message and need to have it exposed through the Notification service.
The hard delete API needs to validate all contexts before deleting the blob, as multiple contexts could be referencing the same blob instance.
M15 - Release 0.18
ashley kelham

https://community.opengroup.org/osdu/platform/system/storage/-/issues/148
ADR: Separate modifyTime and modifyUser for every version of OSDU storage record
2023-07-05T09:49:05Z
Mandar Kulkarni

Separate modifyTime and modifyUser for every version of OSDU storage record
## Status
- [X] Proposed
- [ ] Trialing
- [ ] Under review
- [X] Approved
- [ ] Retired
## Context & Scope
The concept is that one record should have one version of metadata.
However, the modifyUser and modifyTime attributes should be different for each version.
The current implementation follows that concept, but the resulting behavior is wrong for these two attributes.
The original issue that was raised is [here](https://community.opengroup.org/osdu/platform/system/storage/-/issues/126).
With the current behavior, all versions of the same record share the same modifyUser and modifyTime values, and these values are overwritten on all versions every time the record is modified.
This means that a record with only one version looks like this:
|version1|
|:-------|
|createUser|
|createTime|
But when the record is modified and multiple versions are created, the metadata of the latest version is applied to all versions, including the first.
|version1|version2 |version3|
|:-------|:--------|:--------|
|createUser| createUser| createUser|
|createTime| createTime| createTime|
|modifyUser2|modifyUser2|modifyUser2|
|modifyTime2|modifyTime2|modifyTime2|
Because of this behavior, the record's modification history is lost, and it is not possible to track which user created which version of the record.
## Tradeoff Analysis
The metadata, which contains the modifyUser and modifyTime attributes, will be stored separately for every record version.
This means the amount of metadata stored for storage records will increase.
In return, the record modification history can be tracked and the user who created each version of the record can be traced, which was not possible before.
## Decision
Version 1 should only have createUser and createTime. modifyUser and modifyTime should not exist in the first version.
Version 2+ should have different modifyUser and modifyTime for each version.
|version1|version2 |version3|
|:-------|:--------|:--------|
|createUser| createUser| createUser|
|createTime| createTime| createTime|
| |modifyUser1|modifyUser2|
| |modifyTime1|modifyTime2|
If the record metadata (i.e. the tags, legal tags and ACL blocks of the record) is modified using the storage **PATCH** API, the version number does not change and only the **latest** values of modifyUser and modifyTime are kept for that record version.
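
The decision can be illustrated with a small in-memory model. It is a sketch only: `RecordHistory`, `VersionInfo`, `put` and `patch` are hypothetical names and do not reflect the Storage service's data model or API classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of the decided behavior; not the Storage data model. */
final class RecordHistory {
    static final class VersionInfo {
        final long version;
        String modifyUser; // null for version 1 unless that version is later patched
        Long modifyTime;   // null for version 1 unless that version is later patched

        VersionInfo(long version) {
            this.version = version;
        }
    }

    final String createUser;
    final long createTime;
    final List<VersionInfo> versions = new ArrayList<>();

    RecordHistory(String createUser, long createTime, long firstVersion) {
        this.createUser = createUser;
        this.createTime = createTime;
        versions.add(new VersionInfo(firstVersion)); // version 1: create fields only
    }

    /** PUT on an existing record: a new version with its own modifyUser/modifyTime. */
    void put(String user, long time, long newVersion) {
        VersionInfo v = new VersionInfo(newVersion);
        v.modifyUser = user;
        v.modifyTime = time;
        versions.add(v);
    }

    /** PATCH of metadata: no new version; only the latest version's modify fields change. */
    void patch(String user, long time) {
        VersionInfo latest = versions.get(versions.size() - 1);
        latest.modifyUser = user;
        latest.modifyTime = time;
    }
}
```

Earlier versions keep the modifyUser and modifyTime they were written with, which is what preserves the modification history.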
## Consequences
- Storage service behavior will change.
- Storage service documentation needs to be updated.

M17 - Release 0.20
Chad Leong

https://community.opengroup.org/osdu/platform/system/storage/-/issues/147
Current implementation doesn't delete all versions of the record with purging a record
2023-07-19T19:31:54Z
Alok Joshi

Repro steps:
- create a record with the PUT API
- create another version of the same record with the PUT API
- hard delete (purge) the record
Expected: All metadata and storage blobs should be purged
Actual: Metadata gets purged, but only the latest version gets purged. This leaves dangling references to the other versions in Blob Storage
**Note**: The bug was observed in the Azure implementation, but other providers should confirm the behavior and put in a fix if required.
M15 - Release 0.18
Alok Joshi

https://community.opengroup.org/osdu/platform/system/storage/-/issues/145
Storage not consuming OSDU record id regex in RecordAncestry
2023-05-30T08:56:49Z
Kelly Zhou

Storage does not apply the OSDU record id regex properly for parent records. The current method of extracting the parent record id and version number fails for ids such as dp-id:test:parent::1234: the new OSDU record id regex allows a colon in the preceding section, but Storage does not respect that rule yet.
The core common library needs a validator for RecordAncestry that applies the OSDU record id regex properly, and Storage needs to update how it extracts the parent record id and version number.
M15 - Release 0.18

https://community.opengroup.org/osdu/platform/system/storage/-/issues/144
BUG: Class cast exceptions into the CrsConversionService
2022-09-30T10:09:53Z
Yauheni Lesnikau

In some of our envs we observed `ClassCastException` in `CrsConversionService`.
Example:
```
"type": "java.lang.ClassCastException",
"message": "com.google.gson.JsonPrimitive cannot be cast to com.google.gson.JsonObject",
"parsedStack": [
{
"level": 0,
"method": "com.google.gson.JsonObject.getAsJsonObject",
"fileName": "JsonObject.java",
"line": 192
},
{
"level": 1,
"method": "org.opengroup.osdu.storage.conversion.CrsConversionService.getFeature",
"fileName": "CrsConversionService.java",
"line": 621
},
```
Yauheni Lesnikau

https://community.opengroup.org/osdu/platform/system/storage/-/issues/143
Storage sends an exceeding number of legal tags for validation
2022-11-10T16:53:40Z
An Ngo

Compliance Validate Legal tags API has a limit of 25.
When an ingestion is done, Storage sends the provided legal tags to Compliance to ensure they are valid before proceeding with the record creation.
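
Because the validation API accepts at most 25 legal tags per call, a request carrying more tags has to be split before validation. A minimal sketch of that split, assuming a hypothetical helper class `LegalTagBatcher` (the limit of 25 comes from this issue; nothing else here is taken from the Storage code):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits legal tags into batches the validation API accepts. */
final class LegalTagBatcher {
    static final int MAX_TAGS_PER_REQUEST = 25;

    /** Splits the tags into consecutive batches of at most MAX_TAGS_PER_REQUEST. */
    static List<List<String>> batches(List<String> legalTags) {
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < legalTags.size(); start += MAX_TAGS_PER_REQUEST) {
            int end = Math.min(start + MAX_TAGS_PER_REQUEST, legalTags.size());
            // Copy the slice so each batch is independent of the input list.
            result.add(new ArrayList<>(legalTags.subList(start, end)));
        }
        return result;
    }
}
```

Each batch would then be sent to the Compliance service in its own validation call, and ingestion would proceed only if every batch is valid.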
If more than 25 legal tags are sent in the ingestion/creation request, Storage needs to split them into chunks of 25. However, it does not perform this check and sends all of the legal tags included in the request in a single call.
M14 - Release 0.17

https://community.opengroup.org/osdu/platform/system/storage/-/issues/142
[BUG] Incorrect Operation Type Published When Updating Record Kind When OPA is Enabled
2022-11-07T14:06:49Z
Marc Burnie [AWS]

When OPA is enabled for the Storage service and a record's kind field is updated, the record's previous kind is still observable in the Search service.
For example, creating the following record using PUT {{base_url}}/api/storage/v2/records:
```JSON
[
{
"id":"{{data_partition_id}}:dataset--File.Generic:1000",
"kind": "osdu:wks:dataset--File.Generic:1.0.0",
"data": {
"Endian": "BIG",
"Name": "dummy",
"DatasetProperties.FileSourceInfo.FileSource": "",
"DatasetProperties.FileSourceInfo.PreloadFilePath": ""
},
"namespace": "osdu:wks",
"legal": {
"legaltags": [
"{{data_partition_id}}-public-usa-dataset-1"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
},
"acl": {
"viewers": [
"data.default.viewers@{{data_partition_id}}.{{domain}}"
],
"owners": [
"data.default.owners@{{data_partition_id}}.{{domain}}"
]
},
"type": "dataset--File.Generic",
"version": 1620833190423950
}
]
```
And updating the kind to be:
```JSON
[
{
"id":"{{data_partition_id}}:dataset--File.Generic:1000",
"kind": "osdu:wks:dataset--File.Generic:1.0.1",
...
}
]
```
It is expected that Search service would return the following result when making the request to POST {{base_url}}/api/search/v2/query:
Body:
```JSON
{
"kind": "osdu:wks:dataset--File.Generic:1.0.0"
}
```
Expected Result:
```JSON
{
"results": [],
"aggregations": [],
"totalCount": 0
}
```
However, the un-updated result is returned:
```JSON
{
"results": [
{
"kind": "osdu:wks:dataset--File.Generic:1.0.0",
"source": "wks",
"acl": {
"viewers": [
"data.default.viewers@osdu.example.com"
],
"owners": [
"data.default.owners@osdu.example.com"
]
},
"type": "dataset--File.Generic",
"version": 1663008428769106,
"tags": null,
"modifyUser": "admin@testing.com",
"modifyTime": "2022-09-12T18:47:08.806Z",
"createTime": "2022-09-09T19:12:46.378Z",
"authority": "osdu",
"namespace": "osdu:wks",
"legal": {
"legaltags": [
"osdu-public-usa-dataset-1"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
},
"createUser": "admin@testing.com",
"id": "osdu:dataset--File.Generic:1000"
}
],
"aggregations": null,
"totalCount": 1
}
```
A record query on Storage service returns the expected response, as does Search service when searching for the updated kind.
The issue appears to be caused by Storage service publishing an incorrect message, which results in Indexer service only creating the new index for the updated record and not removing the record from the previous kind. Indexer knows to remove a record from an index during an update operation when there is a previousVersionKind field that is not null or empty. Storage publishes a message with this field when the updated record's kind differs from the existing record's kind. However, when the record is validated using OPA, the existing record's kind is overwritten to match the updated record, so when Storage's IngestService compares the kinds the result is always a match and the previousVersionKind field is never populated.
M14 - Release 0.17
Marc Burnie [AWS]
Okoun-Ola Fabien Houeto
Gustavo Urdaneta

https://community.opengroup.org/osdu/platform/system/storage/-/issues/141
[BUG] Test failure when OPA is enabled due to legal response caching
2022-11-07T14:07:03Z
Marc Burnie [AWS]

The default legal rego policy specifies a caching period of 900 seconds. When OPA is enabled, it is possible to reuse deleted/invalidated legal tags during this period. The grace period specified in the PubSubEndpointTest is 10 seconds, which causes the test to fail.
Related MR: https://community.opengroup.org/osdu/platform/system/storage/-/merge_requests/437/diffs
M14 - Release 0.17
Marc Burnie [AWS]

https://community.opengroup.org/osdu/platform/system/storage/-/issues/140
[BUG] Storage error message is different if OPA is enabled
2022-11-07T14:06:29Z
Marc Burnie [AWS]

When the requesting user does not belong to the viewer group, Storage QueryServiceImpl throws a forbidden response whose message depends on whether OPA is enabled. With OPA enabled, the message is "The user does not have access to the record". With OPA disabled, the message is "The user is not authorized to perform this action". The latter message is expected by the integration test.
Related to issue: https://community.opengroup.org/osdu/platform/system/storage/-/issues/133
M14 - Release 0.17
Marc Burnie [AWS], Marc Burnie [AWS]
https://community.opengroup.org/osdu/platform/system/storage/-/issues/138
Soft Delete APIs should enforce data owner access check
2023-05-30T08:58:09Z, Kelly Zhou
The following endpoints currently check only for data viewer access:
- POST **/api/storage/v2/records/{id}:delete** (soft delete API)
- POST **/api/storage/v2/records/delete** (bulk delete API)
When a user asks to soft delete a record, the Storage service should enforce the same level of data access check as the Purge API (DELETE /api/storage/v2/records/{id}), where only a data owner can purge the record.
When the data access check is updated, the related integration tests also need to be updated to reflect the change.
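The proposed rule can be sketched as follows. This is only an illustration under stated assumptions: the record ACL has the `owners`/`viewers` shape shown in the record examples in this feed, the caller's group memberships are already known, and `is_data_owner`, `is_data_viewer`, and `can_soft_delete` are hypothetical helper names rather than Storage service functions.

```python
from typing import Iterable, Mapping, Sequence

Acl = Mapping[str, Sequence[str]]


def is_data_owner(acl: Acl, user_groups: Iterable[str]) -> bool:
    """True when the user belongs to at least one group in acl['owners']."""
    return not set(acl.get("owners", ())).isdisjoint(user_groups)


def is_data_viewer(acl: Acl, user_groups: Iterable[str]) -> bool:
    """True when the user belongs to at least one group in acl['viewers']."""
    return not set(acl.get("viewers", ())).isdisjoint(user_groups)


def can_soft_delete(acl: Acl, user_groups: Iterable[str]) -> bool:
    # Proposed behaviour: apply the same owner-only check as the Purge API,
    # instead of accepting viewer access.
    return is_data_owner(acl, user_groups)


if __name__ == "__main__":
    acl = {
        "owners": ["data.default.owners@opendes.contoso.com"],
        "viewers": ["data.default.viewers@opendes.contoso.com"],
    }
    viewer_only = ["data.default.viewers@opendes.contoso.com"]
    owner = ["data.default.owners@opendes.contoso.com"]
    print(is_data_viewer(acl, viewer_only), can_soft_delete(acl, viewer_only))
    print(can_soft_delete(acl, owner))
```

An integration test for the updated check would then expect a viewer-only user to be rejected on both delete endpoints.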
As Storage starts to integrate with Policy/OPA, the corresponding data authz policies need to be updated to reflect the changes as well.
M14 - Release 0.17
https://community.opengroup.org/osdu/platform/system/storage/-/issues/136
Schema Validation Failed - Storage Service
2022-11-21T11:11:21Z, Samiullah Ghousudeen
Data ingestion through `Storage PUT service` does not validate schema, kind & attributes.
As the request below shows, it is possible to ingest the attribute/value `"TestAttribute": "Test-Sami"`, which is not defined in the Contract type reference data WKS schema.
<details><summary> Storage PUT Request </summary>
<pre><code>
curl --location --request PUT 'https://osdu-ship.msft-osdu-test.org/api/storage/v2/records' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: opendes' \
--header 'Authorization: Bearer eyJ0eXAiOiJKV1Qi ' \
--data-raw '[
{
"id": "opendes:reference-data--ContractorType:test-sami01",
"kind": "osdu:wks:reference-data--ContractorType:1.0.0",
"acl": {
"owners": [
"data.default.owners@opendes.contoso.com"
],
"viewers": [
"data.default.viewers@opendes.contoso.com"
]
},
"legal": {
"legaltags": [
"opendes-public-usa-dataset-7643990"
],
"otherRelevantDataCountries": [
"US"
]
},
"data": {
"Name2": "Well",
"ID2": "Well",
"Code2": "Well",
"Source2": "Workbook Published/FacilityTypeType.1.0.0.xlsx; commit SHA 0b4db59a.",
"TestAttribute" : "Test-Sami"
}
}
]'
</code></pre>
</details>
<details><summary> Storage GET Response </summary>
<pre><code>
{
"data": {
"Name2": "Well",
"ID2": "Well",
"Code2": "Well",
"Source2": "Workbook Published/FacilityTypeType.1.0.0.xlsx; commit SHA 0b4db59a.",
"TestAttribute": "Test-Sami"
},
"meta": null,
"id": "opendes:reference-data--ContractorType:test-sami01",
"version": 1658769507968280,
"kind": "osdu:wks:reference-data--ContractorType:1.0.0",
"acl": {
"viewers": [
"data.default.viewers@opendes.contoso.com"
],
"owners": [
"data.default.owners@opendes.contoso.com"
]
},
"legal": {
"legaltags": [
"opendes-public-usa-dataset-7643990"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
},
"createUser": "preshipping@azureglobal1.onmicrosoft.com",
"createTime": "2022-07-05T17:06:37.282Z",
"modifyUser": "preshipping@azureglobal1.onmicrosoft.com",
"modifyTime": "2022-07-25T17:18:28.992Z"
}
</code></pre>
</details>
It is also possible to ingest and fetch data through the Storage service without first creating the schema `osdu:wks:reference-data--ContractorTypeTestSami:1.0.0` in the OSDU system, as shown below:
<details><summary> Storage PUT Request </summary>
<pre><code>
curl --location --request PUT 'https://osdu-ship.msft-osdu-test.org/api/storage/v2/records' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: opendes' \
--header 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSU ' \
--data-raw '[
{
"id": "opendes:reference-data--ContractorTypeTestSami:test-sami01",
"kind": "osdu:wks:reference-data--ContractorTypeTestSami:1.0.0",
"acl": {
"owners": [
"data.default.owners@opendes.contoso.com"
],
"viewers": [
"data.default.viewers@opendes.contoso.com"
]
},
"legal": {
"legaltags": [
"opendes-public-usa-dataset-7643990"
],
"otherRelevantDataCountries": [
"US"
]
},
"data": {
"Name2": "Well",
"ID2": "Well",
"Code2": "Well",
"Source2": "Workbook Published/FacilityTypeType.1.0.0.xlsx; commit SHA 0b4db59a.",
"TestAttribute" : "Test-Sami"
}
}
]'
</code></pre>
</details>
<details><summary> Storage GET Response </summary>
<pre><code>
{
"data": {
"Name2": "Well",
"ID2": "Well",
"Code2": "Well",
"Source2": "Workbook Published/FacilityTypeType.1.0.0.xlsx; commit SHA 0b4db59a.",
"TestAttribute": "Test-Sami"
},
"meta": null,
"id": "opendes:reference-data--ContractorTypeTestSami:test-sami01",
"version": 1658770548926014,
"kind": "osdu:wks:reference-data--ContractorTypeTestSami:1.0.0",
"acl": {
"viewers": [
"data.default.viewers@opendes.contoso.com"
],
"owners": [
"data.default.owners@opendes.contoso.com"
]
},
"legal": {
"legaltags": [
"opendes-public-usa-dataset-7643990"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
},
"createUser": "preshipping@azureglobal1.onmicrosoft.com",
"createTime": "2022-07-25T17:35:49.251Z"
}
</code></pre>
</details>
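The two checks the report expects (the kind has a registered schema, and the payload carries no attributes that the schema does not declare) can be sketched as a client-side pre-check. This is a hedged illustration, not how the Storage or Schema service behaves: `precheck_record`, `find_undeclared_attributes`, and the `schemas` mapping are hypothetical, and the property names listed for the ContractorType kind are placeholders rather than the actual WKS schema. Whether undeclared attributes should be rejected at all also depends on the schema, since JSON Schema permits additional properties unless the schema forbids them.

```python
from typing import Any, Iterable, List, Mapping


def find_undeclared_attributes(
    data: Mapping[str, Any], declared: Iterable[str]
) -> List[str]:
    """Return the attribute names in data that the schema does not declare."""
    return sorted(set(data) - set(declared))


def precheck_record(
    record: Mapping[str, Any], schemas: Mapping[str, Iterable[str]]
) -> List[str]:
    """Collect validation errors for one record; an empty list means it passed."""
    kind = record["kind"]
    if kind not in schemas:
        return [f"no schema registered for kind {kind}"]
    extra = find_undeclared_attributes(record.get("data", {}), schemas[kind])
    return [f"attribute {name} is not declared in {kind}" for name in extra]


if __name__ == "__main__":
    # Placeholder property names, assumed for illustration only.
    schemas = {
        "osdu:wks:reference-data--ContractorType:1.0.0": {"Name", "ID", "Code", "Source"},
    }
    record = {
        "kind": "osdu:wks:reference-data--ContractorType:1.0.0",
        "data": {"Name": "Well", "TestAttribute": "Test-Sami"},
    }
    for error in precheck_record(record, schemas):
        print(error)
```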
cc- @chad @debasisc
https://community.opengroup.org/osdu/platform/system/storage/-/issues/135
Storage release/0.15 build Failure
2022-08-15T16:12:48Z, Shrikant Garg
The Storage release/0.15 build started failing because it refers to core-common 0.15.0-SNAPSHOT, which has been cleaned up. Ideally, SNAPSHOT versions should not be referenced.
Upgrading it to the latest version is therefore recommended.
M12 - Release 0.15
Shrikant Garg, Shrikant Garg
https://community.opengroup.org/osdu/platform/system/storage/-/issues/134
Storage and PUT - Any way to work around the limit of 500 records?
2022-12-09T13:35:40Z, Debasis Chatterjee
I was trying to persist standard reference values for an entity such as UnitOfMeasure and hit this limit.
```
{
"code": 400,
"reason": "Validation error.",
"message": "createOrUpdateRecords.records: Up to 500 records can be ingested at a time"
}
```
Is there a workaround for this limit (apart from splitting the original JSON load manifest into smaller chunks)?
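If splitting turns out to be the only option, it can at least be automated on the client side. The sketch below assumes only the limit of 500 records per request stated in the error message above; `chunk_records` and `MAX_RECORDS_PER_REQUEST` are hypothetical names, and sending each batch to the Storage PUT endpoint is left to the caller.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_RECORDS_PER_REQUEST = 500  # limit reported in the validation error


def chunk_records(
    records: Sequence[T], size: int = MAX_RECORDS_PER_REQUEST
) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` records, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


if __name__ == "__main__":
    # Illustrative payload; real records would come from the load manifest.
    records = [{"id": f"opendes:reference-data--UnitOfMeasure:{i}"} for i in range(1201)]
    for batch in chunk_records(records):
        # Each batch would be sent as one PUT /api/storage/v2/records request.
        print(len(batch))
```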
cc - @krveduru for information
https://community.opengroup.org/osdu/platform/system/storage/-/issues/133
[BUG] Create error messages based on response from OPA
2022-09-07T21:28:20Z, Rostislav Vatolin vatolinrp@gmail.com
Storage has failing tests due to incorrect logic in its integration with OPA. Storage should not be responsible for creating a custom error message when OPA returns an error; please use the error message returned by OPA. Please make sure all integration tests pass when OPA is turned on.
Related MR: https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/merge_requests/122
Rostislav Vatolin vatolinrp@gmail.com, Rostislav Vatolin vatolinrp@gmail.com
https://community.opengroup.org/osdu/platform/system/storage/-/issues/132
CORS blocking query/records:bacth endpoint
2022-11-11T13:38:55Z, Yifan Ye
CORS does not allow the frame-of-reference header, and the query/records:batch endpoint is blocked by CORS.
M14 - Release 0.17
Yifan Ye, Yifan Ye