OSDU Software issues -- https://community.opengroup.org/groups/osdu/-/issues

---

## ADR - Update Info API and Git tags to support core and CSP specific implementation version
https://community.opengroup.org/osdu/platform/system/home/-/issues/100
_Author: Om Prakash Gupta · Updated: 2023-12-18_

## Decision Title
Update Info API to support core and CSP specific implementation version
## Summary
Directly supporting independent provider releases of services will reduce the amount of effort spent re-deploying
logic to provider implementations that are unaffected by the fixes / patches.
This requires changing procedures around build configuration, release management, and reporting via the `/info` endpoint.
## Status
_Note that this ADR was originally proposed as #98, but was re-written to include details that emerged during the discussion and
approval process._
- [x] Proposed
- [x] Approved
- [x] Implementing (incl. documenting)
- [ ] Testing
- [ ] Released
## Context
Most of the services in the OSDU Data Platform use a Service Provider Interface abstraction.
This abstraction allows for the bulk of the service logic and implementation of the external API to be done in a common core
library, yet allow for individual service providers to implement the backend logic differently.
To this point, these various libraries (the core and each of the provider implementations) have been kept version synchronized --
exactly identical versions.
During the release process, the Git tag has reflected that common version to enable easy lookup of the code used.
And, for runtime inspection, a special endpoint (`/info`) reports back the version, commit SHA, and several other pieces of information.
## Problem statement
Sometimes new patches -- typically bug fixes / security upgrades -- apply only to one particular provider implementation.
When coupled with the common synchronized version, this causes the core library version (as well as all other provider
implementations) to change versions to match -- even though no changes occurred in these libraries.
Moreover, when the tags are made for the patch, other providers appear to be out of date.
This puts those providers in the position to choose between redeploying code that is functionally equivalent or leaving existing
versions and trying to document that the upgrade had no effect on their logic.
## Proposed solution
### Scope Limitations
**Patches Only**
This proposal limits provider-specific releases to patches only -- no support is provided for a minor (feature additive) or major
(breaking) change.
This limitation reduces the complexity of the release strategy immensely, and covers most use cases well.
**Maven Focused**
The proposal was designed with core services in mind, which are predominantly built with Java and a Maven build system.
The strategies apply well in these cases, but may need to be extended to other build systems in different ways.
To maintain simplicity, this ADR will only refer to projects that use Maven, and have an SPI abstraction layer.
**Services Only**
This proposal does not apply to libraries or utilities.
### Side Effect
**No Longer Proper Semantic Version Style**
Tags will now have extra information that doesn't exactly match the semantic versioning style.
In cases where tags have 4 numeric parts, though, they can be directly compared left to right as you would expect.
For example:
* `1.10.2-ABC.4` < `1.10.2-ABC.5`
* `1.10.2-ABC.4` < `1.10.3`
* `1.10.2-ABC.4` < `1.11.0`
* `1.10.2-ABC.4` < `1.11.0-ABC.2`
The addition of the text element identifying the provider implementation is the biggest variation from a standard semantic version.
Also, the provider patch number will always be greater than the core patch number (there cannot be a `1.10.3-ABC.2`, for instance).
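The ordering rules above can be sketched by converting each tag into a four-part numeric tuple. This is a hypothetical helper for illustration, not part of any OSDU tooling; a plain core tag is treated as having a provider patch equal to its core patch, which matches the constraint that the provider patch number is never below the core patch number.

```python
import re

def sort_key(tag):
    """Convert a tag like '1.10.2-ABC.4' or '1.10.3' into a comparable tuple."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-([A-Za-z]+)\.(\d+))?", tag)
    if not m:
        raise ValueError(f"unrecognized tag: {tag}")
    major, minor, core_patch = int(m.group(1)), int(m.group(2)), int(m.group(3))
    # A plain 'X.Y.N' tag is treated as provider patch N as well.
    provider_patch = int(m.group(5)) if m.group(5) else core_patch
    return (major, minor, core_patch, provider_patch)

# The example orderings from the bullet list above hold under this key:
assert sort_key("1.10.2-ABC.4") < sort_key("1.10.2-ABC.5")
assert sort_key("1.10.2-ABC.4") < sort_key("1.10.3")
assert sort_key("1.10.2-ABC.4") < sort_key("1.11.0")
assert sort_key("1.10.2-ABC.4") < sort_key("1.11.0-ABC.2")
```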
The `/info` endpoint will consist of two separate version numbers, each of which is semantic style in itself.
However, the two versions are related and constrained by each other.
If the core version is `X.Y.N` and the provider is `X.Y.M`, then you know the service is for the service group version `X.Y`, there
have been `N` patches made to the core code, and `M - N` patches made to the provider logic.
Converting between the core + provider version pair and the Git tag will not be possible in general, but will be computable in
specific circumstances (such as the matching pair case -- `[X.Y.N, X.Y.N]`), and can also be converted using a list of all tags or
knowledge of the order in which patches were applied.
As an example, the pair `[X.Y.1, X.Y.2]` is ambiguous.
If the core patch came first, then the final tag would be `vX.Y.1-ABC.2`.
But, if the provider patch came first, then the final tag would be `vX.Y.1`.
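The ambiguity can be made concrete with a small simulation. This is a sketch under two stated assumptions: the branch starts at `X.Y.0`, and the final tag is named after the kind of the last patch merged before tagging (core patch last gives a core tag, provider patch last gives a provider tag).

```python
def replay(events, provider="ABC"):
    """Replay an ordered list of patch events ('core' or a provider name)
    and return (core_patch, provider_patch, final_tag)."""
    core = prov = 0
    last = None
    for ev in events:
        if ev == "core":
            core += 1
            prov += 1      # a core patch bumps every provider too
        else:
            prov += 1      # a provider patch bumps only that provider
        last = ev
    if last == "core" or prov == core:
        tag = f"vX.Y.{core}"
    else:
        tag = f"vX.Y.{core}-{provider}.{prov}"
    return core, prov, tag

# Both orders end at the pair [X.Y.1, X.Y.2], but the tags differ:
assert replay(["core", "ABC"]) == (1, 2, "vX.Y.1-ABC.2")
assert replay(["ABC", "core"]) == (1, 2, "vX.Y.1")
```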
### Build Configuration
**Summary**
* Provider-based POMs get a separate provider version, and the rest use a core version.
* Major + Minor will match between them
* Changes to a provider increments only the provider version
* Changes to core code increments ALL versions (core + all providers)
**Details**
The services are designed to have two distinct POM "trees", i.e. parent POMs plus a set of child modules.
One parent POM is stored in the repository root (for the main code), and one is in the `testing/` folder (for integration tests).
Each tree has a set of common code, which are organized as a code library and stored in `service-code` and
`testing/service-test-core` (where `service` is replaced with the specific service name).
Then, specific providers that have implementations show up in `providers/service-abc` and `testing/service-test-abc` (where, again,
`service` is replaced with the service name and `abc` is replaced with the provider name).
Both of the parent POMs and both of the core libraries should be synchronized to have identical version numbers, matching the
overall version of the service.
Then, for each provider implementation, the provider implementation and the provider testing library will be synchronized among
themselves.
This leads to a layout something like this:
```
+ pom.xml (CORE)
+ service-core/pom.xml (CORE)
+ providers
+ service-abc/pom.xml (ABC)
+ service-xyz/pom.xml (XYZ)
+ testing
+ pom.xml (CORE)
+ service-test-core/pom.xml (CORE)
+ service-test-abc/pom.xml (ABC)
+ service-test-xyz/pom.xml (XYZ)
```
Each file here is marked with the artifact version that it would use (by "artifact version", I mean `project.version` in the POM file).
Notice that the parent poms and the core libraries are all synchronized to use the same, base version of the service.
Each provider gets their own version, but will be synchronized between their main library and testing library.
Since only patches are permitted for provider specific versions, the first two version components will always match.
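On a release branch, this split might look like the following POM fragment. It is a hypothetical sketch only: the `groupId`, artifact names, and version values are invented for illustration and are not taken from any actual service repository.

```xml
<!-- providers/service-abc/pom.xml (hypothetical fragment) -->
<project>
  <artifactId>service-abc</artifactId>
  <!-- provider-specific version: patches independently of core -->
  <version>1.10.2-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <!-- the self-built core library keeps the core version -->
      <groupId>org.opengroup.osdu.service</groupId>
      <artifactId>service-core</artifactId>
      <version>1.10.1-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>
```

On the default branch both versions would be identical; they only drift apart on release branches, as described below.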
**Versioning Responsibility**
The Release Coordinators will manage all version changes as part of the release process.
MRs should not modify the artifact versions of the various POM files, nor the dependency versions on the self-built core libraries
-- that is, the core library for the same service.
They can, however, modify first-party or third-party library versions that come from outside sources, including os-core-common or a
provider's core common library.
`SNAPSHOT` library versions will still be used, so all artifact versions in all POM files will have a version ending in `-SNAPSHOT`.
On the default branch (`master` / `main`), both the core version and all provider versions will be set to the same value, since
all development here is occurring within the milestone.
The versions won't start to differentiate until a release is made -- and then will only differ on the release branch and tags.
**Every provider must use the latest core**
When a patch is made to the core library, it will be automatically applied to all providers.
We will not have any mechanism to skip a core patch for a particular provider or have a change made in the core library only apply
to a single provider.
Any change to the `service-core` code bumps everybody's version, and everybody redeploys.
This is not expected to be an onerous requirement -- patches are by their nature unlikely to cause compatibility issues.
If a particular patch would cause a lot of effort for a provider, that's a good indication that it isn't really a patch, but instead
a new feature that should wait for the next full milestone.
### Info Endpoint
**Summary**
* Each service's `/info` endpoint will need to report two different values -- the provider version and the core version
**Details**
All the core services have a special endpoint that returns information about the service, including the artifact version, the Git
commit ID, and more.
This endpoint will need to be modified to report two different fields -- `coreVersion` and `providerVersion` instead of just
`version`.
For example, you may see a return such as this:
```json
{
"groupId": "org.opengroup.osdu.indexer",
"artifactId": "indexer-azure",
"coreVersion": "1.10.0",
"providerVersion": "1.10.2",
"buildTime": "2023-01-17T12:30:00.500Z",
...
}
```
This work can be done independently from the release script / tagging changes.
Until complete, the reported version will be the core version only, which is acceptable during the transition.
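Given the constraint described earlier (core `X.Y.N`, provider `X.Y.M`, with `M - N` provider-only patches), a client could deduce patch counts from the `/info` payload. A minimal sketch, not an official client:

```python
def patch_counts(info):
    """Given an /info payload with coreVersion 'X.Y.N' and providerVersion
    'X.Y.M', return (core_patches, provider_only_patches)."""
    cx, cy, n = (int(p) for p in info["coreVersion"].split("."))
    px, py, m = (int(p) for p in info["providerVersion"].split("."))
    # Major and minor must always match between the two versions.
    assert (cx, cy) == (px, py), "major.minor must match between core and provider"
    return n, m - n

# Using the example payload above: 0 core patches, 2 provider-only patches.
assert patch_counts({"coreVersion": "1.10.0", "providerVersion": "1.10.2"}) == (0, 2)
```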
### MR Procedure Changes
**Summary**
* Labels for providers (such as ~AWS, ~Azure, ~GCP, ~IBM) and for core code ( ~"Common Code") must be used in every MR
**Details**
These labels are already in widespread use to help teams determine proper reviewers for code.
This proposal extends this to be used automatically by release scripts to determine whether a particular MR cherry-pick should
increment the core version or a particular provider version.
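The label-to-bump decision could be automated along these lines. The label names come from the list above, but the function itself is a hypothetical sketch, not the actual release script:

```python
PROVIDERS = ("AWS", "Azure", "GCP", "IBM")

def versions_to_bump(mr_labels, providers=PROVIDERS):
    """Decide which patch versions a cherry-picked MR should increment.
    A 'Common Code' change bumps core and every provider; otherwise only
    the labeled providers are bumped."""
    if "Common Code" in mr_labels:
        return {"core", *providers}
    return set(mr_labels) & set(providers)

assert versions_to_bump({"Azure"}) == {"Azure"}
assert versions_to_bump({"Common Code"}) == {"core", "AWS", "Azure", "GCP", "IBM"}
```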
### Release Process Changes
**Summary**
* Changes to provider code increments the provider patch number
* Changes to core code increments all patch numbers
* Tags with only provider changes have the core version and the provider name / patch number -- `vX.Y.N-ABC.M`
* Tags with core changes have only the core version -- `vX.Y.N`
**Details**
The default branches will have all versions (core and provider) set to be the version being built for the upcoming milestone,
similar to how it is done currently.
Release branches will begin with all versions the same, but can evolve over time based on the kinds of patches that come in.
Here's an example flow that shows a couple of different kinds of patches.
_For purposes of example, this assumes that the release is being prepared for version `1.10` of the Core Service Group._
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T1 | Create Release Branch | Branch, release/1.10 | 1.10.0-SNAPSHOT | 1.10.0-SNAPSHOT | 1.10.0-SNAPSHOT |
| T2 | Create Tag | Tag, v1.10.0 | 1.10.0 | 1.10.0 | 1.10.0 |
| T3 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
**T1 -- Create Release Branch**:
This is the very first step, where the `release/1.10` branch is created for the first time.
The core and provider versions will match, and will be inherited from the default branch.
Immediately after this step, the default branch would be prepared for the next release, setting versions to `1.11.0-SNAPSHOT`.
**T2 -- Create Tag**:
At this point, the release branch has been tested, and any lingering work has merged in.
Now, we create a "release commit" -- a commit based on the release branch that alters the artifact versions, removing the
`SNAPSHOT` part.
That commit is tagged, and since it is the initial release, it gets the simple `v1.10.0` tag.
**T3 -- Prepare for Next Patch**:
Once the tag is made, the release branch immediately prepares for the next patch.
Versions of core and provider POMs are incremented to the next patch number (but kept as a `SNAPSHOT`) -- we are "building towards"
version `1.10.1`, therefore it is `1.10.1-SNAPSHOT`.
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T3 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
| T4 | Cherry-pick Provider ABC Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
| T5 | Create Tag | Tag, v1.10.0-ABC.1 | 1.10.0 | 1.10.1 | 1.10.0 |
| T6 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
**T4 -- Cherry-pick Provider ABC Patch**:
In this step, we imagine a cherry-pick MR that patches only provider "ABC".
We approve and merge that MR into the release branch, but version numbers don't immediately change as part of that operation.
**T5 -- Create Tag**:
Once we're happy with the integration test results for the branch, we repeat the step to create the tag.
A new commit is made, dropping all the `SNAPSHOT` suffixes from the versions.
But crucially, if the particular library has not had any code changes then the version is set to match the previous release tag.
In this case, we see that "Core" and "XYZ" are set to `1.10.0`, even though the release branch the tag was created from was
`1.10.1-SNAPSHOT`.
This is a counterintuitive consequence of using a single release branch to work on several different libraries at the same time.
Then, the tag is created using the core version first, then the provider name, then the patch number of the provider version.
We don't need the full provider version, because we know that the major and minor numbers will match.
In this case, `v1.10.0-ABC.1` because it is an "ABC" patch, and ABC's version is `1.10.1`.
**T6 -- Prepare for Next Patch**:
Again, we prepare the release branch, incrementing the version for provider ABC to `1.10.2-SNAPSHOT`.
Other versions can remain at `1.10.1-SNAPSHOT` -- they are still "building towards" their first patch.
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T6 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
| T7 | Cherry-pick Core Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
| T8 | Create Tag | Tag, v1.10.1 | 1.10.1 | 1.10.2 | 1.10.1 |
| T9 | Prepare for Next Patch | Branch, release/1.10 | 1.10.2-SNAPSHOT | 1.10.3-SNAPSHOT | 1.10.2-SNAPSHOT |
**T7 -- Cherry-pick Core Patch**:
In this step, we imagine a cherry-pick MR that patches the core code.
We similarly approve and merge that MR into the release branch, and version numbers remain as they are.
**T8 -- Create Tag**:
Now that we're happy with the code, we create a tag commit.
In this commit, every version is incremented from the last release (effectively just removing the `SNAPSHOT` part).
With this commit, we tag it with the core version only -- in this case `v1.10.1`.
The ABC provider library is at version `1.10.2` here -- it's been patched twice, once by itself and once as a consequence of
patching the common code.
But, that information doesn't show up directly in the tag spelling, instead it will be in the `/info` endpoint, the code, and can
also be deduced from the list of previous tags.
We do it this way to avoid needing to make multiple tags per release, explicitly naming each pairwise combination.
It is hoped that provider patches are rare enough that it is easier / better to use a simple single tag for all providers.
**T9 -- Prepare for Next Patch**:
One more time, we prepare the release branch, incrementing the version for all components to their next `SNAPSHOT` version.
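The T1-T9 bookkeeping above can be replayed as a small simulation. This is a sketch of the version arithmetic only, not the actual release scripts; it assumes each tag event is named after the kind of patch being released, as in the tables above.

```python
def run_release(events, provider_names=("ABC", "XYZ")):
    """Replay release-branch events and collect the tags that get cut.
    Each event is ('patch', 'core'), ('patch', '<provider>'), or
    ('tag', 'core' | '<provider>'). The initial v1.10.0 tag is implicit."""
    core = 0
    prov = {p: 0 for p in provider_names}
    tags = ["v1.10.0"]                 # T2: initial release tag
    for kind, who in events:
        if kind == "patch":
            if who == "core":
                core += 1
                for p in prov:         # a core patch bumps every provider
                    prov[p] += 1
            else:
                prov[who] += 1
        elif kind == "tag":
            if who == "core":
                tags.append(f"v1.10.{core}")
            else:
                tags.append(f"v1.10.{core}-{who}.{prov[who]}")
    return tags

# T4/T5: ABC-only patch and tag, then T7/T8: core patch and tag.
assert run_release([("patch", "ABC"), ("tag", "ABC"),
                    ("patch", "core"), ("tag", "core")]) == [
    "v1.10.0", "v1.10.0-ABC.1", "v1.10.1"]
```

The final assertion reproduces exactly the three tags listed under "Final Results" below for the ABC provider.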
**Final Results**
After all this, the ABC provider will see three tags: `v1.10.0`, `v1.10.0-ABC.1`, and `v1.10.1`.
All other providers will see two: `v1.10.0` and `v1.10.1`.

_Milestone: M17 - Release 0.20 · Assignee: Om Prakash Gupta_

---

## Storage service stale in-memory cache leads to inconsistency
https://community.opengroup.org/osdu/platform/system/storage/-/issues/154
_Author: Nikhil Singh [Microsoft] · Updated: 2023-02-15_

We recently uncovered a bug in the Storage service caused by the local cache getting stale. The flow can be understood from the following steps.
1. Deletion of a legal tag via legal service delete API --> response 204 No content after successful deletion
2. Storage service API call made at https://**********/api/storage/v2/push-handlers/legaltag-changed?token=*** --> Goes to a pod P1 of storage service --> Updates the records compliance for all the record associated with the deleted tag in step 1---> Removes the deleted tag from local cache of pod P1.
3. Storage PUT call to create a record with the deleted legal tag--> goes to a pod P2 of storage--> the cache still has that legal tag-->returns 201 created.
At step 3, all calls going to pod P1 return "Invalid legal tag", but API calls landing on other pods successfully create these records.
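The race can be sketched with a toy model of two pods, each holding its own in-memory cache. The class, method names, and status codes are illustrative stand-ins, not the actual service code:

```python
class StoragePod:
    """Toy model of a Storage pod with a local in-memory legal-tag cache."""
    def __init__(self, valid_tags):
        self.cache = set(valid_tags)       # each pod caches tags independently

    def on_legaltag_changed(self, tag):
        self.cache.discard(tag)            # push handler hits only this pod

    def create_record(self, tag):
        # 400 stands in for the "Invalid legal tag" rejection.
        return 201 if tag in self.cache else 400

p1, p2 = StoragePod({"tag-a"}), StoragePod({"tag-a"})
p1.on_legaltag_changed("tag-a")            # step 2: notification lands on P1 only
assert p1.create_record("tag-a") == 400    # P1 correctly rejects the record
assert p2.create_record("tag-a") == 201    # step 3: P2's stale cache accepts it
```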
The service ITs are failing in a transient manner due to this issue.

_Milestone: M17 - Release 0.20 · Assignee: Nikhil Singh [Microsoft]_

---

## Get multiple secrets from AWS, Azure and GCP, and disable listing all secrets in Azure
https://community.opengroup.org/osdu/platform/security-and-compliance/secret/-/issues/2
_Author: Jeyakumar Devarajulu · Updated: 2023-08-01_

The current secret service will either accept one key and fetch the value for that key from the Azure key vault, or get the complete list from the key vault (Azure).
Challenge:
Any service request with multiple secrets has to hit the secret service with multiple requests.
Proposed Solution:
Enhance the secret service, as per the ADR, to accept multiple keys in one request and return multiple key-value pairs in Azure, AWS and GCP
Disable: the provision to list all the secrets from the vault, as it would expose all the secrets
From ADR
* **List**: return the list of keys that are known (JK: as per my understanding, passing the list of known keys will provide the respective values)
ADR
https://community.opengroup.org/osdu/platform/system/home/-/issues/75#functional-requirements

_Milestone: M17 - Release 0.20 · Assignee: Jeyakumar Devarajulu_

---

## Subproject name cannot contain "-"; it will result in error 404 even if the upload shows successful
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/10
_Author: Chad Leong · Updated: 2023-05-22_

I created a subproject with "-" in the name (chad-test) successfully, using both python sdutil mk and Postman directly.
When I try to upload a file, the upload proceeds and, when it is done, reports success -- but the return response was error [404] [seismic-store-service] xxxx does not exist. No file was present when I tried to list the directory.
```
File [ST10010ZC11_PZ_PSDM_KIRCH_FULL_D.MIG_FIN.POST_STACK.3D.JS-017536.segy] uploaded successfully
[404] [seismic-store-service] The dataset sd://osdu/chad-test/full_d.segy does not exist
```

_Milestone: M17 - Release 0.20_

---

## GCZ doesn't have an Open API Spec
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/235
_Author: Morris Estepa · Updated: 2023-05-09 · Milestone: M18 - Release 0.21_

GCZ doesn't have an Open API Spec documenting all available APIs. See this for reference: https://community.opengroup.org/osdu/documentation/-/wikis/Core-Services-Overview#consumption-zone

---

## While testing the Augmented Search feature in the AWS R3 M18 Pre-ship environment, not able to get the expected results
https://community.opengroup.org/osdu/platform/pre-shipping/-/issues/535
_Author: Kamlesh Todai · Updated: 2023-06-30_

Following these steps to test the Augmented Search feature:
1. Make sure that the schema of kind "osdu:wks:reference-data--IndexPropertyPathConfiguration:1.0.0" is deployed. It should be, as it is part of the M18 schema. (**Executed successfully**)
2. Make sure the feature flag "index-augmenter-enabled" is turned on in the tested data partition (**Do not have access to execute this step**)
3. Select a few kinds of data for which users want to create extended properties from related objects (**Selected Well, Wellbore, WellLog, WellboreTrajectory, WellboreMarkerSet**)
4. Define the property extension configuration in the data block of the records with kind "osdu:wks:reference-data--IndexPropertyPathConfiguration:1.0.0".
5. Deploy the configuration records to the storage via storage API
<details><summary>Configuration records created</summary>

```json
{
  "recordCount": 5,
  "recordIds": [
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellLog:1.",
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellboreTrajectory:1.",
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellboreMarkerSet:1.",
    "osdu:reference-data--IndexPropertyPathConfiguration:wks:master-data--Well:1.",
    "osdu:reference-data--IndexPropertyPathConfiguration:wks:master-data--Wellbore:1."
  ],
  "skippedRecordIds": [],
  "recordIdVersions": [
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellLog:1.:1687552840965025",
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellboreTrajectory:1.:1687552840965025",
    "osdu:reference-data--IndexPropertyPathConfiguration:work-product-component--WellboreMarkerSet:1.:1687552840965025",
    "osdu:reference-data--IndexPropertyPathConfiguration:wks:master-data--Well:1.:1687552840965025",
    "osdu:reference-data--IndexPropertyPathConfiguration:wks:master-data--Wellbore:1.:1687552840965025"
  ]
}
```
</details>
<details><summary>Retrieved Well configuration record for verification</summary>

```json
{
  "data": {
    "Name": "Well-IndexPropertyPathConfiguration",
    "Description": "The index property list for master-data--Well:1., valid for all master-data--Well kinds for major version 1.",
    "Code": "osdu:wks:master-data--Well:1.",
    "AttributionAuthority": "OSDU",
    "Configurations": [
      {
        "Name": "CountryNamesKTJun23",
        "Policy": "ExtractAllMatches",
        "Paths": [
          {
            "RelatedObjectsSpec": {
              "RelationshipDirection": "ChildToParent",
              "RelatedObjectID": "data.GeoContexts[].GeoPoliticalEntityID",
              "RelatedObjectKind": "osdu:wks:master-data--GeoPoliticalEntity:1.",
              "RelatedConditionMatches": [
                "osdu:reference-data--GeoPoliticalEntityType:Country:"
              ],
              "RelatedConditionProperty": "data.GeoContexts[].GeoTypeID"
            },
            "ValueExtraction": {
              "ValuePath": "data.GeoPoliticalEntityName"
            }
          }
        ],
        "UseCase": "As a user I want to find objects by a country name, with the understanding that an object may extend over country boundaries."
      },
      {
        "Name": "WellUWIKTJun23",
        "Policy": "ExtractFirstMatch",
        "Paths": [
          {
            "ValueExtraction": {
              "RelatedConditionMatches": [
                "osdu:reference-data--AliasNameType:UniqueIdentifier:",
                "osdu:reference-data--AliasNameType:RegulatoryName:",
                "osdu:reference-data--AliasNameType:PreferredName:",
                "osdu:reference-data--AliasNameType:CommonName:",
                "osdu:reference-data--AliasNameType:ShortName:"
              ],
              "RelatedConditionProperty": "data.NameAliases[].AliasNameTypeID",
              "ValuePath": "data.NameAliases[].AliasName"
            }
          }
        ],
        "UseCase": "As a user I want to discover and match Wells by their UWI. I am aware that this is not globally reliable, however, I am able to specify a prioritized AliasNameType list to look up value in the NameAliases array."
      }
    ]
  },
  "meta": [],
  "modifyUser": "admin-main@testing.com",
  "modifyTime": "2023-06-23T20:40:41.335Z",
  "id": "osdu:reference-data--IndexPropertyPathConfiguration:wks:master-data--Well:1.",
  "version": 1687552840965025,
  "kind": "osdu:wks:reference-data--IndexPropertyPathConfiguration:1.0.0",
  "acl": {
    "viewers": [
      "data.default.viewers@osdu.example.com"
    ],
    "owners": [
      "data.default.owners@osdu.example.com"
    ]
  },
  "legal": {
    "legaltags": [
      "osdu-AugmIdxExt-Legal-Tag-Test"
    ],
    "otherRelevantDataCountries": [
      "US"
    ],
    "status": "compliant"
  },
  "createUser": "admin-main@testing.com",
  "createTime": "2023-06-15T17:35:37.889Z"
}
```
</details>
6. Re-index all the kinds that have extended properties using the reindex API
<details><summary>re-index --Well:1.0.0, --Well:1.1.0, --Well:1.2.0</summary>

```
curl --location 'https://osdu.r3m18.preshiptesting.osdu.aws/api/indexer/v2/reindex?force_clean=true' \
  --header 'Content-Type: application/json' \
  --header 'data-partition-id: osdu' \
  --header 'Authorization: Bearer eyJraWQiOi...truncated...CG4HUDHg' \
  --data '{
    "kind": "osdu:wks:master-data--Well:1.0.0"
  }'

Response 200 OK
```
</details>
7. Test search with the extended properties
<details><summary>Search and its results</summary>

```
curl --location 'https://osdu.r3m18.preshiptesting.osdu.aws/api/search/v2/query' \
  --header 'Authorization: Bearer eyJraWQiOi...Truncated...BAy-bDbtQ' \
  --header 'data-partition-id: osdu' \
  --header 'Content-Type: application/json' \
  --data '{
    "kind": "osdu:wks:master-data--Well:1.*",
    "query": "_exists_:data.WellUWIKTJun23",
    "returnedFields": ["id", "kind", "data.WellUWIKTJun23"]
  }'

Response 200 OK
{
  "results": [],
  "aggregations": [],
  "totalCount": 0
}
```
</details>
We also tried similar steps in GC, where we do have access to check whether the feature is available. We found that the feature is not enabled,
and we do not have permissions/access to enable it.
<details><summary>In GC R3 M18 Pre-ship environment</summary>

```
curl --location 'https://preship.gcp.gnrg-osdu.projects.epam.com/api/partition/v1/partitions/odesprod' \
  --header 'Content-Type: application/json' \
  --header 'data-partition-id: odesprod' \
  --header 'Authorization: Bearer ya29.a0AWY7Ckmj...truncated...fpwHiQ0167' \
  --data ''

Response 200 OK
{
  "kubernetes-secret-name": {
    "sensitive": false,
    "value": "eds-odesprod"
  },
  "elasticsearch.password": {
    "sensitive": true,
    "value": "ELASTIC_PASS_ODESPROD"
  },
  "serviceAccount": {
    "sensitive": false,
    "value": "datafier@osdu-service-prod.iam.gserviceaccount.com"
  },
  "dataPartitionId": {
    "sensitive": false,
    "value": "odesprod"
  },
  "bucket": {
    "sensitive": false,
    "value": "osdu-data-prod-odesprod-records"
  },
  "index-augmenter-enabled": {
    "sensitive": false,
    "value": "false"
  },
  ...Truncated...
  "indexer.service.account": {
    "sensitive": false,
    "value": "workload-indexer-gcp@osdu-service-prod.iam.gserviceaccount.com"
  },
  "projectId": {
    "sensitive": false,
    "value": "osdu-data-prod"
  }
}
```
When I try to enable the index-augmenter, I get the response of 403 Forbidden - RBAC: access denied.
</details>

_Milestone: M18 - Release 0.21 · Assignees: Dzmitry Malkevich (EPAM), Yong Zeng_

---

## IBM Wellbore Domain Services Integration test cases are failing
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/issues/71
_Author: vikas rana · Updated: 2023-07-10 · Milestone: M18 - Release 0.21_

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/jobs/1992734

---

## IBM workflow integration test failing - for M18
https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow/-/issues/153
_Author: vikas rana · Updated: 2023-06-01 · Milestone: M18 - Release 0.21_

https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow/-/jobs/1996356

---

## EDS - Display Osdu_ingest run ID in eds_ingest Xcom Summary
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/24
_Author: Priyanka Bhongade · Updated: 2023-05-23 · Milestone: M18 - Release 0.21_

---

## EDS - Adding more description to logger in eds_ingest
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/23
_Author: Priyanka Bhongade · Updated: 2023-05-04_
1. Include the status code in the logger after POST and GET requests
2. Include a description in the logger to understand the flow of eds_ingest
3. Include CSRE and CSDJ IDs in the logger

_Milestone: M18 - Release 0.21 · Assignee: Priyanka Bhongade_

---

## ADR: Configurable Index Extensions and De-Normalizations
https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/81
_Author: Thomas Gehrmann [slb] · Updated: 2024-02-14_

<a name="TOC"></a>
[[_TOC_]]
Originally recorded during June 28-30, 2022 F2F as "Hints replacements, multiple index schemas (participation of indexer
& data definition needs to be in charge), content vs catalog, side-car", then renamed to ADR: User-friendly/App-friendly
Index Schemas
in [Enterprise Architecture ADR #66](https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/66)
<details>
<summary markdown="span">Preparation Material</summary>
OSDU Data Definitions conducted a number of sessions in the Core Concepts meetings, which contain supplementary
information:
**2022**
1. [Meeting Minutes 2022-07-05](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2022/2022-07-05-DataDefinitionsCoreConcepts_MeetingMinutes.md#42-user-friendly-schemas-de-normalizations)
2. [Meeting Minutes 2022-07-12](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2022/2022-07-12-DataDefinitionsCoreConcepts_MeetingMinutes.md#43-user-friendly-schemas-aka-index-schemas)
3. [Meeting Minutes 2022-07-19](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2022/2022-07-19-DataDefinitionsCoreConcepts_MeetingMinutes.md#43-user-friendly-schemas-aka-index-schemas)
4. [Meeting Minutes 2022-07-26](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2022/2022-07-26-DataDefinitionsCoreConcepts_MeetingMinutes.md#42-user-friendly-schemas-aka-index-schemas)
**2023**
1. [Meeting Minutes 2023-03-21](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2023/2023-03-21-DataDefinitionsCoreConcepts_MeetingMinutes.md#42-index-extensions-adr-66-configuration)
2. [Meeting Minutes 2023-03-28](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/core-concepts/docs/-/blob/master/Meeting%20Minutes/2023/2023-03-28-DataDefinitionsCoreConcepts_MeetingMinutes.md#42-index-extensions-configuration-mechanics-schema-review)
3. [Enterprise Architecture Advice Forum 2023-04-12](https://opensdu.slack.com/archives/C04TPV9CRUP/p1681291140407219?thread_ts=1681217870.084929&cid=C04TPV9CRUP)
</details>
# Status
- [x] Proposed
- [x] Trialing
- [x] Under review
- [x] Approved
- [ ] Retired
# Context & Scope
The entity type schemas delivered by the OSDU Data definitions subcommittee pose a number of challenges
for consumers. Most of them are due to the normalization of the schemas and their friendliness to ingestors, which allows
values to be stored as-is and in less standardized form. The main problem is the use of arrays of objects, which are difficult
to form queries against and increase indexing costs. So far the issues have been mitigated by decorating arrays of objects
with `x-osdu-indexing` instructions. An umbrella issue has been recorded in
[community DD issue #30](https://community.opengroup.org/osdu/data/data-definitions/-/issues/30), which collects a
number of more detailed requests.
In previous OSDU prototypes, this was addressed by specific workarounds,
see [OSDU R1 Indexing Approach and Specification](https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/wikis/uploads/46b4f84f0903cc385abd147a0175a00a/r1_indexing.pdf).
Here is an attempt to classify the workarounds listed in the R1 document above:
1. Extraction of standardized values from arrays of objects using conditions (e.g., Well UWI, SpudDate).
2. Chasing relationships to parent or related objects in order to de-normalize parent/related object values on children.
3. Offering related object's Name/Code for presentations in applications.
4. Counting children of well-known kinds. (The priority of this is lower compared to 1 and 2. The current Search service
should be capable of querying a particular parent-child relationship.)
The current methods using `x-osdu-virtual-properties`, `x-osdu-is-derived` and `x-osdu-indexing` JSON schema decorations
fall short when the query conditions become dependent on a platform operator's usage of, e.g., reference values. In many
cases the reference value lists shipped by OSDU are incomplete or not documented clearly enough to guide global platform
standards.
[Back to TOC](#TOC)
---
## Requirements
* We need a configurable way to define rules for property extraction, either from nested arrays of objects or from
related objects.
* We need OSDU-provided standard index schema extensions to extend the entity type schemas with extracted values
(governance for interoperability).
* We need to open the index schema extensions to applications and services to optimize frequently used query patterns.
One of them is the look-up of names or codes of related objects where the source record holds the target record id.
* We need a platform-embedded service which performs the extractions and de-normalizations on demand (data
creation/update events).
* We need platform support to refresh indexes if the indexing schemas change (both for OSDU and application indexing
schemas).
[Back to TOC](#TOC)
---
# Tradeoff Analysis
The original tradeoff analysis was performed and recorded
in [EA ADR #66](https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/66).
The need for performance required further simplification.
* Replicating derived/de-normalized property values in Storage records was discarded, as this would create an enormous
stack of versions for each individual record, since records would need to be updated whenever properties derived from parents or
children change.
* Instead, de-normalization could happen exclusively in the indexer, simultaneously exploiting the already indexed
values of parent and children records. (Preferred option)
* Using configurable index extension rules was already proposed
in [EA ADR #66](https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/66). The
proposed additional index schemas with references to configurations were discarded. All required information can be
encoded in the configurations themselves. Any index extension schema fragments and documentation can be auto-generated
from the configurations.
* Interoperability is achieved by firm governance rules - the configurations are stored and customizable as OPEN
governance reference-data. However, additional governance rules have to be provided to keep interoperability
guaranteed across deployments and to prevent unwanted interference of index extensions with actual schema properties.
[Back to TOC](#TOC)
---
# Solution
## Index Extension, Data Definition
OSDU standard index extensions are defined by OSDU Data Definition work-streams with the intent to provide
user/application-friendly, derived properties. The standard set, together with the OSDU schemas, forms the
interoperability foundation. They can contribute to delivering domain-specific APIs according to Domain-Driven Design
principles.
The configurations are encoded in OSDU reference-data records, one per major schema version. The proposed type name
is IndexPropertyPathConfiguration. The diagram below shows the decomposition into parts.
![IndexPropertyPathConfiguration](/uploads/7f1330dd7a41903a90174feb7fe2c9d9/IndexPropertyPathConfiguration.png)
* One IndexPropertyPathConfiguration record corresponds to one schema kind's major version, i.e., the
IndexPropertyPathConfiguration record id for all the `osdu:wks:master-data--Wellbore:1.*.*` kinds is set
to `partition-id:reference-data--IndexPropertyPathConfiguration:osdu:wks:master-data--Wellbore:1`. Code, Name and
Description are filled with meaningful data as usual for all reference-data types.
* The additional index properties are added with one JSON object each in the `Configurations[]` array. The Name defines
the name of the index 'column', i.e., the name of the property one can search for. The Policy decides, in the current
usage, whether the resulting value is a single value or an array containing the aggregated, derived values.
* Each `Configurations[]` element has at least one element defined in `Paths[]`.
* The `ValueExtraction` object has one mandatory property, `ValuePath`. The two optional properties hold a value
match condition, i.e., the property containing the value to be matched and the value to match.
* If no `RelatedObjectsSpec` is present, the value is derived from the object being indexed.
* If a `RelatedObjectsSpec` is provided, the value extraction is carried out in related objects - depending on
the `RelationshipDirection`, either in the parent/related object or in the children. The property holding the record id to
follow is specified in `RelatedObjectID`, as is the expected target kind in `RelatedObjectKind`. As in `ValueExtraction`,
the selection can be filtered by a match condition (`RelatedConditionProperty` and `RelatedConditionMatches`).
With this, the extension properties can be defined as if they were provided by a schema.
Most of the use cases deal with text (string) types. The definition of configurations is, however, not limited to string
types. As long as the property is known to the indexer, i.e., the source record schema describes the types, the type
can be inferred by the indexer. This does not work for nested arrays of objects which have not been indexed
with `"x-osdu-indexing": {"type":"nested"}`. In this case the types unknown to the Indexer Service are
string-serialized; the resulting index type is then `string`, still supporting text search.
[Back to TOC](#TOC)
---
### Use Case 1, WellUWI
_As a user I want to discover and match Wells by their UWI. I am aware that this is not globally reliable, however, I am
able to specify a prioritized AliasNameType list to look up value in the NameAliases array._
The configuration demonstrates extraction from the record being indexed itself. With Policy `ExtractFirstMatch`, the
first value is extracted whose condition property (`RelatedConditionProperty`) equals one of the values in `RelatedConditionMatches`.
<details><summary>Configuration for Well, extract WellUWI from NameAliases[]</summary>
```json
{
"data": {
"Configurations": [
{
"Name": "WellUWI",
"Policy": "ExtractFirstMatch",
"Paths": [
{
"ValueExtraction": {
"RelatedConditionMatches": [
"{{data-partition-id}}:reference-data--AliasNameType:UniqueIdentifier:",
"{{data-partition-id}}:reference-data--AliasNameType:RegulatoryName:",
"{{data-partition-id}}:reference-data--AliasNameType:PreferredName:",
"{{data-partition-id}}:reference-data--AliasNameType:CommonName:"
],
"RelatedConditionProperty": "data.NameAliases[].AliasNameTypeID",
"ValuePath": "data.NameAliases[].AliasName"
}
}
],
"UseCase": "As a user I want to discover and match Wells by their UWI. I am aware that this is not globally reliable, however, I am able to specify a prioritized AliasNameType list to look up value in the NameAliases array."
}
]
}
}
```
</details>
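The extraction logic for the `ExtractFirstMatch` policy over a prioritized match list can be sketched in Python. This is illustrative only: the actual Indexer Service is a Java service, and the helper function and sample data here are hypothetical.

```python
def extract_first_match(record, value_path, condition_property, matches):
    """Return the first array element's value whose condition property
    matches one of `matches`, honoring the priority order of `matches`."""
    # "data.NameAliases[].AliasName" -> array path plus leaf property name
    array_path, value_leaf = value_path.split("[].")
    condition_leaf = condition_property.split("[].")[1]
    items = record
    for part in array_path.split("."):          # walk "data" -> "NameAliases"
        items = items.get(part, {})
    for wanted in matches:                      # matches are in priority order
        for item in items:
            if item.get(condition_leaf) == wanted:
                return item.get(value_leaf)
    return None

# Hypothetical Well record with two aliases
well = {"data": {"NameAliases": [
    {"AliasName": "15/9-19",
     "AliasNameTypeID": "p:reference-data--AliasNameType:RegulatoryName:"},
    {"AliasName": "NO 15/9-19 SR",
     "AliasNameTypeID": "p:reference-data--AliasNameType:UniqueIdentifier:"},
]}}

uwi = extract_first_match(
    well,
    "data.NameAliases[].AliasName",
    "data.NameAliases[].AliasNameTypeID",
    ["p:reference-data--AliasNameType:UniqueIdentifier:",
     "p:reference-data--AliasNameType:RegulatoryName:"])
# UniqueIdentifier has the highest priority, so its alias wins
```

Note the order of `RelatedConditionMatches` decides which alias is extracted, not the order of the `NameAliases[]` array.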
[Back to TOC](#TOC)
---
### Use Case 2, CountryNames
_As a user I want to find objects by a country name, with the understanding that an object may extend over country
boundaries._
This configuration demonstrates the extraction from related index objects - here the `RelatedObjectKind`
is `osdu:wks:master-data--GeoPoliticalEntity:1.`, and the related objects are found via the `RelatedObjectID`
in `data.GeoContexts[].GeoPoliticalEntityID`. The condition requires that the `GeoTypeID` refers to
`GeoPoliticalEntityType:Country`.
<details><summary>Configuration for Well, extract CountryNames from GeoContexts[]</summary>
```json
{
"data": {
"Configurations": [
{
"Name": "CountryNames",
"Policy": "ExtractAllMatches",
"Paths": [
{
"RelatedObjectsSpec": {
"RelatedObjectID": "data.GeoContexts[].GeoPoliticalEntityID",
"RelatedObjectKind": "osdu:wks:master-data--GeoPoliticalEntity:1.",
"RelatedConditionMatches": [
"{{data-partition-id}}:reference-data--GeoPoliticalEntityType:Country:"
],
"RelatedConditionProperty": "data.GeoContexts[].GeoTypeID"
},
"ValueExtraction": {
"ValuePath": "data.GeoPoliticalEntityName"
}
}
],
"UseCase": "As a user I want to find objects by a country name, with the understanding that an object may extend over country boundaries."
}
]
}
}
```
</details>
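How such a related-object extraction might resolve can be sketched in Python. Again this is only an illustration under assumed names: the in-memory `index` dict stands in for the already-indexed related documents, and record ids are made up.

```python
def extract_all_matches(record, index, related_id_path, value_path,
                        condition_path, matches):
    """Collect values from related, already-indexed objects, filtered by
    a match condition on the referencing record (ExtractAllMatches)."""
    geo_contexts = record["data"]["GeoContexts"]
    id_leaf = related_id_path.split("[].")[1]     # "GeoPoliticalEntityID"
    cond_leaf = condition_path.split("[].")[1]    # "GeoTypeID"
    value_leaf = value_path.split(".")[-1]        # "GeoPoliticalEntityName"
    values = []
    for ctx in geo_contexts:
        if ctx.get(cond_leaf) not in matches:     # keep Country contexts only
            continue
        related = index.get(ctx.get(id_leaf))     # look up the related index doc
        if related:
            values.append(related["data"][value_leaf])
    return values

# Hypothetical index content and Well record
index = {"p:master-data--GeoPoliticalEntity:NO:":
             {"data": {"GeoPoliticalEntityName": "Norway"}}}
well = {"data": {"GeoContexts": [
    {"GeoPoliticalEntityID": "p:master-data--GeoPoliticalEntity:NO:",
     "GeoTypeID": "p:reference-data--GeoPoliticalEntityType:Country:"}]}}

names = extract_all_matches(
    well, index,
    "data.GeoContexts[].GeoPoliticalEntityID",
    "data.GeoPoliticalEntityName",
    "data.GeoContexts[].GeoTypeID",
    ["p:reference-data--GeoPoliticalEntityType:Country:"])
```

With the `ExtractAllMatches` policy the result is an array (`names`), since an object may reference several countries.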
[Back to TOC](#TOC)
---
### Use Case 3, Wellbore Name on WellLog Children
_As a user I want to discover WellLog instances by the wellbore's name value._
A variant of this can be WellUWI from parent Wellbore → Well; in that case the value would be derived from the
already extended index values.
This configuration demonstrates extractions from multiple `Paths[]`.
<details><summary>Configuration for WellLog, extract WellboreName from parent WellboreID</summary>
```json
{
"data": {
"Configurations": [
{
"Name": "WellboreName",
"Policy": "ExtractFirstMatch",
"Paths": [
{
"RelatedObjectsSpec": {
"RelatedObjectKind": "osdu:wks:master-data--Wellbore:1.",
"RelatedObjectID": "data.WellboreID"
},
"ValueExtraction": {
"ValuePath": "data.VirtualProperties.DefaultName"
}
},
{
"RelatedObjectsSpec": {
"RelatedObjectKind": "osdu:wks:master-data--Wellbore:1.",
"RelatedObjectID": "data.WellboreID"
},
"ValueExtraction": {
"ValuePath": "data.FacilityName"
}
}
],
"UseCase": "As a user I want to discover WellLog instances by the wellbore's name value."
}
]
}
}
```
</details>
[Back to TOC](#TOC)
---
### Use Case 4, Wellbore index WellLogCurveMnemonics
_As a user I want to find Wellbores by well log mnemonics._
This configuration demonstrates the Policy `ExtractAllMatches` with related objects discovered by
RelationshipDirection `ParentToChildren`, i.e., related objects referring to the indexed record.
<details><summary>Configuration for Wellbore, extract WellLogCurveMnemonics from WellLog children</summary>
```json
{
"data": {
"Configurations": [
{
"Name": "WellLogCurveMnemonics",
"Policy": "ExtractAllMatches",
"Paths": [
{
"RelatedObjectsSpec": {
"RelationshipDirection": "ParentToChildren",
"RelatedObjectID": "WellboreID",
"RelatedObjectKind": "osdu:wks:work-product-component--WellLog:1."
},
"ValueExtraction": {
"ValuePath": "Curves[].Mnemonic"
}
}
],
"UseCase": "As a user I want to find Wellbores by well log mnemonics."
}
]
}
}
```
</details>
[Back to TOC](#TOC)
---
## Index Extension, Governance
OSDU Data Definition ships reference value list content for all reference-data group-type entities. The type
IndexPropertyPathConfiguration is classified as OPEN governance, which usually means that new records can be added by
platform operators. This rule must be adjusted for IndexPropertyPathConfiguration records.
### Permitted Changes to IndexPropertyPathConfiguration Records
It is permitted to
* customize the conditions for value extractions, notably the matching values in `RelatedConditionMatches`,
* add additional `Paths[]` elements to `Configurations[].Paths[]`,
* add new index property configuration objects to the `Configurations[]` array. To avoid interference with future OSDU
updates it is strongly recommended to add a namespace prefix to the `Configurations[].Name`, e.g., "OperatorX.WellUWI".
### Prohibited Changes to IndexPropertyPathConfiguration Records
It is not permitted to
* change the target value type of existing, OSDU-shipped index extensions. For example, a `ValuePath` in an original
OSDU `Configurations[].ValueExtraction` that points to a string property must not be altered to point to a number, integer,
or array.
* change the meaning of existing, OSDU-shipped index extensions.
* remove OSDU-shipped extension definitions in `Configurations[]`.
[Back to TOC](#TOC)
---
## Consumption by Indexer Service
### Recursive Index Updates
With the introduction of de-normalizations, record updates can cause infinite recursions. The implementation needs to
address this and avoid situations like the one in the following diagram:
![Recursions](/uploads/020675583cb7b65560f0d73ffe08fc3c/Recursions.png)
On the left-hand side, Storage records are updated to new versions, which triggers indexing. The update of the index triggers
the index update of related index records due to the derived property values (as defined in the `RelatedObjectsSpec`).
These updates may, in turn, cause a recursion. This must not happen.
The augmenter introduces a new attribute, `ancestry_kinds`, in the Attributes map of the message payload when sending
messages to update the index of parent/children records. The value of the `ancestry_kinds` attribute can include multiple
kinds separated by commas. This new attribute is used to prevent an infinite loop of index update chasing. The indexer-queue
must pass the attribute back to the indexer when it receives indexing messages.
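The `ancestry_kinds` guard can be sketched as follows. The attribute name comes from this ADR; the helper functions are hypothetical, not the actual augmenter code.

```python
ANCESTRY_KINDS = "ancestry_kinds"  # attribute name per this ADR

def should_propagate(attributes, related_kind):
    """Skip propagation when the related kind already appears in the
    ancestry, which would indicate a cycle in the de-normalization graph."""
    ancestry = [k for k in attributes.get(ANCESTRY_KINDS, "").split(",") if k]
    return related_kind not in ancestry

def propagate_attributes(attributes, current_kind):
    """Build the attributes for the outgoing index-update message,
    extending the comma-separated ancestry with the current kind."""
    ancestry = [k for k in attributes.get(ANCESTRY_KINDS, "").split(",") if k]
    if current_kind not in ancestry:
        ancestry.append(current_kind)
    return {ANCESTRY_KINDS: ",".join(ancestry)}

# A Wellbore update propagates to its WellLog children...
attrs = propagate_attributes({}, "osdu:wks:master-data--Wellbore:1.")
# ...but a message carrying that ancestry must not come back to Wellbore
assert not should_propagate(attrs, "osdu:wks:master-data--Wellbore:1.")
```

The indexer-queue only needs to pass the attribute through unchanged; the cycle check itself stays in the indexer.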
### Pseudo-Code
1. For each record to be indexed (create/update event from Storage service):
* Does the record kind have an IndexPropertyPathConfiguration?
* Yes
* get or create the internal index schema that combines the schema of the record kind and schema of extended
properties
* create index document that combines the properties of original record and extended properties
* call ElasticSearch service to create or update the index of the record with extended properties
* No
* **_No action_** (=default for records without IndexPropertyPathConfiguration)
2. Re-Indexing (create/update event from the Storage service for an IndexPropertyPathConfiguration record)<br>
To update the schema (or template) of the kind in ElasticSearch when the kind is re-indexed:
* create the internal index schema derived from the kind (as registered in the Schema service)
* create the internal index schema derived from IndexPropertyPathConfiguration
* merge the internal index schemas
* convert the schema to ElasticSearch template
* call ElasticSearch service to update the index template (schema)
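The schema-merge part of step 2 above can be sketched as a minimal dictionary merge. This is illustrative Python under assumptions from this ADR (unknown types fall back to string, `ExtractAllMatches` yields an array); the type names and helpers are not the actual Indexer implementation.

```python
def merge_index_schema(kind_schema, config):
    """Merge the kind's internal index schema with extension properties
    derived from an IndexPropertyPathConfiguration record (sketch)."""
    merged = dict(kind_schema)
    for cfg in config.get("data", {}).get("Configurations", []):
        # Types the indexer cannot infer fall back to string (see
        # Accepted Limitations); ExtractAllMatches yields a string array.
        merged[cfg["Name"]] = ("string_array"
                               if cfg.get("Policy") == "ExtractAllMatches"
                               else "string")
    return merged

# Hypothetical kind schema plus two of the configurations shown above
schema = merge_index_schema(
    {"FacilityName": "string"},
    {"data": {"Configurations": [
        {"Name": "WellUWI", "Policy": "ExtractFirstMatch"},
        {"Name": "CountryNames", "Policy": "ExtractAllMatches"},
    ]}})
```

The merged result would then be converted to an ElasticSearch index template, as the last two steps describe.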
[Back to TOC](#TOC)
---
## Accepted Limitations
* A change in the configurations requires re-indexing of all the records of a major schema version kind. It is the same
limitation as an in-place schema change for any kind.
* All the extensions defined in the IndexPropertyPathConfiguration records refer to properties in the `data` block,
including `ValuePath`, `RelatedObjectID`, `RelatedConditionProperty`.
* Only properties in the `data` block of records being indexed can be reached by the `ValuePath`; system properties are
out of reach. The prefix `data.` is therefore optional and can be omitted.
* The formats/values of the extended properties are extracted from the formats/values of the related index records. If
the formats of the original properties are unknown in the related index records, the indexer will set the value type
of the extended properties as string or string array. (With additional complexity and schema parsing, this limitation
can be overcome, but currently the added value seems to be marginal.)
* If the extended properties are extracted from arrays of objects indexed with
(`"x-osdu-indexing": {"type":"flattened"}`), the indexer cannot re-construct the object properties into
nested objects when the policy `ExtractAllMatches` is applied. (This kind of indexing is already a deliberate choice.
With additional complexity, this limitation can be overcome, but currently the added value seems to
be marginal.)
* To simplify the solution, all the related kinds defined in the configuration are kinds with major version only. They
must end with dot ".". For example: `"RelatedObjectKind": "osdu:wks:work-product-component--WellLog:1."`.
* Index updates may take time. Immediate consistency cannot be expected.
* When a kind derives extended properties from its parent(s), a new data property `data.AssociatedIdentities` is added
on demand by the indexer. The property name `AssociatedIdentities` is therefore reserved by the Indexer and shall not
be used in any OSDU schemas.
Currently, the property name `AssociatedIdentities` is not in use in any of the OSDU well-known schemas. Tests will be
implemented in the OSDU Data Definition pipeline to ensure that this reserved name does not appear as a property in
the `data` block.
[Back to TOC](#TOC)
---
# Change Management
1. Configurations are reference-data and need to be ingested/updated.
2. OSDU Data Definitions must take on the task of defining IndexPropertyPathConfiguration records.
3. Updates (extensions) of index extensions must be managed carefully as they cause re-indexing of the kinds involved.
# Decision
# Consequences
* The indexer code changes should have no impact on the system if no IndexPropertyPathConfiguration records are present.
[Back to TOC](#TOC)
---
# ADR Comments BelowM18 - Release 0.21https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/25sdutil cp - to show checksum comparison after completion of copying the file2023-05-18T09:40:40ZDebasis Chatterjeesdutil cp - to show checksum comparison after completion of copying the filePlease consider adding this feature to ensure integrity of data in Seismic Store.
Show checksum of source data file, and the same from the copied file.
Even add the same feature in "sdutil stat". "stat" may also report in bytes units of measure.
R3M16/Azure/Preship sdutil -
"**cp**" command (copying the file)
sdutil copy file
```
(sdutilenv) C:\seismic-store-sdutil-master>python sdutil cp C:\TEMP\osdu-volve.segy sd://opendes/debasis/volve.segy
Uploading [========================================] 1104999800/1104999800 [100%] in 12:36.1 (1461432.29/s)
Transfer completed
(sdutilenv) C:\seismic-store-sdutil-master>
```
Source data on local disk:
```
(sdutilenv) C:\seismic-store-sdutil-master>dir C:\TEMP\osdu-volve.segy
 Volume in drive C is OS
 Volume Serial Number is 62E2-67ED
 Directory of C:\TEMP
04/24/2021  04:39 AM     1,104,999,800 osdu-volve.segy
               1 File(s)  1,104,999,800 bytes
               0 Dir(s)  25,783,111,680 bytes free
(sdutilenv) C:\seismic-store-sdutil-master>
```
"**stat**" command
```
(sdutilenv) C:\seismic-store-sdutil-master>python sdutil ls sd://opendes/debasis
volve.segy
(sdutilenv) C:\seismic-store-sdutil-master>python sdutil stat sd://opendes/debasis/volve.segy
- Name: sd://opendes/debasis/volve.segy
- Created By: 97pQgJtRFH99Y1KViwFV4GaADxKsIeRG9ZPJ-4PnMb0
- Created Date: Wed Mar 29 2023 00:00:57 GMT+0000 (Coordinated Universal Time)
- Size: 1.0 GB
- ReadOnly: False
(sdutilenv) C:\seismic-store-sdutil-master>
```M18 - Release 0.21Debasis ChatterjeeDebasis Chatterjeehttps://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/20Remove StrEnum from the code2023-06-07T16:37:09ZYan Sushchynski (EPAM)Remove StrEnum from the codeHello,
I think it is possible to delete `StrEnum` from the dependencies, and replace them with something like:
```
class YourStrEnum(str, Enum):
pass
```
The enum above behaves the same as StrEnum, so it spares us installing an extra dependency.
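As a quick illustration of the `str`-mixin pattern (the class name and values here are hypothetical):

```python
from enum import Enum

class Status(str, Enum):
    RUNNING = "running"
    FINISHED = "finished"

# Members ARE strings: they compare equal to their plain-string values
# and can be used anywhere a str is expected (JSON payloads, logging).
assert Status.RUNNING == "running"
assert isinstance(Status.FINISHED, str)
assert "state: " + Status.FINISHED == "state: finished"
```

This works on every Python 3 version, whereas `enum.StrEnum` only landed in the standard library in Python 3.11.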
More details here:
https://docs.python.org/3.8/library/enum.html#othersM18 - Release 0.21Ashish SaxenaNisha ThakranJeyakumar DevarajuluPriyanka BhongadeAshish Saxenahttps://community.opengroup.org/osdu/platform/data-flow/ingestion/osdu-ingestion-lib/-/issues/9field schema-id replaced by {{data-partition-id}}2023-05-22T11:13:52Zli shuangqifield schema-id replaced by {{data-partition-id}}Using "{{data-partition-id}}" to replace the schema-id "{{data-partition-id}}:wks:AbstractWPCGroupType:1.0.0" is a bug: the first section of a schema-id is the authority (OSDU). An error is reported during schema validation when a different partition is used.
```
field.replace("{{data-partition-id}}", self.context.data_partition_id)

SURROGATE_KEYS_PATHS = [
    ("definitions", "{{data-partition-id}}:wks:AbstractWPCGroupType:1.0.0", "properties", "Datasets",
     "items"),
    ("definitions", "{{data-partition-id}}:wks:AbstractWPCGroupType:1.0.0", "properties", "Artefacts",
     "items", "properties", "ResourceID"),
    ("properties", "data", "allOf", 1, "properties", "Components", "items"),
]
```
M18 - Release 0.21https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/209Testing - Add JUnit test coverage2023-06-07T15:22:49ZJoel RomeroTesting - Add JUnit test coverageAs a GCZ developer, I want to add JUnit tests to increase coverage.
Acceptance Criteria:
- Unit test coverage increased to 80%.
Blockers:
- Access to Maven package waitingGCZ Sprint 40Shanta KattiShanta Kattihttps://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/issues/78Anthos/Baremetal. (NoSuchKey) when calling "get welllog data"2023-10-04T16:46:26ZYan Sushchynski (EPAM)Anthos/Baremetal. (NoSuchKey) when calling "get welllog data"Hello,
Postman Environment: https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M19/QA_Artifacts_M19/envFilesAndCollections/envFiles/OSDU%20R3%20M19%20RI%20Pre-ship.postman_environment.json Postman Collection: https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M19/QA_Artifacts_M19/envFilesAndCollections/Wellbore%20DDMS%20CI-CD%20v3.0.postman_collection.json.
Steps to reproduce:
1. Create a WellLog
2. Post the WellLog data
3. Get the WellLog data.
The logs show that when we post the well log data, a new folder and a parquet file are created:
```log
DEBUG:Sending http request: <AWSPreparedRequest stream_output=False, method=PUT, url=https://s3.ref.gcp.gnrg-osdu.projects.epam.com/wellbore/logstore-osdu/9ee8ed74df9b8efb695f376771eea3e707b66753/bulk/2c0429ad-b4a1-4a70-a17e-bb08cc245f3f/data/0_4_1691662228355.e70a959cea89c6147785c7fa57cde5be8b6dc250.parquet
```
And then, when we want to get the data, it attempts to get an absent `bulk_catalog.json`:
```log
DEBUG:Sending http request: <AWSPreparedRequest stream_output=True, method=GET, url=https://s3.ref.gcp.gnrg-osdu.projects.epam.com/wellbore/logstore-osdu/9ee8ed74df9b8efb695f376771eea3e707b66753/bulk/2c0429ad-b4a1-4a70-a17e-bb08cc245f3f/data/bulk_catalog.json
```
Linked issue: https://community.opengroup.org/osdu/platform/pre-shipping/-/issues/568M19 - Release 0.22YannickYannickhttps://community.opengroup.org/osdu/platform/pre-shipping/-/issues/577M19 Azure GCZ- Unable to access registered GCZ service in AGOL2023-08-18T07:55:40ZEkta SinghM19 Azure GCZ- Unable to access registered GCZ service in AGOLSteps to replicate.
In AGOL register the item https://osdu-gcz.msft-osdu-test.org/ignite-provider/gcz/FeatureServer/1 .
A new item will be created with a unique item id. Now open this newly created item in a webmap .
Result - item will fail to load
Note: Upon checking the URL of the registered item again, we find that the service URL is getting saved as https://osdu-gcz.msft-osdu-test.org/ignite-provider/gcz/FeatureServer instead of https://osdu-gcz.msft-osdu-test.org/ignite-provider/gcz/FeatureServer/1M19 - Release 0.22Levi RemingtonLevi Remingtonhttps://community.opengroup.org/osdu/platform/system/dataset/-/issues/56The dataset responds with a 500 server error instead of a DMS Service status ...2023-07-03T14:32:59ZRustam Lotsmanenko (EPAM)rustam_lotsmanenko@epam.comThe dataset responds with a 500 server error instead of a DMS Service status code.- The underlying DMS service could respond with varying errors and status codes.
- Especially after the EDS-DMS service introduction.
- But the Dataset service will not respect those responses and will try to parse it as an OK response.
This causes parsing errors and makes the Dataset response confusing:
~~~
{
"code": 500,
"reason": "Internal Server Error",
"message": "Unrecognized field \"code\" (class org.opengroup.osdu.core.common.dms.model.RetrievalInstructionsResponse), not marked as ignorable (one known property: \"datasets\"])_ at [Source: (String)\"{\"code\":401,\"reason\":\"Access denied\",\"message\":\"The user is not authorized to perform this action\"}\"; line: 1, column: 12] (through reference chain: org.opengroup.osdu.core.common.dms.model.RetrievalInstructionsResponse[\"code\"])"
}
~~~
Solution:
- Check the DMS response code; if it is not OK, do not try to parse the body. Instead, return the DMS status to the user, highlighting that the error occurred in the DMS. <br/>
Example:
~~~
{
"code": 403,
"reason": "Non-OK response received from DMS service: https://community.gcp.gnrg-osdu.projects.epam.com/api/file/v2/files/storageInstructions",
"message": "RBAC: access denied"
}
~~~M19 - Release 0.22Rustam Lotsmanenko (EPAM)rustam_lotsmanenko@epam.comRustam Lotsmanenko (EPAM)rustam_lotsmanenko@epam.comhttps://community.opengroup.org/osdu/platform/system/schema-service/-/issues/130Schema service whitesource issue2024-01-11T11:53:42ZSudesh TagadpallewarSchema service whitesource issueThere are vulnerabilities in the schema service. I have fixed the libraries relevant to the vulnerabilities and skipped upgrading libraries that need Java 17+. Right now the pipeline is failing at gc-deploy and gc-baremetal-deploy. We need to fix these vulnerabilities.
MR link - https://community.opengroup.org/osdu/platform/system/schema-service/-/merge_requests/504
![image](/uploads/dfa39bf12a986d682cf04e05da9e4896/image.png)M19 - Release 0.22vikas ranavikas ranahttps://community.opengroup.org/osdu/platform/system/lib/core/os-core-common/-/issues/67inefficient/non-performant crs-conversions causing reliability/performance is...2023-06-21T07:54:16ZYurii Kondakovinefficient/non-performant crs-conversions causing reliability/performance issues on ingestion workflowWe are seeing a reliability and performance issue because of inefficient/non-performant crs-conversions in the ingestion workflow. If crs conversions take a long time (on the Storage /batch API), they slow down the entire ingestion workflow.
There is a need to set a timeout for crs-conversion requests that run longer than a certain time. For requests to the crs-conversion service, the java.net.HttpURLConnection class is currently used, which only has connectTimeout and readTimeout properties; these do not let us set an overall request timeout.
It is suggested to use the Apache CloseableHttpClient, which has a socketTimeout property.
storage MR - https://community.opengroup.org/osdu/platform/system/storage/-/merge_requests/712M19 - Release 0.22Yurii KondakovYurii Kondakov