# OSDU Software issues
https://community.opengroup.org/groups/osdu/-/issues

---
**requirements.txt in Deployment scripts** (Jeyakumar Devarajulu, 2023-06-15)
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/15

There are a lot of packages in requirements.txt. Are all of them actually used for deployment? The file needs cleaning, as it also contains many outdated versions, which may be vulnerable.
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/blob/master/deployments/scripts/azure/requirements.txt
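As a starting point for that cleanup, a small audit script can flag the most mechanical problems. This is only a sketch (it is not part of the deployment scripts); deciding which packages are truly unused still requires checking the scripts themselves:

```python
import re

def audit_requirements(text):
    """Flag duplicate and unpinned entries in a requirements.txt body."""
    seen, duplicates, unpinned = set(), [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Package name is everything before a version/extras/marker token.
        name = re.split(r"[=<>!~\[;]", line, maxsplit=1)[0].strip().lower()
        if name in seen:
            duplicates.append(name)
        seen.add(name)
        if "==" not in line:  # not pinned to an exact version
            unpinned.append(name)
    return duplicates, unpinned
```

Outdated or vulnerable pins would additionally need a check against an index or advisory database (for example `pip list --outdated` or `pip-audit`).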
@thulasi_dass @shivani_karipe @Srinivasan_Narayanan
CC: @AshishSaxenaAccenture

---
**New "consumption" API to show combined information from multiple Drilling Reports** (Debasis Chatterjee, 2023-02-02)
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/well-delivery/well-delivery/-/issues/16

This refers to the OperationsReport entity and its sub-types such as GasReading, PumpOp, OperationsActivity.
https://community.opengroup.org/osdu/data/data-definitions/-/blob/master/E-R/master-data/OperationsReport.1.2.0.md
I had a preliminary discussion with @openai and Stuart about this requirement.
Also see this deck for an understanding.
[Andrei-expected-flow.pptx](/uploads/3010f7d5b476931b55e23c2357377959/Andrei-expected-flow.pptx)
There are some API services available to deal with this data type:

`GET /operationsReports/v1/byTimeRange/{start-time}/{end-time}`
But these return only IDs, and the end user (or a vendor's application) then has to do extra work to extract the required information from multiple records.
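That extra client-side legwork might look like the sketch below. Only the `byTimeRange` route above comes from the issue; the per-record endpoint, the field names, and the `session` object (anything with a `get()` returning a response with `.json()`, such as `requests.Session`) are assumptions for illustration:

```python
BASE = "https://{host}/api/well-delivery"  # hypothetical base URL

def collect_subtype(session, start, end, subtype="GasReading"):
    """Fetch report IDs for a time range, then pull each record and merge
    one sub-type: the work a "consumption" API could do server-side."""
    ids = session.get(f"{BASE}/operationsReports/v1/byTimeRange/{start}/{end}").json()
    merged = []
    for record_id in ids:
        record = session.get(f"{BASE}/operationsReports/v1/{record_id}").json()
        merged.extend(record.get(subtype, []))
    return merged
```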
What is needed is a user-friendly (new) service returning "packaged" information about the requested sub-type from multiple Drilling Reports (hence multiple records of OperationsReport) for the requested time range.

---
**Inconsistency status codes when user has no access** (Marton Nagy, 2023-06-15)
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/114

When I requested a search at url "https://{baseURL}/api/search/v2/search" with headers like:
```
[Authorization, Bearer ey...]
[Accept, application/json]
[data-partition-id, slb]
```
and body: `{"kind":"*:*:*:*","limit":100,"query":"(kind:osdu\\:wks\\:master-data--Wellbore\\:*) AND (\"mnagy-12\" )","queryAsOwner":false,"offset":0}`
on an environment where **there is no "slb" data partition**, or at least I have no access to that.
I got this result: `{"code":401,"reason":"Access denied","message":"The user is not authorized to perform this action"}`
And the next query was successful when the data-partition-id header was changed to a valid data partition for which I have access.
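For reference, the failing call can be reproduced with a standard-library-only sketch (host, token, and partition are placeholders; the payload mirrors the request above):

```python
import json
import urllib.request
from urllib.error import HTTPError

def search_status(base_url, token, partition):
    """POST a minimal search request and return the HTTP status code."""
    body = json.dumps({"kind": "*:*:*:*", "limit": 100, "offset": 0}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/search/v2/search",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "data-partition-id": partition,
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except HTTPError as err:
        return err.code  # currently 401 for an inaccessible partition
```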
Instead of 401, I would think a 403 Forbidden would be much clearer, as 401 usually means "I don't know who you are" and 403 means "I know who you are, but you cannot do that".

---
**EDS: Refer secret service in eds_scheduler from eds_ingest** (Nisha Thakran, 2023-02-21)
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/14

- Remove the duplicate secret client added in the eds_scheduler DAG.
- Import the secret service client from eds_ingest.

Milestone: M16 - Release 0.19. Participants: Nisha Thakran, Jeyakumar Devarajulu.

---
**Enhancement for a new way to apply Search service policy rules - metadata restriction** (Dadong Zhou, 2023-02-17)
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/113

From the Policy side requirement: https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/issues/64
From recent internal meetings/discussions, we learnt a new requirement: we would like to use policy rules to control which metadata fields the Search service returns to the user, based on the data record's legal tags and the logged-in user's attributes (e.g. country). In this scenario, the user may not have access to the data record itself (permission controlled by the Storage policy rules) but would still be allowed to search for the record with limited visibility of its metadata. See if this is possible as a future enhancement.

Milestone: M17 - Release 0.20.

---
**SEGYImport documentation incomplete/inconsistent (legal values, default values)** (Alexander Jaust, 2023-01-23)
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/169
The comments here refer to the documentation of `SEGYImport` and the deep dive. I found some things that are inconsistent or easy to overlook depending on how one works with the tools provided by OpenVDS.
## CLI help
I find that documentation of `SEGYImport` in the terminal is lacking important information. I tend to use the terminal a lot and thus also use `SEGYImport --help` frequently.
The documentation of valid input values is a bit inconsistent. For some options (attribute name, attribute unit...) the output shows the accepted options.
```
...
--attribute-name <string>
The name of the primary VDS channel. The
name may be Amplitude (default), Attribute,
Depth, Probability, Time, Vavg, Vint, or
Vrms (default: Amplitude)
--attribute-unit <string>
The units of the primary VDS channel. The
unit name may be blank (default), ft, ft/s,
Hz, m, m/s, ms, or s
...
```
However, for "brick size" and "level of detail" levels the limits are not mentioned:
```
...
-b, --brick-size <value> The brick size for the volume data store.
--lod-levels <value> The number of LODs to generate.
...
```
When digging a bit deeper, the [developer documentation](https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/cppdoc/namespace/namespaceOpenVDS.html#_CPPv4N7OpenVDS26VolumeDataLayoutDescriptor9BrickSizeE) might even give the expectation that one should be allowed to use larger brick sizes than currently allowed. I assume that brick sizes >256 are only useful for 2D datasets. For level-of-detail levels, the deep dive explains that the value can be at least 0 and at most 12.
In addition, the [documentation](https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/vds/deepdive/deepdive.html#brick-size) does not directly mention that the brick size has to be (certain) powers of two.
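The power-of-two constraint itself is easy to state in code. A minimal sketch, where the lower and upper bounds are illustrative placeholders rather than values taken from OpenVDS:

```python
def is_valid_brick_size(n, lo=64, hi=256):
    """True if n is a power of two within [lo, hi].

    The default bounds are assumptions for illustration; the real limits
    are whatever SEGYImport / OpenVDS enforce.
    """
    return lo <= n <= hi and (n & (n - 1)) == 0
```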
The output also does not state the default value for the number of LODs. I expected it to be zero, as there is no general recommendation for the number of levels in the deep dive (see also the comment below).
The output of `SEGYImport --help` also deviates slightly from the [documentation on the homepage](https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/tools/SEGYImport/README.html). I am not sure if it would be possible to synchronize the content of the README with the actual `SEGYImport` output.
There might be further options with incomplete documentation.
## Deep dive
### Margin
It is unclear to me what the correct margin size to choose is, and what the actual default is, without looking into the source code of `SEGYImport`. The [deep dive](https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/vds/deepdive/deepdive.html#margin-size) mentions that a margin of 4 should be used as the default and that the value is important when working with levels of detail. No connection to wavelets is mentioned.
The [README](https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/tools/SEGYImport/README.html) mentions that the default value for the margin is 0, and 4 if one uses wavelet compression. Checking the source code confirms the README's statement.
### Level of detail
It is explained when one should use levels of detail, but no general recommendation is given for how many level-of-detail levels one should generate (except if one wants to use FAST). However, `SEGYImport` chooses 2 as the general default and 4 for poststack data.

---
**Spring configuration related to Feature flag should be optional** (Rustam Lotsmanenko (EPAM), 2023-01-30)
https://community.opengroup.org/osdu/platform/system/lib/core/os-core-common/-/issues/64

It is possible that some OSDU services may not use the Partition service (the Unit service, for example). But the configuration of the recently merged code https://community.opengroup.org/osdu/platform/system/lib/core/os-core-common/-/merge_requests/189 makes the PARTITION_API environment variable mandatory. It would be better to change the configuration to make it optional, instead of bringing this variable to each service whether or not Partition is used.
~~~
Exception encountered during context initialization - cancelling refresh attempt:
org.springframework.beans.factory.BeanCreationException:
Error creating bean with name 'featureFlagConfig':
Injection of autowired dependencies failed; nested exception is java.lang.IllegalArgumentException:
Could not resolve placeholder 'PARTITION_API' in value "${PARTITION_API}"
~~~

Participants: Rustam Lotsmanenko (EPAM), Riabokon Stanislav (EPAM) [GCP].

---
**EDS - M16 Testing in GLAB** (Nisha Thakran, 2023-02-15)
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/13
- [ ] Validate Schemas
- [x] osdu:wks:master-data--ConnectedSourceRegistryEntry:1.2.0
- [x] osdu:wks:master-data--ConnectedSourceDataJob:1.3.0
- [x] osdu:wks:reference-data--OAuth2FlowType:1.0.0
- [x] osdu:wks:reference-data--SecuritySchemeType:1.0.0
- [ ] Validate ReferenceData
- [x] osdu:wks:reference-data--OAuth2FlowType:1.0.0
- [x] osdu:wks:reference-data--SecuritySchemeType:1.0.0
**For Using Activity Parameters**
- [x] Add ParameterKind
- [x] Add ParameterRole
- [x] Add ExistenceKind
- [x] Add Activity Template CSRE
- [x] Add Activity Template CSDJ
**EDS: Multiple Work product component ingestion (https://community.opengroup.org/osdu/platform/pre-shipping/-/issues/436)**
- [x] Create CSRE
- [x] Create CSDJ
- [x] Fetched a single WPC with a single dataset id associated with it.
- [x] Fetched multiple WPC ids with a single dataset id associated with each.
- [x] Fetched multiple WPCs with multiple dataset ids (2) associated with each.
- [x] Fetched multiple WPC ids with failed ids.
**Changes in ConnectedSourceDataJob and ConnectedSourceRegistryEntry for Workflow Parameter (https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/11)**
- [x] create CSRE with activity parameters DatasetURL and SearchURL
- [x] keep workflow in CSDJ as of now
- [ ] Test if Search API is fetched using activity parameter.
- [x] for master data entity
- [x] for work product component
- [x] EDS ingest working fine.
**EDS - Keep some identification/flag in created records(https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/10)**
- [x] create a reference value for the Data.SourceOrganisationId(opendes:master-data--Organisation:AWS-PRESHIP:)
- [x] create CSRE having above value
- [x] test if data.Source is replaced with the above value
- [ ] for master data entity
- [ ] for work product component entity
**EDS Ingest : Provide Start Date Time and End Date Time to fetch the records between the span.(https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/12)**
- [x] create CSRE
- [x] create CSDJ with FetchStartDateTime and FetchEndDateTime
- [ ] Test with following scenarios
- [x] both the variables empty
- [x] both the variables have datetime given
- [x] only FetchStartDateTime is given
- [x] only FetchEndDateTime is given
**EDS : Refer secret service in eds_scheduler from eds_ingest(https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/14)**
- [x] Remove secret service client.py from eds_scheduler
- [x] import secret service client from eds_ingest
- [x] import is successful.

Milestone: M16 - Release 0.19.

---
**EDS Ingest: Provide Start Date Time and End Date Time to fetch the records between the span** (Nisha Thakran, 2023-01-19)
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/12

- Provide the start date time and end date time to fetch the records between the given datetime span by modifying the query built by eds_ingest.
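A sketch of assembling a query of the shape shown in this issue; the handling of empty start/end values is an assumption, not the actual eds_ingest implementation:

```python
def build_query(kind, start=None, end=None, limit=2):
    """Build a search query restricted to a create/modify time span."""
    query = "(*)"
    if start or end:
        span = f"[{start or '*'} TO {end or '*'}]"
        query += f" AND ((createTime: {span}) OR (modifyTime: {span}))"
    return {
        "kind": kind,
        "query": query,
        "sort": {"field": ["createTime"], "order": ["ASC"]},
        "limit": limit,
    }
```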
`{'kind': 'osdu:wks:master-data--Well:1.0.0', 'query': '(*) AND ((createTime: [2020-01-01T00:00:00 TO 2023-01-18T06:12:41]) OR (modifyTime: [2020-01-01T00:00:00 TO 2023-01-18T06:12:41]))', 'sort': {'field': ['createTime'], 'order': ['ASC']}, 'limit': 2}`

Milestone: M16 - Release 0.19. Participants: Nisha Thakran, Priyanka Bhongade.

---
**Azure search service can have high latencies from entitlements** (ashley kelham, 2023-02-17)
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/112

The search service on Azure can occasionally have high latencies when the API check against entitlements is slow to respond. Unlike most other CSPs, and unlike other services, search does not appear to cache this response from entitlements for reuse, which would negate the impact.
We would like to add an Azure version of IAuthorizationService with a cache into search to help.

Milestone: M16 - Release 0.19.

---
**[Azure] Unnecessary cronjob increasing latency** (Thiago Senador, 2023-02-20)
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/issues/118

We have a [cron job](https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/provider/entitlements-v2-azure/src/main/java/org/opengroup/osdu/entitlements/v2/azure/service/PartitionCacheTtlService.java#L95) running every 5 minutes just to update some TTL values from the Partition service. Since we don't need to update these values at runtime, this cron job is unnecessary; more than that, we strongly believe it is causing significant multi-threading overhead in the service. After analyzing some extremely slow entitlements requests (latency > 1 minute), we noticed threads hanging around the time the cron job executes. As an empirical analysis, I removed the cron job and its related config/annotations. After a couple of days running the entitlements version without the cron job, I noticed improvements in the 99th-percentile request latency as well as the absence of those lengthy requests.
FYI, some useful references related to the issue.
- the [`@EnableAsync`](https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/provider/entitlements-v2-azure/src/main/java/org/opengroup/osdu/entitlements/v2/azure/EntitlementsV2Application.java#L16) annotation is misplaced. It should be used on a [`@Configuration`](https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/scheduling/annotation/EnableAsync.html) class, not at the application level.
- the [`@EnableScheduling`](https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/provider/entitlements-v2-azure/src/main/java/org/opengroup/osdu/entitlements/v2/azure/EntitlementsV2Application.java#L17) annotation is misplaced. It should be applied as locally as possible (method level), not at the application level.

Can we completely eliminate this cron job from the Azure deployment?

Milestone: M17 - Release 0.20. Participants: Chad Leong.

---
**Add Symbology to Koop Service** (Joel Romero, 2023-01-17)
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/210

---
**Testing - Add JUnit test coverage** (Joel Romero, 2023-06-07)
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/209

As a GCZ developer, I want to add JUnit tests to increase coverage.
Acceptance Criteria:
- Unit test coverage increased to 80%.

Blockers:
- Access to Maven package waiting

Milestone: GCZ Sprint 40. Participants: Shanta Katti.

---
**[Azure] Loading dependencies eagerly** (Thiago Senador, 2023-11-27)
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/issues/117

The current Azure entitlements deployment loads dependencies such as Redis and CosmosDB in a lazy manner, penalizing every first request with huge latencies. These slow requests impact our SLI/SLOs as well as other services and applications.
I already have a (tested) feature branch in which I changed these dependencies to be loaded at service startup (eager loading), eliminating these slow requests. Can we move forward and merge these changes?

Milestone: M16 - Release 0.19. Participants: Chad Leong.

---
**Issue with search when it concerns custom schema entity** (Debasis Chatterjee, 2023-01-30)
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/111

Reported by Rex from bld.ai:
Hello experts, my name is Rex from bld.ai, currently working for BHP. We're having issues regarding the fields showing up from the OSDU search endpoint. We're expecting all the fields registered on our custom schema to show up in every record; however, some of the fields/properties are missing. Upon observation, this is what I've noticed:
If the data type of a particular field is "string", the field will show up in the search endpoint regardless of whether that field is included in the ingested record <- this is the behavior we want (field "A" would show as "A": None if it's not part of the ingested record).
However, if the data type of the field is non-string, it will only show up in the search endpoint if the field is part of the ingested record.
We want the fields to still show up even when working with non-string data types. Setting all the fields to a "string" datatype may be a potential solution; however, we might lose some search functionality that only applies to the intended data type. I've tried approaches from the OSDU documentation such as anyOf, oneOf, etc., to no avail. Perhaps some of you might have something that can help us?
cc @chad

---
**Spatial filter with longitude out of range [-180;180] considered as invalid** (Yauheni Lesnikau, 2023-10-02)
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/109

If a rectangle spatial filter spans more than 180 degrees of longitude and crosses the international date line, the following behavior is observed:
The longitude of the right side of the rectangle will be converted from "-165" to "195", and the search API gives a 400 error because it constrains the longitude to be within [-180, 180]:
`{"field":"data.SpatialLocation.Wgs84Coordinates","byGeoPolygon":{"points":[{"longitude":5.648519075025133,"latitude":47.33389841046996},{"longitude":5.648519075025133,"latitude":-80.31395034614398},{"longitude":194.6133628249749,"latitude":-80.31395034614398},{"longitude":194.6133628249749,"latitude":47.33389841046996},{"longitude":5.648519075025133,"latitude":47.33389841046996}]}}}`
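For context, wrapping such out-of-range longitudes back into range is a one-liner; this is one possible client-side normalization, not necessarily what the service validation should do:

```python
def normalize_longitude(lon):
    """Map any longitude onto [-180, 180); e.g. 195 maps back to -165."""
    return (lon + 180.0) % 360.0 - 180.0
```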
I would suggest relaxing the validation to the range [-360, 360]. Because it is difficult to guarantee the efficiency of the change for all Elastic deployments across all CSPs, it is recommended to use a feature flag. Each CSP could then switch on and test this validation on its own.

---
**ADR - Update Info API and Git tags to support core and CSP specific implementation version** (Om Prakash Gupta, 2023-12-18)
https://community.opengroup.org/osdu/platform/system/home/-/issues/100

## Decision Title
Update Info API to support core and CSP specific implementation version
## Summary
This proposes directly supporting independent provider releases of services, which will reduce the amount of effort spent
re-deploying logic to provider implementations that are unaffected by the fixes / patches.
This requires changing procedures around build configuration, release management, and reporting via the `/info` endpoint.
## Status
_Note that this ADR was originally proposed as #98, but was re-written to include details that emerged during the discussion and
approval process._
- [x] Proposed
- [x] Approved
- [x] Implementing (incl. documenting)
- [ ] Testing
- [ ] Released
## Context
Most of the services in the OSDU Data Platform use a Service Provider Interface abstraction.
This abstraction allows for the bulk of the service logic and implementation of the external API to be done in a common core
library, yet allow for individual service providers to implement the backend logic differently.
To this point, these various libraries (the core and each of the provider implementations) have been kept version synchronized --
exactly identical versions.
During the release process, the Git tag has reflected that common version to enable easy lookup of the code used.
And, for runtime inspection, a special endpoint (`/info`) reports back the version, commit SHA, and several other pieces of information.
## Problem statement
Sometimes new patches -- typically bug fixes / security upgrades -- apply only to one particular provider implementation.
When coupled with the common synchronized version, this causes the core library version (as well as all other provider
implementations) to change versions to match -- even though no changes occurred in these libraries.
Moreover, when the tags are made for the patch, other providers appear to be out of date.
This puts those providers in the position to choose between redeploying code that is functionally equivalent or leaving existing
versions and trying to document that the upgrade had no effect on their logic.
## Proposed solution
### Scope Limitations
**Patches Only**
This proposal limits provider-specific releases to patches only -- no support is provided for a minor (feature additive) or major
(breaking) change.
This limitation reduces the complexity of the release strategy immensely, and covers most use cases well.
**Maven Focused**
The proposal was designed with core services in mind, which are predominantly built using Java code and a Maven build system.
The strategies implement well in these cases, but may be extensible to other build systems in different ways.
To maintain simplicity, this ADR will only refer to projects that use Maven, and have an SPI abstraction layer.
**Services Only**
This proposal does not apply to libraries or utilities.
### Side Effect
**No Longer Proper Semantic Version Style**
Tags will now have extra information that doesn't exactly match the semantic versioning style.
In cases where tags have 4 numeric parts, though, they can be directly compared left to right as you would expect.
For example:
* `1.10.2-ABC.4` < `1.10.2-ABC.5`
* `1.10.2-ABC.4` < `1.10.3`
* `1.10.2-ABC.4` < `1.11.0`
* `1.10.2-ABC.4` < `1.11.0-ABC.2`
The addition of the text element identifying the provider implementation is the biggest variation from a standard semantic version.
Also, the provider patch number will always be greater than the core patch version (there cannot be a `1.10.3-ABC.2`, for instance).
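The ordering rules above can be captured by parsing a tag into a four-part sort key. A sketch (`ABC` is the placeholder provider name used throughout this ADR):

```python
import re

TAG_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([A-Za-z]+)\.(\d+))?$")

def tag_key(tag):
    """Parse vX.Y.N or vX.Y.N-ABC.M into a left-to-right comparable tuple."""
    m = TAG_RE.match(tag)
    if not m:
        raise ValueError(f"not a release tag: {tag}")
    # A bare core tag sorts as provider patch 0, so vX.Y.N < vX.Y.N-ABC.M.
    provider_patch = int(m.group(5)) if m.group(5) else 0
    return (int(m.group(1)), int(m.group(2)), int(m.group(3)), provider_patch)
```

Tuple comparison then reproduces the example ordering, e.g. `tag_key("1.10.2-ABC.4") < tag_key("1.10.3")`.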
The `/info` endpoint will consist of two separate version numbers, each of which is semantic style in itself.
However, the two versions are related and constrained by each other.
If the core version is `X.Y.N` and the provider is `X.Y.M`, then you know the service is for the service group version `X.Y`, there
have been `N` patches made to the core code, and `M - N` patches made to the provider logic.
Converting between the core + provider version pair and the Git tag will not be possible in general, but will be computable in
specific circumstances (such as the matching pair case -- `[X.Y.N, X.Y.N]`), and can also be converted using a list of all tags or
knowledge of the order in which patches were applied.
As an example, the pair `[X.Y.1, X.Y.2]` is ambiguous.
If the core patch came first, then the final tag would be `vX.Y.1-ABC.2`.
But, if the provider patch came first, then the final tag would be `vX.Y.1`.
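The arithmetic on the version pair described above can be sketched directly (the returned field names are illustrative, not from the actual `/info` schema):

```python
def interpret_versions(core, provider):
    """Derive patch counts from a core/provider version pair X.Y.N, X.Y.M."""
    cx, cy, n = (int(p) for p in core.split("."))
    px, py, m = (int(p) for p in provider.split("."))
    if (cx, cy) != (px, py):
        raise ValueError("core and provider versions must share X.Y")
    return {
        "service_group": f"{cx}.{cy}",   # the X.Y service group
        "core_patches": n,               # patches to the core code
        "provider_patches": m - n,       # patches to the provider logic
    }
```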
### Build Configuration
**Summary**
* Provider-based POMs get a separate provider version, and the rest use a core version.
* Major + Minor will match between them
* Changes to a provider increments only the provider version
* Changes to core code increments ALL versions (core + all providers)
**Details**
The services are designed to have two distinct POM "trees", i.e. parent POMs plus a set of child modules.
One parent POM is stored in the repository root (for the main code), and one is in the `testing/` folder (for integration tests).
Each tree has a set of common code, which are organized as a code library and stored in `service-code` and
`testing/service-test-core` (where `service` is replaced with the specific service name).
Then, specific providers that have implementations show up in `providers/service-abc` and `testing/service-test-abc` (where, again,
`service` is replaced with the service name and `abc` is replaced with the provider name).
Both of the parent POMs and both of the core libraries should be synchronized to have identical version numbers, matching the
overall version of the service.
Then, for each provider implementation, the provider implementation and the provider testing library will be synchronized among
themselves.
This leads to a layout something like this:
```
+ pom.xml (CORE)
+ service-core/pom.xml (CORE)
+ providers
+ service-abc/pom.xml (ABC)
+ service-xyz/pom.xml (XYZ)
+ testing
+ pom.xml (CORE)
+ service-test-core/pom.xml (CORE)
+ service-test-abc/pom.xml (ABC)
+ service-test-xyz/pom.xml (XYZ)
```
Each file here is marked with the artifact version that it would use (by "artifact version", I mean `project.version` in the POM file).
Notice that the parent poms and the core libraries are all synchronized to use the same, base version of the service.
Each provider gets their own version, but will be synchronized between their main library and testing library.
Since only patches are permitted for provider specific versions, the first two version components will always match.
**Versioning Responsibility**
The Release Coordinators will manage all version changes as part of the release process.
MRs should not modify the artifact versions of the various POM files, nor the dependency versions on the self-built core libraries
-- that is, the core library for the same service.
They can, however, modify first-party or third-party library versions that come from outside sources, including os-core-common or a
provider's core common library.
`SNAPSHOT` library versions will still be used, so all artifact versions in all POM files will have a version ending in `-SNAPSHOT`.
On the default branch (`master` / `main`), both the core version and all provider versions will be set to the same value, since
all development here is occurring within the milestone.
The versions won't start to differentiate until a release is made -- and then will only differ on the release branch and tags.
**Every provider must use the latest core**
When a patch is made to the core library, it will be automatically applied to all providers.
We will not have any mechanism to skip a core patch for a particular provider or have a change made in the core library only apply
to a single provider.
With any change to the `service-core` code, everybody's version bumps and everybody redeploys.
This is not expected to be an onerous requirement -- patches are by their nature unlikely to cause compatibility issues.
If a particular patch would cause a lot of effort for a provider, that's a good indication that it isn't really a patch, but instead
a new feature that should wait for the next full milestone.
### Info Endpoint
**Summary**
* Each service's `/info` endpoint will need to report two different values -- the provider version and the core version
**Details**
All the core services have a special endpoint that returns information about the service, including the artifact version, the Git
commit ID, and more.
This endpoint will need to be modified to report two different fields -- `coreVersion` and `providerVersion` instead of just
`version`.
For example, you may see a return such as this:
```json
{
"groupId": "org.opengroup.osdu.indexer",
"artifactId": "indexer-azure",
"coreVersion": "1.10.0",
"providerVersion": "1.10.2",
"buildTime": "2023-01-17T12:30:00.500Z",
...
}
```
This work can be done independently from the release script / tagging changes.
Until complete, the reported version will be the core version only, which is acceptable during the transition.
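A client consuming `/info` during the transition could tolerate both response shapes. A minimal sketch, assuming the field names shown above and a fallback to the legacy `version` field (the helper itself is hypothetical):

```python
# Sketch: read core/provider versions from an /info payload, falling back
# to the legacy single "version" field during the transition period.

def read_versions(info: dict) -> tuple:
    core = info.get("coreVersion") or info.get("version", "unknown")
    # Until the split is implemented, report the core version for both.
    provider = info.get("providerVersion") or core
    return core, provider

new_style = {"coreVersion": "1.10.0", "providerVersion": "1.10.2"}
old_style = {"version": "1.10.0"}
print(read_versions(new_style))  # ('1.10.0', '1.10.2')
print(read_versions(old_style))  # ('1.10.0', '1.10.0')
```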
### MR Procedure Changes
**Summary**
* Labels for providers (such as ~AWS, ~Azure, ~GCP, ~IBM) and for core code (~"Common Code") must be used in every MR
**Details**
These labels are already in widespread use to help teams determine proper reviewers for code.
This proposal extends this to be used automatically by release scripts to determine whether a particular MR cherry-pick should
increment the core version or a particular provider version.
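The label-to-version mapping the release scripts would apply can be sketched as follows (the label names come from the proposal; the function and its shape are hypothetical, not an actual release-script API):

```python
# Sketch: decide which artifact versions a cherry-picked MR should bump,
# based on its labels.

PROVIDER_LABELS = {"AWS", "Azure", "GCP", "IBM"}
CORE_LABEL = "Common Code"

def versions_to_bump(mr_labels, all_providers=PROVIDER_LABELS):
    if CORE_LABEL in mr_labels:
        # A core change bumps the core version and every provider version.
        return {"core"} | set(all_providers)
    # A provider-only change bumps just the touched providers.
    return set(mr_labels) & set(all_providers)

print(sorted(versions_to_bump({"Azure"})))               # ['Azure']
print(sorted(versions_to_bump({"Common Code", "AWS"})))  # 'core' plus all providers
```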
### Release Process Changes
**Summary**
* Changes to provider code increment that provider's patch number
* Changes to core code increment all patch numbers
* Tags with only provider changes have the core version and the provider name / patch number -- `vX.Y.N-ABC.M`
* Tags with core changes have only the core version -- `vX.Y.N`
**Details**
The default branches will have all versions (core and provider) set to be the version being built for the upcoming milestone,
similar to how it is done currently.
Release branches will begin with all versions the same, but can evolve over time based on the kinds of patches that come in.
Here's an example flow that shows a couple of different kinds of patches.
_For purposes of example, this assumes that the release is being prepared for version `1.10` of the Core Service Group._
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T1 | Create Release Branch | Branch, release/1.10 | 1.10.0-SNAPSHOT | 1.10.0-SNAPSHOT | 1.10.0-SNAPSHOT |
| T2 | Create Tag | Tag, v1.10.0 | 1.10.0 | 1.10.0 | 1.10.0 |
| T3 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
**T1 -- Create Release Branch**:
This is the very first step, where the `release/1.10` branch is created for the first time.
The core and provider versions will match, and will be inherited from the default branch.
Immediately after this step, the default branch would be prepared for the next release, setting versions to `1.11.0-SNAPSHOT`.
**T2 -- Create Tag**:
At this point, the release branch has been tested, and any lingering work has merged in.
Now, we create a "release commit" -- a commit based on the release branch that alters the artifact versions, removing the
`SNAPSHOT` part.
That commit is tagged, and since it is the initial release, it gets the simple `v1.10.0` tag.
**T3 -- Prepare for Next Patch**:
Once the tag is made, the release branch immediately prepares for the next patch.
Versions of core and provider POMs are incremented to the next patch number (but kept as a `SNAPSHOT`) -- we are "building towards"
version `1.10.1`, therefore it is `1.10.1-SNAPSHOT`.
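The "building towards" step can be sketched as follows (an illustrative helper, not the actual release script):

```python
# Sketch: after tagging a release, the release branch immediately starts
# "building towards" the next patch as a SNAPSHOT version.

def next_snapshot(released: str) -> str:
    """1.10.0 -> 1.10.1-SNAPSHOT"""
    major, minor, patch = released.split(".")
    return f"{major}.{minor}.{int(patch) + 1}-SNAPSHOT"

print(next_snapshot("1.10.0"))  # 1.10.1-SNAPSHOT
```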
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T3 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
| T4 | Cherry-pick Provider ABC Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT | 1.10.1-SNAPSHOT |
| T5 | Create Tag | Tag, v1.10.0-ABC.1 | 1.10.0 | 1.10.1 | 1.10.0 |
| T6 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
**T4 -- Cherry-pick Provider ABC Patch**:
In this step, we imagine a cherry-pick MR that patches only provider "ABC".
We approve and merge that MR into the release branch, but version numbers don't immediately change as part of that operation.
**T5 -- Create Tag**:
Once we're happy with the integration test results for the branch, we repeat the step to create the tag.
A new commit is made, dropping all the `SNAPSHOT` suffixes from the versions.
But crucially, if the particular library has not had any code changes then the version is set to match the previous release tag.
In this case, we see that "Core" and "XYZ" are set to `1.10.0`, even though the release branch the tag was created from was
`1.10.1-SNAPSHOT`.
This is a counterintuitive consequence of using a single release branch to work on several different libraries at the same time.
Then, the tag is created using the core version first, then the provider name, then the patch number of the provider version.
We don't need the full provider version, because we know that the major and minor numbers will match.
In this case, `v1.10.0-ABC.1` because it is an "ABC" patch, and ABC's version is `1.10.1`.
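The tag-naming rule can be sketched as follows (a hypothetical helper, relying on the fact that major and minor numbers always match, as described above):

```python
# Sketch: derive the tag name for a release commit. A core release is
# tagged vX.Y.N; a provider-only release appends the provider name and
# that provider's patch number.

def tag_name(core_version, provider=None, provider_version=None):
    if provider is None:
        return f"v{core_version}"
    # Only the provider's patch number is needed -- major.minor match core.
    provider_patch = provider_version.rsplit(".", 1)[1]
    return f"v{core_version}-{provider}.{provider_patch}"

print(tag_name("1.10.1"))                   # v1.10.1
print(tag_name("1.10.0", "ABC", "1.10.1"))  # v1.10.0-ABC.1
```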
**T6 -- Prepare for Next Patch**:
Again, we prepare the release branch, incrementing the version for provider ABC to `1.10.2-SNAPSHOT`.
Other versions can remain at `1.10.1-SNAPSHOT` -- they are still "building towards" their first patch.
| Time | Event | Branch / Tag | Core Version | Provider ABC Version | Provider XYZ Version |
| ---- | ------------------------------ | -------------------- | --------------- | -------------------- | -------------------- |
| T6 | Prepare for Next Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
| T7 | Cherry-pick Core Patch | Branch, release/1.10 | 1.10.1-SNAPSHOT | 1.10.2-SNAPSHOT | 1.10.1-SNAPSHOT |
| T8 | Create Tag | Tag, v1.10.1 | 1.10.1 | 1.10.2 | 1.10.1 |
| T9 | Prepare for Next Patch | Branch, release/1.10 | 1.10.2-SNAPSHOT | 1.10.3-SNAPSHOT | 1.10.2-SNAPSHOT |
**T7 -- Cherry-pick Core Patch**:
In this step, we imagine a cherry-pick MR that patches the core code.
We similarly approve and merge that MR into the release branch, and version numbers remain as they are.
**T8 -- Create Tag**:
Now that we're happy with the code, we create a tag commit.
In this commit, every version is incremented from the last release (effectively just removing the `SNAPSHOT` part).
With this commit, we tag it with the core version only -- in this case `v1.10.1`.
The ABC provider library is at version `1.10.2` here -- it's been patched twice, once by itself and once as a consequence of
patching the common code.
But, that information doesn't show up directly in the tag spelling, instead it will be in the `/info` endpoint, the code, and can
also be deduced from the list of previous tags.
We do it this way to avoid needing to make multiple tags per release, explicitly naming each pairwise combination.
It is hoped that provider patches are rare enough that it is easier / better to use a simple single tag for all providers.
**T9 -- Prepare for Next Patch**:
One more time, we prepare the release branch, incrementing the version for all components to their next `SNAPSHOT` version.
**Final Results**
After all this, the ABC provider will see three tags: `v1.10.0`, `v1.10.0-ABC.1`, and `v1.10.1`.
All other providers will see two: `v1.10.0` and `v1.10.1`.

(Milestone: M17 - Release 0.20. Assignee: Om Prakash Gupta.)

https://community.opengroup.org/osdu/platform/deployment-and-operations/multi-region-deployment/multi-region/-/issues/13
**Federated Search PoC Design for Multi-Region OSDU** (Wan Ahmad Zaki Wan Mohammad Noor, 2023-08-15)

# Background
Multi-Region OSDU has long been an important yet complex community initiative since it was first introduced in Feb 2020 in this ADR. It was brought back to community focus in the June 2022 EA sub-committee F2F meeting. Since August of 2022, PETRONAS initiated the conversation with MSFT and SLB to collaborate and started brainstorming and researching requirements and feasible solutions. In the meantime, Shell has done a thorough and wide-reaching user voice gathering and produced a comprehensive and detailed “Multi Region Use Cases” document. Based on these efforts, MSFT has created a phased design proposal for multi-region OSDU that aligns with customer requirements as closely as possible and can be tackled in an incremental fashion. The outcome and test results from a prior phase PoC can determine whether a next phase is needed, and which direction should be taken in the next phase if needed. Once a PoC with acceptable SLA is reached, the implementation can begin. This phased approach will optimize the efforts and outcome.
# Scope and Purpose
This doc is intended to be used as a guide for the PoC work for the first phase, Federated Search approach. It includes detailed descriptions of the Federated Search approach, what new APIs are needed and how to implement them, what tests are needed, and what performance data will be collected to come up with estimated latency numbers under typical use scenarios. These numbers will determine whether the Federated Search approach meets user expectations. If it does, it can be implemented with the lowest cost compared with other approaches and deliver benefits to customers quickly. If not, we can move on to the next architectural model. Even if it is determined to be not meeting user expectations, the PoC work will be useful to pinpoint bottlenecks and gaps and establish baseline latency data for the next phase implementation.
# Federated Search Details
![image](/uploads/8336bc57d788888612ca0460c09a3dd2/image.png)
The Federated Search approach builds on top of the current OSDU implementation and requires no change to the existing APIs in the existing OSDU services. Several new federated search APIs will be added to Search service, Storage service, Well Delivery DDMS and Wellbore DDMS, and a new multi-region configuration service will be created for multi-region administration and configuration tasks. Three of the new APIs are in scope of the PoC. The new configuration service is not in scope of the PoC; manual configuration will be done instead.
### How Federated Search Works
**Deployment**
A global OSDU Administrator deploys multiple instances of OSDU to multiple CSP regions, each instance with its own data partitions. Data ingestion works the same as today. A data record only exists in the OSDU instance in the home region where the data is ingested. There is no raw data replication or catalog data replication or search index replication between instances.
**Region and Group Configuration**
The Administrator creates a global “Regions” table to hold all deployed OSDU instances’ IDs, endpoints, and data partition names. The Administrator configures “Cross-Region Partition Groups” to specify which partitions are logically associated to form a cross-region search group. Multiple such groups can be configured. The partition ID in the “Cross-Region Partition Groups” is in the format “Instance ID: Partition Name” to guarantee its uniqueness. In the actual implementation, the configuration APIs will be included in the new Multi-region Configuration Service. In the PoC, the configuration will be done by manually creating two Cosmos DB tables.
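For illustration, the two manually created tables might hold entries shaped like this (all instance IDs, endpoints, and partition names below are invented; the actual table schemas may differ):

```python
# Sketch: illustrative contents of the two PoC configuration tables.

regions_table = [
    {"instance_id": "inst-westus",   "endpoint": "https://osdu-westus.example.com",   "partitions": ["opendes"]},
    {"instance_id": "inst-westeu",   "endpoint": "https://osdu-westeu.example.com",   "partitions": ["opendes"]},
    {"instance_id": "inst-eastasia", "endpoint": "https://osdu-eastasia.example.com", "partitions": ["opendes"]},
]

# Partition IDs use "Instance ID: Partition Name" to guarantee uniqueness.
partition_groups = {
    "group-global-wells": [
        "inst-westus: opendes",
        "inst-westeu: opendes",
        "inst-eastasia: opendes",
    ],
}

print(len(regions_table), "regions,", len(partition_groups), "group")
```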
**Two New Global Search APIs and One New Global Storage Query API**
New “POST /api/global_search/v2/query” and “POST /api/global_search/v2/query_with_cursor” APIs will be added to the core Search Service. They are the counterpart APIs to the current local “POST /api/search/v2/query” and “POST /api/search/v2/query_with_cursor” APIs, which only search in a local partition. The Global Search APIs conduct a search across a “Cross-Region Partition Group” spanning multiple regions.
In the Storage Service, there is a set of four query APIs that retrieve data records in a local partition. Global versions of these four query APIs will be implemented to retrieve records from a “Cross-Region Partition Group”. For simplicity, in the PoC, only one new global query API, “GET /api/storage/v2/global_query/kinds”, will be created in the Storage Service as the global counterpart of the local API “GET /api/storage/v2/query/kinds”.
In Wellbore DDMS and Well Delivery DDMS, there are several local query APIs. Global versions of these APIs will need to be added in the actual implementation. For simplicity, they will not be included in the PoC.
The request body and request headers for the new global APIs are the same as for the local versions, except for the Partition ID in the request header. The new APIs require a “Cross-Region Partition Group” ID instead of a single Partition ID. This ID specifies the default partitions for the search. An optional query parameter, “remote_partitions”, allows the user to select a subset of the default partitions to search. For example, if a group includes “Instance 1: Partition A, Instance 2: Partition A and Instance 3: Partition A” and the “remote_partitions” query parameter has the value “Instance 1: Partition A, Instance 3: Partition A”, the global search will skip searching in Instance 2: Partition A. This optional query parameter gives users the flexibility to skip remote partitions to optimize performance.
The global APIs return the aggregation of the search results from all the partitions belonging to the group, or from the subset specified in the optional query parameter. The local partition is always implicitly included in the search. The optional query parameter will be validated against the “Cross-Region Partition Group” ID in the header. If any of the remote partitions does not belong to the group, a “400 Bad Request” error is returned. For simplicity, this validation will not be implemented in the PoC.
```
POST /api/global_search/v2/query?remote_partitions=p1, p2, …
Header: cross-region-partition-group-id
```
```
POST /api/global_search/v2/query_with_cursor?remote_partitions=p1, p2, …
Header: cross-region-partition-group-id
```
```
GET /api/storage/v2/global_query/kinds?remote_partitions=p1, p2, …
Header: cross-region-partition-group-id
```
For the new global APIs, the user must meet the entitlement requirements for data access in all the participating partitions. Otherwise, the global APIs return a “401 Unauthorized” error.
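The fan-out behavior these global APIs imply can be sketched as follows (the helper, its parameters, and the in-memory group are hypothetical; a real implementation would call each region's local API over HTTP and handle entitlements per partition):

```python
# Sketch: a global query fans out to the local partition plus the selected
# remote partitions and aggregates the results. The local partition is
# always implicitly included; requested partitions are validated against
# the group's membership.

def global_query(group_members, local_partition, remote_partitions, run_query):
    requested = remote_partitions
    if requested is None:
        # No filter given: search every remote partition in the group.
        requested = [p for p in group_members if p != local_partition]
    if any(p not in group_members for p in requested):
        raise ValueError("400 Bad Request: partition not in group")
    targets = [local_partition] + [p for p in requested if p != local_partition]
    results = []
    for partition in targets:
        results.extend(run_query(partition))  # one local query per partition
    return results

group = ["i1: A", "i2: A", "i3: A"]
fake = lambda p: [f"record-from-{p}"]
# Skip "i2: A" via the optional remote_partitions filter.
print(global_query(group, "i1: A", ["i3: A"], fake))
```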
# Testing Federated Search under Multi-Region Deployment
To adequately measure latency, one multi-region deployment will be created and multiple tests will be carried out for the PoC.
### Deployment and configuration
Three OSDU on Azure instances will be created in three different Azure regions across different continents (West US, West Europe, and East Asia), with each instance having one data partition. The “Regions” table and a “Cross-Region Partition Groups” table with one group that includes the three partitions from the three instances will be created manually. Remote settings for the three Elasticsearch clusters will be configured so that each has two remote clusters, using the Elasticsearch Cluster Update Settings API.
### Tests
Upload different test data sets to each instance and run manual tests in Postman. Test the local search and storage query APIs and their global counterparts with the same request body, and log the response time for each test. When running global API tests, choose different optional query parameters to select different numbers of remote partitions. Test with different kinds and different availability of data. The delta between the response time of a local request and that of the corresponding global request gives a rough estimate of the latency caused by cross-region search.
| Local | Global |
| ------ | ------ |
| Search/query | Search/query with 1 remote partition |
| | Search/query with 2 remote partitions |
| Search/query_with_cursor Initial request| Search/query_with_cursor Initial request with 1 remote partition |
|| Search/query_with_cursor Initial request with 2 remote partitions |
| Search/query_with_cursor request |Search/query_with_cursor request with 1 remote partition |
| | Search/query_with_cursor request with 2 remote partitions |
| Storage/query/kinds| Storage/query/kinds with 1 remote partition |
| | Storage/query/kinds with 2 remote partitions |
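The latency estimate described above amounts to a simple difference of means (all timings below are invented for illustration):

```python
# Sketch: estimate cross-region overhead as the delta between global and
# local response times for the same request body.

def latency_delta(local_ms, global_ms):
    """Difference of mean response times (global minus local), in ms."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(global_ms) - mean(local_ms)

local_runs  = [120.0, 130.0, 125.0]  # local search/query, ms
global_runs = [340.0, 360.0, 350.0]  # same query with 2 remote partitions
print(latency_delta(local_runs, global_runs))  # 225.0
```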
### Next Step After Testing
After the above tests are done and latency data is collected, the direction for the next step will be determined from that data.
If the latency does not meet the SLA, it indicates that the Federated Search approach is not an acceptable option, and we will move to the next architectural model, “External Partition”, which requires EDS shadow record replication to decrease search latency.
On the other hand, if the latency is within the acceptable SLA, an additional PoC step can be done: testing the fetching of a relatively large amount of raw data from a remote region using the File service. The details of this PoC step will be worked out later, when deemed necessary.
### POC Code Branch
- Search POC branch – [Code repository](https://community.opengroup.org/osdu/platform/system/search-service/-/tree/multiregion_poc2)
- Storage POC branch – [Code repository](https://community.opengroup.org/osdu/platform/system/storage/-/tree/multiregion_poc2?ref_type=heads)

https://community.opengroup.org/osdu/platform/deployment-and-operations/terraform-deployment-aws/-/issues/3
**Getting error while pulling images** (Sudhakar A, 2023-01-10)
Hi,
When running step 5 (Steps to Manually Deploy the Infrastructure with Make), we got the below error.
```
Error: osdu-ingest/os-ingestion-workflow failed to fetch resource from kubernetes: context deadline exceeded
│
│   with module.ingest.module.ingestion_worflow.kubectl_manifest.deployment,
│   on ingest/ingestion_workflow/kubernetes.tf line 41, in resource "kubectl_manifest" "deployment":
│   41: resource "kubectl_manifest" "deployment" {
│
╵╷
│ Error: osdu-seismic-ddms/os-seismic-store failed to fetch resource from kubernetes: context deadline exceeded
│
│   with module.seismic_store_services.module.seismic_store.kubectl_manifest.deployment,
│   on seismic_store_services/seismic_store/kubernetes.tf line 31, in resource "kubectl_manifest" "deployment":
│   31: resource "kubectl_manifest" "deployment" {
```
We found that the images.json file in “terraform-deployment-aws/deployment” contains Docker service images pointing to a repo (e.g. registry.repo.osdu.aws/airflow-dag-upload:latest) with the image tag `latest`, which is causing the error. We tried pulling the images manually and got a 404 Not Found error.
```
docker pull registry.repo.osdu.aws/os-policy-service:latest
Error response from daemon: error parsing HTTP 404 response body: no error details found in HTTP response body: "{}"
```
Can we please get some assistance on solving this error?

https://community.opengroup.org/osdu/platform/system/storage/-/issues/159
**Storage adds null meta to record ingested without** (An Ngo, 2023-03-22)

1. Record was ingested without specifying a "meta" block. The PUT API was successful.
2. Fetch the ingested record. Notice that Storage added "meta": null to the record.
**Checking with Search.**
Search indexed successfully. Status code was 200.
The search result does not return the meta block.
The current behavior is challenged: the meta block should not have been added at all, or, if added, it should be empty rather than null.
So instead of adding:
`"meta": null`
it should be:
`"meta": []`
Upon creating or updating a record, providing an empty meta block should also be allowed.