# Home issues
https://community.opengroup.org/osdu/platform/home/-/issues

---

# Azure GLAB/Pre-ship environments - change in authentication process
https://community.opengroup.org/osdu/platform/home/-/issues/55

## Background
- Due to recent security changes in the Azure tenant, one must have a personal access token to access the OSDU environments and test the services and workflows.
## Prerequisites
- Anyone who wants to access an environment (**Azure GLAB / Azure Pre-ship**) must be invited to the Azure tenant. After accepting the invitation, they must complete the onboarding process, which includes setting up an authenticator application/MFA.
- They should have the `tenant_id`, `client_id`, and `client_secret` handy for the environment for which they are generating an access token.
### How to request access
- Reach out to the Azure team in Slack or comment on this issue.
## Procedure to create a personal access token (`access_token`)
- Have the `tenant_id`, `client_id`, and `client_secret` of the specific environment (**Azure GLAB / Azure Pre-ship**) for which the access token is being generated.
- If access is required for both the GLAB and Pre-ship environments, this process should be followed separately with the two different sets of corresponding `client_id` and `client_secret` values.
- **Step 1.1**: Prepare the URL below, replacing the **<tenant_id>** and **<client_id>** parameters with the actual values for the targeted environment.
```bash
https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/authorize?client_id=<client_id>&response_type=code&redirect_uri=http://localhost:8080&response_mode=query&scope=<client_id>%2f.default&state=12345&sso_reload=true
```
- **Step 1.2**: After you replace the parameters, paste the request into the address bar of any browser (incognito window/private tab) and press Enter.
- **Step 1.3**: Sign in to your Azure portal with your organisation email id and its corresponding password.
- **Step 2.1**: The browser now authenticates the user, and you might see the "Hmmm...can't reach this page" error message in the browser. You can ignore it. <br>
![localhost-redirection-error](/uploads/cec250436b964112198d055cb3dac4ce/localhost-redirection-error.png)
- The browser redirects to `http://localhost:8080/?code={authorization code}&state=...` upon successful authentication.
- **Step 2.2**: Copy the response from the URL bar of the browser and fetch the text between **code=** and **&state**. This is known as **code** <br>
ex: http://localhost:8080/?code=**0.BRoAv4j5cvGGr0...au78f**&state=12345&session....
- Save the **code** as a Postman environment variable named `code`; it will be used in the next step.
- **Step 2.3:** Replace **<tenant_id>**, **< code >** (the code from step 2.2), **<client_id>** and **<client_secret>** in the following curl request, then execute the request using the Postman tool.<br>
- In Postman, click on New Request.
![new_postman_request](/uploads/6518b2026194ffbbf02e32c307e39f80/new_postman_request.JPG)
- Paste the following curl request in the URL section of the new request
![curl_request](/uploads/56d05fc44b1c81321521921fc9e13fa0/curl_request.JPG)
```bash
curl --location --request POST 'https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data grant_type="authorization_code" \
--data redirect_uri=http://localhost:8080 \
--data client_id="<client_id>" \
--data client_secret="<client_secret>" \
--data scope="<client_id>/.default openid profile offline_access" \
--data code={{code}}
```
- Sample response
```json
{
"token_type": "Bearer",
"scope": ".....",
"expires_in": 4557,
"access_token": "eyJ0eXAiOiJKV1QiLCJub25jZSI6IkJuUXdJd0ZFc...",
"refresh_token": "0.ARoAv4j5cvGGr0GRqy180BHbR8lB8cvIWGtHpawGN..."
}
```
- **Step 2.4:** Save the **access_token** and **refresh_token** in your local Postman environment.
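
The **refresh_token** can later be exchanged for a new **access_token** without repeating the browser flow. This is not part of the original procedure; it is a minimal sketch using the standard OAuth2 `refresh_token` grant against the same token endpoint, so verify it against your environment before relying on it.
```bash
# Hedged example: standard OAuth2 refresh_token grant against the same Azure AD v2.0 token endpoint.
curl --location --request POST 'https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data grant_type="refresh_token" \
--data client_id="<client_id>" \
--data client_secret="<client_secret>" \
--data scope="<client_id>/.default openid profile offline_access" \
--data refresh_token={{refresh_token}}
```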
- **Step 3.1:** Test a couple of OSDU services with this **access_token** to make sure you can access the OSDU environment with the newly generated token.
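
For example, a quick smoke test could look like the sketch below; the host name, service path (a read-only `/info` endpoint, assuming your deployment exposes one), and partition name are placeholders to substitute for your target environment.
```bash
# Hedged example: call a read-only OSDU endpoint with the newly generated token.
curl --location --request GET 'https://<osdu-host>/api/storage/v2/info' \
--header 'Authorization: Bearer <access_token>' \
--header 'data-partition-id: <data-partition-id>'
```
A `200` response indicates the token is accepted; a `401`/`403` suggests the token or entitlements are not set up correctly.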
---
## Common issues one can face during this process
1. Not replacing the placeholders (**<tenant_id>**, **< code >**, **<client_id>** and **<client_secret>**) with correct values in the URL/curl request.
**Resolution:** Double-check the URL and curl request and make sure the placeholders are replaced with the correct values.
2. The **< code >** generated during step 2.2 expires after 1 hour. Follow the process again from Step 1.1 if the **< code >** expires.
3. It is recommended to use the Postman tool for step 2.3 rather than Git Bash, Windows cmd, etc.
4. It is expected to see the "Hmmm...can't reach this page" / "Can't find the URL" error in the browser during step 2.1.

---

# Formalize need for BYOC backend for new project/service contributions
https://community.opengroup.org/osdu/platform/home/-/issues/27
When we have a contribution of a new project or a new service to OSDU, different dev teams need to be able to quickly try out the code and understand the behavior beyond the overview/walkthrough done by the contributing team.
However, in order for our dev teams to be able to run/debug the service and the associated integration tests themselves, they need access to the deployment environment of the contributor and the underlying CSP secrets, keys, etc., which causes latency in understanding and adoption of code within the OSDU platform.
BYOC back-end is intended for this purpose, but we haven't required this as part of any contribution. It may be worth considering this as a first step in project contribution after the code is authorized for inclusion in community Gitlab, so we can make the above easier.
### Requirements
* BYOC back-end implementation for any new service or project contribution to OSDU platform
* Ability for all CSPs and other platform developers to be able to run the service, test-suite independent of the original contributor's infrastructure
* Associated CI/CD updates to ensure it can be run independently w/BYOC and (ideally) on a developer workstation.
### Operator Input
* Pending, but this is more of a PMC governance and process streamlining issue.
### Definition of Done
* After code is scanned and validated for contribution, the contributing team works on a BYOC backend
* new service and/or project is able to pass the integration tests with a BYOC backend
* basic documentation is included on how to build/run against BYOC to learn about this service or project

---

https://community.opengroup.org/osdu/platform/home/-/issues/54

# ADR: Community Driver/Mapper Contributions: Repository Assignment
## Status
- [ ] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context
The Community Implementation of the OSDU Platform will feature a reusable and tested foundation, which can be deployed with custom technology stacks using sets of Drivers and Mappers, a.k.a. the 'South Decision Point.' This approach enables the customization of underlying resources for each cloud or specific environment without necessitating code changes. The framework accommodates the contribution of new Drivers/Mappers at a later stage.
https://community.opengroup.org/osdu/platform/system/lib/drivers
https://community.opengroup.org/osdu/platform/system/lib/mappers
## Problem Statement
If anyone wishes to introduce new sets of drivers/mappers, they may be unsure where to contribute them. This decision involves whether to add them to the existing Community repository, such as https://community.opengroup.org/osdu/platform/system/lib/drivers/os-obm, or to create a new repository not directly affiliated with community projects.
## Decision Options
**1. In the same repository, as a new module, for example,** https://community.opengroup.org/osdu/platform/system/lib/drivers/os-obm
Pros:
- Easier to align API (Java Interfaces) updates with custom drivers and update them accordingly.
- Ability to check whether updates are compatible with custom drivers.
- Easier to maintain versioning. Auto version increments could be done in place for each set of drivers.
Cons:
- Issues with custom drivers may affect the stability of community drivers. Vulnerability scans, build issues, failed tests, etc.
- Breaking changes in the driver API should be implemented by custom driver contributors without delays, or it could break the flow.
Sum: **This approach would favor custom drivers more, and the community implementation could be affected.**
**2. As a separate project, for example in group** https://community.opengroup.org/osdu/platform/system/lib
Pros:
- The community set of drivers will be kept isolated, simplifying their maintenance, etc.
- Updates to the driver API can be implemented by custom driver contributors at their own pace.
Cons:
- Breaking changes could be harder to adopt.
- Harder to keep versions up to date; custom driver maintainers must keep up to use the latest version.
Sum: **This approach would favor community drivers more, custom driver maintainers should keep up.**
## Decision
TBD
## Consequences
TBD

---

# ADR - Release management change for Core Libraries
https://community.opengroup.org/osdu/platform/home/-/issues/52

## Decision Title
Release management change for Core Libraries
## Status
- [x] Proposed
- [ ] Approved
- [ ] Implementing (incl. documenting)
- [ ] Testing
- [ ] Released
## Purpose
Change in release process for Core Libraries to have reduced impact on Code tagging process for milestone releases.
## Problem statement
Right before the code freeze, the core libraries and all services are upgraded for that milestone. If there is a major upgrade, e.g. a Spring Boot update or Jackson being upgraded from an older to a newer version, then because the services have been building against an older version of the core libraries, we will most of the time see compile-time or runtime errors across the services. That impacts the stability of the system, because all development work has to stop in order to sanitize the release branch so that the services are up and running and all checks pass, which is additional overhead.
## Proposed solution
Core libraries are not shipped to customers; they are used internally within the OSDU community. Hence, they do not need to follow the milestone versioning.
We can avoid the above-mentioned churn of upgrading library versions in services at every release by adopting the following versioning strategy for Core Libraries:
- Create independent versioning of Core Libraries.
- Do not cut a release branch at every release.
- Follow this versioning strategy when rolling out new versions of the Core Libraries:
  - Major version
    - Create a new major version when the release contains backward-incompatible changes in interfaces or model classes.
    - E.g. `id` in the `Record` class is changed to `recordId`.
  - Minor version
    - Use a minor version when additional methods are added to interfaces or new fields are added to model classes.
    - Also for changes in versions of dependencies - Spring Boot, Jackson, etc.
  - Patch version
    - Increment the patch version when bug fixes or security patches are applied to the library.
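
For illustration, the sketch below shows how the proposed scheme might be applied to a hypothetical core-library version; the version numbers and `v`-prefixed tag style are assumptions, not an OSDU convention.
```bash
# Hypothetical version history for a core library under the proposed scheme.
git tag v1.4.3   # patch: bug fix or security patch applied to the library
git tag v1.5.0   # minor: new interface methods/model fields, or a Spring Boot/Jackson upgrade
git tag v2.0.0   # major: backward-incompatible change, e.g. `id` renamed to `recordId` in Record
```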
With this approach we avoid patching core libraries right before the release and thereby reduce the amount of time spent stabilizing the services during the code-tagging process.
## Consequences
- We retire the -rc* versioning strategy. We no longer create release candidates in Core Libraries.
- Every commit on the Core Library will end up creating a new version depending on the type of the change.
## Target Release
M14
## Owner
Please contact @krveduru

---

# [ADR-0009 of wg-data-architecture] Universal_Data_Content_ARRAY_of_Values_API
https://community.opengroup.org/osdu/platform/home/-/issues/53

## Universal_Data_Content_ARRAY_of_Values_API
* The objective of this ADR is to propose defining a Common API to access all types of optimized storage of Data Content Arrays of Values.
* The information required to specify the behaviour of this API should be available from the catalog (in a shared context).
* This API should be implemented to access the optimized storage of Content Arrays of Values on the storage backends provided by the diverse DDMSs (e.g. Parquet files, oVDS file collections, PostgreSQL blobs) or from the Catalog itself.
* The overall objective is to allow a given DDMS-1 to link the Data Values previously provided by another DDMS-0 on its preferred "DDMS-0 native" storage to a DDMS-1 Data Content schema entity, and to read those values directly from the DDMS-0 storage without copying them onto DDMS-1 storage.
## Status
* [x] Proposed
* [ ] Approved
* [ ] Implementing (incl. documenting)
* [ ] Testing
* [ ] Released
## Context & Scope
Main objective : facilitate the delivery of optimized information gathered in the OSDU platform by Datasets and DDMSs
OSDU aims to be a cross-domain platform. Some core entities like Well and Wellbore are relevant to many domains, which may want to associate domain specific properties with the entities on different DDMSs.
Today a solution is presented to associate specific DDMS information with OSDU core entities in https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Guides/Chapters/93-OSDU-Schemas.md#appendix-d34-x-osdu-side-car-type-to-side-car-relationship
Goal 1: "Slim Entities". It is OSDU's goal to keep the shared context relevant to everybody (= domain independent) and unambiguous.
Goal 2: "Agile Domains". RDDMS Domains must be empowered to promote change for the benefit of the domain without impacting all other domains. As a consequence, schemas must be split into a shared context for interoperability and a bounded, domain-specific context for the domain.
**Side-Car Pattern for Schemas.** The shared context is captured by the 'main', domain-independent entities as a schema definition - in the analogy, the 'motorbike'. Bounded context for a domain is added by a side-car schema. The side-car entity extension refers to the shared context by id. This is illustrated in the following diagram: see OSDU_ARCHITECTURE/side_car_capture.png
The center column shows the shared context. Generic discoverability is provided via platform services like Search and GIS. Domain specific extensions are defined by the domains independently. Such extensions can use domain driven language, which may be ambiguous outside the bounded domain context. Often domains create their own Domain Data Management Services (DDMS). Such services understand the composition of shared and bounded contexts and can shield applications and users from the complexity of the side-car record implementations. This means that the DDMSs can return the combination of the bounded and shared contexts on queries.
But it would also be valuable to ensure that DDMS 2 can directly access Data Array Values attached to Core Entities defined in the shared context and generated by a DDMS 1. If that is not the case (as today), an application faces two situations: either the Data Array Values are accessed through the API of DDMS 1 without any relationship with the DDMS 2 bounded context, and the application has to rebuild the relationships itself; or the Data Array Values are first accessed through the API of DDMS 1, then copied into DDMS 2 in another "shape" and attached to its own bounded context.
Difficulty one: these two use cases are not satisfactory, yet they are very common (Seismic DDMS \<-\> Reservoir DDMS, Well DDMS \<-\> Reservoir DDMS, Seismic DDMS \<-\> Well DDMS, RAFSDDMS -\> Reservoir DDMS).
Difficulty two: today the Core entities do not carry "mandatory" information when the Data Array Values properties are defined (value types (boolean, integer, float, double, string), number of columns, size of columns). If an application does not deliver this information in the Catalog, it is impossible for another application to read and use the data. Interoperability between applications then becomes impossible because the DDMSs cannot deliver all content.
This ADR intends to address these two difficulties.
## Decision to be made
The shared context could fully define how all applications access Data Arrays of Values in file data content and DDMS data content. Data Arrays of Values are not difficult to describe, and we could deliver a Data Array Values API abstraction level which could be used for all file data content and all DDMS data content. Using this abstract API level, all Data Array Values of file data and DDMS data content could be accessed by a DDMS which was not at the origin of these values.
Note: the information required to authorize access to all Data Array Values should be mandatory (see the description of the method, item 3/).
Description of the proposed method to apply if the Data Array Values are embedded into external content (and if we add to a WPC (like WellLog) an abstract Column Based Table):
1/ The WPC designed to deliver the data content should have a link to a persistent support (e.g: mentioning a Datasetfile (could be a parquet file), DatasetfileCollection (could be an oVDS collection), uri of a DatasetETPdataspace, urn: etc..)
2/ Inside this dataset persistent support the Data Array of values concerning this WPC will be associated to the id of the WPC : (e.g: "id": "namespace:work-product-component--WellLog:c2c79f1c-90ca-5c92-b8df-04dbe438f414")
3/ Just after that, inside the WPC, the different information attached to the Data Content should also be accessible: "ColumnName", "ValueType" (double, number, string, boolean), "ValueCount" (number of columns or dimensions), "ColumnSize" (number of values in the column); the "ValueType" could be more detailed (see the Energistics data types in the ETP V1.2 documentation below). For each "ColumnName" we should also have: "UnitofMeasureID" + "UnitQuantityID" + "PropertyTypeID".
IMPORTANT WARNING : we should find a way to impose that this information MUST be present in the WPC (e.g: by enhancement of the Validation step during ingestion)
By default no more information is given but this looks enough to proceed.
Ex: for a WellLog WPC, here is the information to deliver into the Catalog: `"id": "namespace:work-product-component--WellLog:c2c79f1c-90ca-5c92-b8df-04dbe438f414"`, `"DDMSDatasets": ["urn://wddms-3/uuid:20840361-adc0-4842-999b-5639bd07bb38", ...]`, and for each column an entry such as `"ColumnName": "CO2-SAT-Fraction-VP"` ("array metadata" in Energistics ETP V1.2), `"ValueType": "double"` ("DataArrayType" in ETP V1.2), `"ValueCount": 1`, `"ColumnSize": 7` ("dimension" in ETP V1.2), `"UnitQuantityID": "namespace:reference-data--UnitQuantity:unitless:"`, and `"PropertyType": { "PropertyTypeID": "namespace:reference-data--PropertyType:8a9930de-6d50-4165-8bcd-8ddf2e6aa7fa:", "Name": "Co2 Volume Fraction" }`.
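
Rendered as JSON, the example above might look roughly like the block below. This is a best-effort reconstruction of the flattened text: the exact nesting of the column description relative to `DDMSDatasets` is ambiguous in the source, and the `Columns` wrapper is added here only for readability, so treat the shape as illustrative.
```json
{
  "id": "namespace:work-product-component--WellLog:c2c79f1c-90ca-5c92-b8df-04dbe438f414",
  "DDMSDatasets": [
    "urn://wddms-3/uuid:20840361-adc0-4842-999b-5639bd07bb38"
  ],
  "Columns": [
    {
      "ColumnName": "CO2-SAT-Fraction-VP",
      "ValueType": "double",
      "ValueCount": 1,
      "ColumnSize": 7,
      "UnitQuantityID": "namespace:reference-data--UnitQuantity:unitless:",
      "PropertyType": {
        "PropertyTypeID": "namespace:reference-data--PropertyType:8a9930de-6d50-4165-8bcd-8ddf2e6aa7fa:",
        "Name": "Co2 Volume Fraction"
      }
    }
  ]
}
```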
"Value Type" reference in Energistics ETP V1.2 documentation.
"Energistics.Etp.v12.Datatypes.ArrayOfBoolean", "Energistics.Etp.v12.Datatypes.ArrayOfNullableBoolean", "Energistics.Etp.v12.Datatypes.ArrayOfInt", "Energistics.Etp.v12.Datatypes.ArrayOfNullableInt", "Energistics.Etp.v12.Datatypes.ArrayOfLong", "Energistics.Etp.v12.Datatypes.ArrayOfNullableLong", "Energistics.Etp.v12.Datatypes.ArrayOfFloat", "Energistics.Etp.v12.Datatypes.ArrayOfDouble", "Energistics.Etp.v12.Datatypes.ArrayOfString", "Energistics.Etp.v12.Datatypes.ArrayOfBytes",
We can note that there is an "existing" method today when the Data Array Values are small in size and should preferably be embedded in the catalog; it restricts the "value type" to the original list. In this case we should use a "ColumnValues" tag with "number", "double", "string" or "boolean".
Now, if this information is embedded into the shared context, we will be able to access the Data Content Arrays of Values.
On the basis of this information we could provide the specification of an API to reference, write and read these Data Content value arrays.
This API could then be used "internally" by all DDMSs to associate these values with their own Data Content schema (bounded context). Depending on the context in which the Content Data Arrays of Values were stored, all DDMSs would be able to manage these data. As a first step, each "ingestion DAG" or DDMS could use this information to associate Data Content with the shared context.
They could use DataArray specific services to transfer large, binary arrays of homogeneous data values. For example, with Energistics domain standards (see ETP V1.2 protocol 9 page 287 : https://docs.energistics.org/EO_Resources/ETP_Specification_v1.2_Doc_v1.1.pdf), this data is often stored as an HDF5 file.
This API could provide a DataArray transfer which :
* Supports any array of values of different types (boolean, integer, float, doubles, string). This array data is typically associated with a data object (that is, it is the binary array data for the data object).
* Imposes no limits on the dimensions of the array. Multi-dimensional arrays have no limits to the number of dimensions.
* Was originally designed in the Energistics standard to support transfer of the data typically stored in HDF5 files, but can also be used to transfer this type of data when HDF5 files are not required or used (e.g. Parquet files, oVDS file collections, PostgreSQL bulk data, time-series DBs).
## Rationale
This proposal is based on experience gathered by the Energistics standards teams: an effective separation between the metadata on Arrays of Values (written in XML or JSON files) and the Arrays of Values themselves (written in a binary compressed format).
All DDMSs could talk to the Catalog at the metadata level about the Content Data Array Values. A DDMS can reference a Content Data Array of Values without copying it, and can benefit from the optimized access to Arrays of Values developed by another DDMS.
## Consequences
From a first query on the Catalog, all Data Content Arrays of Values will be accessible directly or through a more sophisticated DDMS query.
This will not imply a lot of change on the shared-context data definition side: e.g. update the abstract Column Based Table and add it to all WPCs which must handle Data Content Array Values. It is possible that some more data definition effort will be necessary to cover all Data Content Arrays of Values handled by the diverse DDMSs. The APIs of the DDMSs themselves will not change, but the link between the shared context and each bounded context should be updated: all DDMSs should deliver a way to reference, write and read their specific Data Content Arrays of Values from information contained in the Catalog (shared context).

---

# Removing Airflow 1.x support in favor of Airflow 2.x for M12+
https://community.opengroup.org/osdu/platform/home/-/issues/48

## Existing Practice / Background
Airflow 2.x was introduced in M10 release for various improvements over Airflow 1.x. In M10 Airflow 2.x was released with feature flags so that customers can choose to stay at Airflow 1.x as well.
## Motivation
The OSDU community has decided to deprecate support for Airflow 1.x and will completely remove the ability to stay at Airflow 1.x starting with the M12 milestone.
CSP Decision Approvals
- [x] AWS
- [x] Azure
- [x] IBM
- [x] GCP

---

# [ADR] Avoid the need to provide persistable reference information (Unit system, Coordinate Reference System)
https://community.opengroup.org/osdu/platform/home/-/issues/44

## Status
- [x] Proposed
- [x] Trialing
- [x] Under review
- [x] Approved
- [ ] Retired
## Context
Currently, the user needs to provide both the Reference Entity information and the persistable reference.
This is evident when the user needs to specify a unit of measure or a coordinate reference system.
This is inefficient and also error-prone.
What if the Data Loader or the user makes a mistake in the persistable reference value and the values are inconsistent?
The proposal is to save the user this trouble: let the user provide only the link to the existing Reference entity and its ID.
Programs such as Manifest-based Ingestion could then query the Reference value and add the required line to the JSON file that is used to actually store/populate the record.
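
As a purely hypothetical illustration of the idea (the field names loosely follow the common frame-of-reference `meta` block, but this is not an exact OSDU schema excerpt): the loader supplies only the reference-data ID, and ingestion looks up and injects the persistable reference.
```json
{
  "kind": "Unit",
  "name": "ft",
  "unitOfMeasureID": "partition-id:reference-data--UnitOfMeasure:ft:",
  "persistableReference": "{\"scaleOffset\":{\"scale\":0.3048,\"offset\":0.0},\"symbol\":\"ft\",\"baseMeasurement\":{\"ancestry\":\"Length\",\"type\":\"UM\"},\"type\":\"USO\"}",
  "propertyNames": ["TopMeasuredDepth", "BottomMeasuredDepth"]
}
```
In this sketch the user would provide only `unitOfMeasureID` (and `propertyNames`); the `persistableReference` line would be resolved and written by the ingestion program rather than typed by hand.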
See this high level diagram for an understanding of this proposal.
[ADR-for-persistableReference-issue.pptx](/uploads/95b748500e8f64544ff64a492836515d/ADR-for-persistableReference-issue.pptx)
You can find historical context in the following issues:
- https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-dags/-/issues/92
- https://community.opengroup.org/osdu/platform/system/reference/unit-service/-/issues/24
- https://community.opengroup.org/osdu/platform/system/reference/crs-conversion-service/-/issues/25
## Scope
Make suitable changes in code:
- Manifest-based Ingestion (both unit and CRS)
- Unit conversion
- CRS conversion
Likely impact for CSV Parser and WITSML Parser.
## Decision
Analyze impact (if adverse) caused by this suggested change :
1. Approve change to Manifest-based Ingestion
2. Approve change to Unit Conversion API
3. Approve change to CRS Conversion API
## Rationale
The proposal is a cleaner approach, since the lengthy persistable reference can be daunting for some users.
There is a chance of the user making a mistake in the persistableReference string.
Also, there can be contradiction in provided input (such as reference to certain entry in Reference data but different persistableReference).
## Consequences
- Revise sample JSON files (used for loading TNO, Volve data)
- Revise test Postman collections (Platform Validation team)
- Revise documentation (Data Loading)
- Revise API Swagger documentation

---

# Partition service
https://community.opengroup.org/osdu/platform/home/-/issues/31

## Status
- [X] Proposed
- [x] Trialing
- [x] Under review
- [x] Approved
- [ ] Retired
## Context & Scope
The OSDU data platform supports the concept of data partitions and the availability of multiple partitions for a single deployment.
A data partition provides the highest level of data isolation within a single OSDU deployment. All access rights are governed at a partition level and data is separated in a way that allows for the partitions life cycle and deployment to be handled independently from one another.
![n](/uploads/841b498ac93e5da86571878308db378e/n.png)
A data partition helps solve many concerns
- Data Access – More sensitive data can be deployed to a separate partition providing an isolated access boundary separate to the rest of their data providing higher security
- Data Management – Organizations can distribute partitions to different segments/units for self-management where appropriate for reasons of budgeting, access rights, rights of use, licensing etc.
- Joint Ventures - Partitions can be set up for joint ventures between different organizations
- Data Residency – An organization can deploy a partition to a location with unique residency/sovereignty concerns to the rest of their data. Data partitions are an orthogonal concern to the region/location of a deployment but can be used to help simplify concerns in cases of data sovereignty and residency concerns.
- Development - Organizations want to develop applications on top of their deployments. Data partitions allow the test data used in development to be kept completely separate from their real data stopping pollution scenarios in their workflows.
- SaaS delivery - ability to deliver different partitions for different customers. Supporting encryption of data in different partitions with customer specific keys for additional data protection.
Most of the APIs in OSDU expose a custom header 'data-partition-id'. The value of this points to the data partition the client is requesting access to.
In the current implementation the specific properties that relate to a partition are largely captured through the shared ‘TenantInfo’ data object. Other properties specific to a partition e.g. secrets are often stored in different key vaults or databases and also need to be retrieved.
These are then high-frequency usage points in the system and we want to rationalize these concerns in a central service to help maintain the integrity of the system.
## Decision
For R3 we will encapsulate the storage and retrieval of partition specific properties within its own service.
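
For illustration, retrieving partition-specific properties from such a service could look like the sketch below; the host, API path, partition name, and property names are assumptions (not part of this decision), so check them against your deployment.
```bash
# Hypothetical host and partition name -- replace with your deployment's values.
curl --location --request GET 'https://<osdu-host>/api/partition/v1/partitions/<data-partition-id>' \
--header 'Authorization: Bearer <access_token>'

# The response would be a map of partition-specific properties, for example:
# { "elastic-endpoint": { "sensitive": true, "value": "..." }, ... }
```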
## Rationale
This will encapsulate the storage and retrieval of partition specific information. This will decouple implementations away from shared databases improving maintainability.
We can also encapsulate the performance, reliability, and scaling needs behind this service. Retrieving partition information is a high frequency, single point of failure in the system so having each service duplicate this increases the risk of service availability.
## Consequences
Deployments will need to switch to store the partition specific properties via the service.
Services will need to be switched to retrieve partition specific properties via the service.
These changes can happen gradually over time and should be specific to each provider. One provider's consumption of this new service should not affect another provider's implementation.
# Trade-off Analysis - Input to decision
An alternative approach is the current Java shared code implementation. Here we see that a large amount of the partition specific properties are retrieved in a single POJO called ‘TenantInfo’. Other partition specific properties like Elastic Search credentials are retrieved from secret stores. There may be other variations on where this data may be retrieved from.
All of these are single points of failure in the system that are retrieved on every request. If you cannot get these properties, the request cannot be completed for any service that supports the data-partition-id header.
Therefore, each service must implement the logic to retrieve this information or at least each language/framework in OSDU needs a shared component for the most common properties.
These properties are needed on every request, and having multiple implementations makes a performant, elastic and reliable implementation both less likely and harder to achieve.
Similarly, coupling all services to specific databases in the system makes maintainability worse: there is tighter coupling of the system, so any changes at the data layer end up requiring cascading changes in multiple services.
However, these implementations do already exist and are working today. There is an effort involved in both creating this service and then migrating the other services to use this instead.
## Decision criteria and trade-offs
- Elasticity
- Performance
- Reliability
- Maintainability
- Cost of changeM1 - Release 0.1ethiraj krishnamanaiduFerris ArgyleDania Kodeih (Microsoft)Wladmir FrazaoJoeMatt Wiseethiraj krishnamanaidu2020-08-28https://community.opengroup.org/osdu/platform/home/-/issues/26Trade off analysis between generic Plug & Play vs Service Replacement2023-10-20T12:32:14ZMeena RathinavelTrade off analysis between generic Plug & Play vs Service ReplacementWe need to analyze and align on Plug & Play service vs Service Replacement
Also understanding whether this would be canonical for all customers or will be within **only OSDU **
The above is an action item coming from our weekly PMC Pr...We need to analyze and align on Plug & Play service vs Service Replacement
We also need to understand whether this would be canonical for all customers or whether it will apply within **OSDU only**.
The above is an action item coming from our weekly PMC Program Review meeting on 06/16.

---

# Issue Taxonomy
https://community.opengroup.org/osdu/platform/home/-/issues/28

We are often left to address the gaps from architectural principles (which stay at a pretty high and abstract level) to the actual implementation detail. Here is an attempt to bridge that gap by providing a set of Lightweight Architecture Decision Records (LADRs) which are simple to follow and can be implemented in a given team/project by the developers.
# Decision Title
Taxonomy for issues to ensure efficient tracking and governance of PMC projects
## Status
- [x] Initiated
- [x] Proposed
- [x] Trialing
- [x] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
The proposal can be found [**here**](https://community.opengroup.org/osdu/governance/project-management-committee/-/wikis/PMC-Issue-Taxonomy) for review.
It is suggested that after a first pass review of the proposal to refine, that this is put to trial for the System and Data flow projects using the proposed methodology.
The taxonomy guideline can be further refined based on practical application in these projects and then baselined.
## Decision
Stage-1: Review with CSP leads, program managers and OSDU lead to get first pass feedback.
## Rationale
GitLab issue lists and boards are very generic; we need to have visibility at the program level for macro tasks and also be able to define, assign and manage micro-level tasks within a repo or sub-project using the same system.
## Consequences
The current issue list is unwieldy and therefore becomes a management and governance challenge. A working model is in order to get us to a productive environment in Gitlab.
## When to revisit
Suggest 3 sprints of practical application in System and Data flow projects and to review after that
Aug 2020
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
* Ability to manage and track ADR record lifecycle
* Ability to manage and track macro backlog items for program level reporting, detailed backlog and defect items at project level for assignments
* Ability to define CSP specific issues and defects and track their assignments and progress
## Decision timeline
Aug 2020

---

# OSDU Code tagging and versioning
https://community.opengroup.org/osdu/platform/home/-/issues/24

We are often left to address the gaps from architectural principles (which stay at a pretty high and abstract level) to the actual implementation detail. Here is an attempt to bridge that gap by providing a set of Lightweight Architecture Decision Records (LADRs) which are simple to follow and can be implemented in a given team/project by the developers.
# Decision Title
## Status
- [ ] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
## Decision
## Rationale
## Consequences
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline

---

# Security Vulnerability issues for M9 (Assigned to IBM team)
https://community.opengroup.org/osdu/platform/home/-/issues/40

Address all ‘critical’ and ‘high’ severity vulnerabilities in this report: https://community.opengroup.org/groups/osdu/platform/-/security/vulnerabilities.
This issue is to track IBM team's tasks on:
1. CRS Conversion - @JINGDONG SUN(IBM)
1. Unit - @JINGDONG SUN(IBM)
**Additional Information**
The link below contains information on the vulnerability remediation process. For additional information on how to close out a vulnerability after it has been resolved please review step 7.
https://community.opengroup.org/groups/osdu/platform/-/wikis/Vulnerability-Remediation

---

# Security Vulnerability issues for M9 (Assigned to MSFT team)
https://community.opengroup.org/osdu/platform/home/-/issues/42

Address all ‘critical’ and ‘high’ severity vulnerabilities in this report: https://community.opengroup.org/groups/osdu/platform/-/security/vulnerabilities.
This issue is to track MSFT team's tasks on:
1. Schema - @Madhur Tanwani(MSFT)
1. Infra-azure-provisioning - @Madhur Tanwani(MSFT)
**Additional Information**
The link below contains information on the vulnerability remediation process. For additional information on how to close out a vulnerability after it has been resolved please review step 7.
https://community.opengroup.org/groups/osdu/platform/-/wikis/Vulnerability-Remediation

---

# DR: Issue priority and merge request labeling guide
https://community.opengroup.org/osdu/platform/home/-/issues/49

# Introduction
Today, during the issue review process in the daily dev call, we do not have a clear guideline on prioritizing issues to be fixed. We mainly rely on issue reporters to come up with a fix/merge request (MR). In cases where issues are reported without any fix/MR, depending on the urgency/impact, these issues still need to be addressed with the appropriate attention.
Similarly, we need to have clear labeling to understand the issue/MR reported.
This is an extension of the current [PMC Issue Taxonomy](https://community.opengroup.org/osdu/governance/project-management-committee/-/wikis/PMC-Issue-Taxonomy)
# Objective
Here is a proposal for the community to prioritize and categorize issues and MRs that are being reported. The labels will be used to indicate the state of the issues and merge requests.
## Issue labels
### Issue Type
| Category| Label | Description
| ------ | ------ | ------ |
| Under review | ~"Issue::Under Review" | This issue is currently under review, needs more information from the submitter and/or needs to be confirmed if it is the intended behavior of the service. Once confirmed, it could be a Defect, Backlog, ADR or DR. |
| Defect | ~"Issue::Defect" | A **Defect** is an issue that is a software error, flaw or fault that causes that project or repo or service to produce an incorrect or unexpected result per the OSDU standard or requirements or that it behaves in unintended ways. A defect can be further categorized as a defect in the common code in the PMC project or as a defect within the CSP realization of the PMC project to ensure that it can be targeted towards the right development resource to address this issue. |
| Feature | ~"Issue::Feature Request" | A **Feature** is an issue that is either a new requirement that needs software enhancement, new feature development (perhaps requiring new repos, sub-projects or new PMC projects). These need to follow a template (see below) that provides enough clarity on the requirement, the definition of done and other necessary attributes so the issue can be curated and moved up in life cycle. Requires an entry in Aha portal. |
| Non Issue | ~"Issue::Non Issue" |A **Non Issue** is an issue that is not necessarily a defect or backlog, could be used by developers as a task to track the ongoing activity, clean-ups |
| Architecture Decision Record - ADR | ~"Issue::Architecture/Technology" | An **architecture decision record** (ADR) is the result of an issue backlog, defect or a new OSDU standard that has triggered the need for a new design (perhaps requiring new technology selections or architecture patterns). |
| Decision Record on a process - DR | ~"Issue::Process Decision" | A **decision record** (DR) is the result of process shortcomings, where a new OSDU practice has been triggered to address them (perhaps requiring new process or operation patterns). |
Issues should also be flagged with the milestone:
| Label | Description |
| ------ | ------ |
| ~"M12" | Milestone where issue is discovered |
Issue labels show the state of the issues and should be used alongside priority labels to indicate the urgency of the issue and prioritize resources to address the issue.
### Issue life-cycle
| Category| Label | Description
| ------ | ------ | ------ |
| Backlog | ~"KB::Backlog" | Label applied to indicate that issues are confirmed, but no active work are in progress. Pending volunteers. Label should be used alongside Confirmed issue. |
| Fix in progress | ~"KB::In Progress" | Label applied to indicate that issues are confirmed and fixes are in progress. Label should be used alongside Confirmed issue. |
| Done | ~"KB::Done" | Label applied to indicate that issues are confirmed and fixes are done. Issues can be closed. Label should be used alongside Confirmed issue. |
### Affected responsibility
| Category| Label | Description
| ------ | ------ | ------ |
| Confirmed issue | ~"Common Code" ~"AWS"<br /> ~"Azure"<br /> ~"GCP"<br /> ~"IBM" | Label applied to identify issue that has been confirmed and is affecting for all CSPs (common), Azure, IBM , GCP or AWS. |
## Priority labels
A priority label needs to be assigned alongside the issue label to indicate the urgency. Developers/volunteers should work on issues according to the agreed priority label.
| Label | Description |
| ------ | ------ |
| ~"Priority::Critical" | <ul><li>Catastrophic issue identified - Severe impact, contain breaking workflow/data loss, zero-day/critical security vulnerabilities</li><li>No workaround and should be fixed as an immediate priority</li><li>Need to be released as a patch during regular milestone cycle as soon as a fix is available</li></ul>|
| ~"Priority::High" | <ul><li>Major issue identified - High impact, might contain breaking workflow/data loss, critical/high security vulnerabilities</li><li>There is a workaround that exists but should be fixed as the next priority</li><li>Might need to be released as a patch during regular milestone cycle/N+1 milestone release</li></ul>|
| ~"Priority::Medium" | <ul><li>Serious issue identified - Medium impact, no breaking workflow/no data loss, high/medium security vulnerabilities</li><li>There is a workaround that exists and should be fixed after high priority</li><li>Can be released in N+1 milestone release</li></ul>|
| ~"Priority::Low" | <ul><li>Minor issue identified - Low impact, no breaking workflow/any workflow, medium/low security vulnerabilities</li><li>There is a workaround that exists</li><li>Can be released in N+1 or more milestone releases</li></ul>|
## Use cases
### Issue
| Label | Description |
| ------ | ------ |
| ~"Issue::Defect" ~"KB::Done" ~"Common Code" | A labeling strategy for defect that has been resolved after related MR(s) are merged |
### Merge request
| Label | Description |
| ------ | ------ |
| ~"MR::Bugfix" ~"Common Code" | A labeling strategy for defect that has been resolved after related MR(s) are merged |https://community.opengroup.org/osdu/platform/home/-/issues/47Swagger UI throws "Whitelabel Error Page" for several services2023-08-02T10:04:35ZAn NgoSwagger UI throws "Whitelabel Error Page" for several servicesFor several services, the swagger page is showing "Whitelabel Error Page".
![image](/uploads/dc287779032213e02f5056e4390a23bb/image.png)
This error is observed after the swagger upgrade.
This is due to the reason that there are some m...For several services, the swagger page is showing "Whitelabel Error Page".
![image](/uploads/dc287779032213e02f5056e4390a23bb/image.png)
This error is observed after the swagger upgrade.
This is because there are some missing/inconsistent updates related to swagger in several services.
For example - In case of Entitlements Service, the swagger commit involved changes in the **[azure-istio-auth-policy.yaml](https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/merge_requests/175/diffs?commit_id=56fbdf335f8ab29fd794314118b2fb3e5654ad39#acaaad215c25b431962f64748aa1c65fe3699abd)** but [storage service](https://community.opengroup.org/osdu/platform/system/storage/-/commit/191597a47d10ba3f69e42148e3c5c1e7c7ca04f9#022b6d114052cad09a79e365f1af9879bf04fe63) is missing this update.
Same is the case for the other services.
- It is mostly because in some services, this file does not exist within the service repository but in the [**infra-azure-provisioning**](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/charts/osdu-istio-auth/templates).
- For some services like `Notification` - this same [**file**](https://community.opengroup.org/osdu/platform/system/notification/-/tree/master/devops/azure/chart/templates) exists with a different name, so it's missed.
- Same for `Register` service - the [**file**](https://community.opengroup.org/osdu/platform/system/register/-/blob/master/devops/azure/chart/templates/osdu-istio-policy.yaml) exists with a different name
These inconsistent updates are likely causing the swagger pages to be broken.
**Impacted services:**
- CRS Catalog Service
- CRS Conversion Service
- Legal Service
- Partition Service
- Register Service
- Seismic Data Management Service
- Search Service
- Spatial Ref Service (no swagger endpoint)
- Storage Service
- Unit Service
- Workflow Service
**These ones are working:**
- Entitlements
- Indexer
- Search Extension
- Schema
- Wellbore DMS
- File
- Workflow

---

# Version Endpoint [GONRG-2681]
https://community.opengroup.org/osdu/platform/home/-/issues/36

I'd like to propose a new endpoint for all services to retrieve the version information. I'm most interested in the tag version / upcoming tag version. My thinking is maven-centric, but I'd like to get the artifact version injected into the jar files and available via a simple GET endpoint.
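
For illustration, a response from such an endpoint might look like the sketch below; the field names are an assumption loosely modelled on typical build-info fields (artifact coordinates, version, build time, branch, commit), not a settled contract.
```json
{
  "groupId": "org.opengroup.osdu",
  "artifactId": "os-storage",
  "version": "0.15.0",
  "buildTime": "2022-01-01T00:00:00Z",
  "branch": "master",
  "commitId": "abc1234",
  "commitMessage": "Update version info"
}
```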
I have several use cases in mind right now:
1. The end customer should have a way to query their environment to know what versions they are running. Then they would know that patches they see coming in on community have been applied to their environment. The Admin UI may be able to use this endpoint to make that query easier.
2. Application developers would be able to query versions of the services they are working with to determine compatibility.
3. The CI pipeline can query the running instances of dependent services, and issue a warning if the major/minor doesn't match the currently executing one. Having branch names or commit hashes would further improve this, but that isn't part of my initial thinking.
What complexities and challenges do you see in trying to provide this information?
https://jiraeu.epam.com/browse/GONRG-2681
Scope for M7:
storage,
search,
Indexer-queue,
Indexer,
Legal

---

# Release management for Core Libraries
https://community.opengroup.org/osdu/platform/home/-/issues/50
Change in release process for Core Libraries to have reduced impact on the Code tagging process for Milestone releases.

Ref: https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/79

---

# Update Info API to support Core Version and implementation Version
https://community.opengroup.org/osdu/platform/home/-/issues/51

https://gitlab.opengroup.org/osdu/subcommittees/ea/work-products/adr-elaboration/-/issues/78

---

# Simplify MR process for lib version upgrade
https://community.opengroup.org/osdu/platform/home/-/issues/46

**Existing Practice / Background**
Currently in PMC MR process, changes to common code require at least 2 CSP teams’ approval before merge. Common Code has been interpreted to include all code in the *-core/ directories, the core common library, and all shared build scripts / dependency lists.
**Motivation**
Many of these reviews would be quick to perform, but they still require a mental context switch for the developers. Thus gathering these approvals can take some amount of time, due to the limited availability of the developers.
However, maintaining a secure system requires that dependencies are frequently and quickly updated. First-party dependencies -- that is, OSDU libraries -- should be updated as soon as possible to keep the system consistent and apply bugfixes across all services quickly. Third-party dependencies may also necessitate quick updates across all services, especially in the case of critical security vulnerabilities. Efficient deployment of these kinds of upgrades is more valuable than the extra reviews.
**Simplified Procedures**
MRs that only include changes to library dependencies, plus minor/obvious code changes to implement the dependency upgrade, can be merged by maintainers on the basis of a passing pipeline, without requiring additional approvals.
* “Minor/obvious” code changes include things like changing package names, updating call signatures in ways that do not affect the semantics of the call, etc. The MR author must use their discretion on whether the changes are minor or not; when in doubt, they should seek approval from the other teams.

---

# Definition of Done for R3 Clarification
https://community.opengroup.org/osdu/platform/home/-/issues/22

What is the meaning of "at runtime" in this page: https://community.opengroup.org/osdu/platform/home/-/wikis/Planning/R3/Done