# OSDU Software issues
https://community.opengroup.org/groups/osdu/-/issues

---

**SeismicAcquisitionSurvey schema upgrade - handling of vessel name**
https://community.opengroup.org/osdu/platform/system/reference/schema-upgrade/-/issues/5
Debasis Chatterjee (2024-01-01)

@vikashoode -
Please see test results here.
https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M22/Test_plan_Results_M22/Core%20Services/M22-Azure-Schema-Upgrade-steps-Debasis.docx
Input schema version 1.0.0 supports simple vessel name.
New schema version 1.1.0 supports vessel name under two separate blocks (Source and Receiver configuration).
Your current approach duplicates the value from the source record into both the source vessel and the receiver vessel.
I suggest that you show the value in only one place (ex: source vessel) and leave the other empty.
Also show a warning to alert the user about this action (a sketch of this mapping follows below).
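Purely as an illustration of that suggestion, here is a minimal Python sketch of the 1.0.0 to 1.1.0 vessel-name mapping; the field names (`VesselName`, `SourceConfigurations`, `ReceiverConfigurations`) are placeholders, not the exact schema property names:

```python
import warnings

def upgrade_vessel_name_1_0_0_to_1_1_0(record: dict) -> dict:
    """Sketch: copy the single 1.0.0 vessel name into the source block only,
    leave the receiver block empty, and warn the user about the choice."""
    data = dict(record.get("data", {}))
    vessel_name = data.pop("VesselName", None)  # placeholder 1.0.0 field name
    if vessel_name is not None:
        data["SourceConfigurations"] = [{"VesselName": vessel_name}]  # placeholder 1.1.0 block
        data["ReceiverConfigurations"] = []  # intentionally left empty for the user to review
        warnings.warn("VesselName was written to the source configuration only; "
                      "review the receiver configuration manually.")
    upgraded = dict(record, data=data)
    upgraded["kind"] = record["kind"].replace("1.0.0", "1.1.0")
    return upgraded
```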
Thank you

---

**Provide suitable error message if user attempts "jump" upgrade (ex: 1.0.0->1.3.0 for SeismicAcquisitionSurvey)**
https://community.opengroup.org/osdu/platform/system/reference/schema-upgrade/-/issues/4
Debasis Chatterjee (2024-01-01)

I tried this by mistake. The service response was clean and did not indicate any problem.
However, the converted record remained the same, except that the schema version showed 1.3.0.
Suggest that you issue a meaningful error message and abort the conversion.
Note from @vikashoode -
Currently, the schema upgrade lacks the capability to directly transition from version 1.0.0 to 1.3.0.
To address this, we are developing a feature that will chain upgrade operations, allowing a step-by-step progression, such as 1.0.0 to 1.1.0, then 1.1.0 to 1.2.0, and finally 1.2.0 to 1.3.0. This approach involves temporarily storing intermediate states in a cache or temporary memory.
However, this feature is still a work in progress. I still need to programmatically determine the number of progression steps involved and handle potential errors in the intermediate upgrade processes.
For the time being, a manual process is required:
Upgrade from 1.0.0 to 1.1.0.
Upgrade from 1.1.0 to 1.2.0.
Upgrade from 1.2.0 to 1.3.0.
This interim solution will be replaced by the chained upgrade feature once it is fully implemented and tested.
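For illustration only, a rough Python sketch of how such a chained upgrade could walk through the intermediate versions and reject unsupported jumps (the per-step functions are placeholders):

```python
def chained_upgrade(record: dict, steps: dict, source: str, target: str) -> dict:
    """Apply single-step upgrade functions in sequence, e.g. 1.0.0 -> 1.1.0 -> 1.2.0 -> 1.3.0,
    keeping each intermediate result in memory and failing loudly for unknown paths."""
    versions = ["1.0.0", "1.1.0", "1.2.0", "1.3.0"]
    if source not in versions or target not in versions or versions.index(source) >= versions.index(target):
        raise ValueError(f"Unsupported upgrade path {source} -> {target}")
    i, j = versions.index(source), versions.index(target)
    for frm, to in zip(versions[i:j], versions[i + 1:j + 1]):
        record = steps[(frm, to)](record)  # each entry performs one single-version upgrade
    return record

# Usage with placeholder steps that only bump the version in the kind string:
bump = lambda v: (lambda rec: {**rec, "kind": rec["kind"].rsplit(":", 1)[0] + ":" + v})
steps = {("1.0.0", "1.1.0"): bump("1.1.0"), ("1.1.0", "1.2.0"): bump("1.2.0"), ("1.2.0", "1.3.0"): bump("1.3.0")}
print(chained_upgrade({"kind": "osdu:wks:master-data--SeismicAcquisitionSurvey:1.0.0"}, steps, "1.0.0", "1.3.0"))
```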
Regards,
Vikas

---

**Integration tests. Data partitions are hardcoded**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/issues/318
Yan Sushchynski (EPAM) (2024-03-18)

Hello.
We were adding a Google Cloud implementation for the service and ran into an issue: the integration tests hardcode the data-partition-id value to `opendes` (e.g., [here](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/blob/google_cloud_impl/tests/integration/config.py?ref_type=heads#L119), or [here](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/blob/google_cloud_impl/tests/integration/data_provider.py?ref_type=heads#L48)).
This causes two problems:
1. We do not have a data partition with the name `opendes` and we are constantly getting the [error](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/jobs/2483995#L2)
2. Since we support multi-partition deployments, we want to pass this data-partition-id dynamically (see the sketch after this list).
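One possible way to make this configurable, sketched here under the assumption of a `DATA_PARTITION_ID` environment variable (the variable name is ours, not an existing project convention):

```python
import os

# Fall back to "opendes" so current environments keep working, while any other
# deployment can inject its own partition without editing the test sources.
DATA_PARTITION_ID = os.environ.get("DATA_PARTITION_ID", "opendes")

def build_headers(token: str) -> dict:
    """Headers for the integration-test requests, with the partition injected dynamically."""
    return {
        "Authorization": f"Bearer {token}",
        "data-partition-id": DATA_PARTITION_ID,
        "Content-Type": "application/json",
    }
```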
For now, we are going to mark our GC tests as allowed to fail in the CI/CD.

---

**reading zgy directly from cloud**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-zgy/-/issues/30
Qiang Fu (2023-12-27)

Is there any example of reading ZGY files on the cloud from providers with OpenZGY, if that is supported?

---

**Add trigger trusted to the CICD**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/lib/wellbore-cloud/wellbore-gcp-lib/-/issues/2
Yan Sushchynski (EPAM) (2023-12-21)

---

**Unhandled Exceptions for missing required attributes while creating record**
https://community.opengroup.org/osdu/platform/system/storage/-/issues/195
Anubhav Bajaj (2024-03-15)

Issue: Currently, the storage PUT endpoint lacks proper error messages if there are missing attributes in the payload. It shows a generic message which is not informative enough for the user to address: "HV000028: Unexpected exception during isValid call."
Ideally the error message should clearly list out the missing attributes such as 'kind', 'acl', or 'legal'.
Below is a sample example where acl is null; the response gives the generic message.
![image](/uploads/738760ea17a8bce24fc4615d5d26920b/image.png)
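For illustration only (the Storage service itself is Java), a small Python sketch of the kind of pre-validation that would produce the explicit messages suggested in the table below:

```python
REQUIRED_ATTRIBUTES = ("kind", "acl", "legal")

def validate_record(record: dict) -> list:
    """Return one explicit error message per missing or empty required attribute."""
    errors = []
    for attr in REQUIRED_ATTRIBUTES:
        if not record.get(attr):  # missing, None, or empty
            errors.append(f"Mandatory fields missing- {attr} / {attr} cannot be empty")
    return errors

# A record whose acl is null would yield a single, actionable message instead of HV000028:
print(validate_record({"kind": "opendes:wks:well:1.0.0", "acl": None, "legal": {"legaltags": ["tag"]}}))
```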
Suggestions
• Add cases where these required attributes are null, with relevant error messages like:
| Missing Attributes | Suggested Error Messages |
|--------------------|--------------------------|
| Kind | Mandatory fields missing- kind / kind cannot be empty |
| Acl | Mandatory fields missing- acl / acl cannot be empty |
| Legal | Mandatory fields missing- legal / legal cannot be empty |
| Acl and Legal | Mandatory fields missing- acl, Mandatory fields missing- legal / acl cannot be empty, legal cannot be empty |
| Kind, Acl and Legal | Mandatory fields missing- kind, Mandatory fields missing- acl, Mandatory fields missing- legal / kind cannot be empty, acl cannot be empty, legal cannot be empty |

---

**IndexMapping not updated with AsIngestedCoordinates fields**
https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/122
Konrad Krasnodebski (2024-01-15)

The current AsIngestedCoordinates feature implementation updates the index mapping according to which AsIngestedCoordinates fields occur in a record. The index mapping mechanism has a caching functionality which checks whether the mapping was synced. This mechanism can disrupt mapping updates for different records of the same kind.
Related MR: [!650 (merged)](https://community.opengroup.org/osdu/platform/system/indexer-service/-/merge_requests/650)

---

**Servicebus library upgrade for Notification service**
https://community.opengroup.org/osdu/platform/system/notification/-/issues/56
Alok Joshi (2023-12-18)

The Notification service uses the package below for its service bus operations.
```xml
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-servicebus</artifactId>
```
This package is outdated and should be moved to the `com.azure:azure-messaging-servicebus` package, as [recommended by MSFT](https://mvnrepository.com/artifact/com.microsoft.azure/azure-servicebus).

---

**[ADR-0009 of wg-data-architecture] Universal_Data_Content_ARRAY_of_Values_API**
https://community.opengroup.org/osdu/platform/home/-/issues/53
jean-francois RAINAUD (2023-12-18)

## Universal_Data_Content_ARRAY_of_Values_API
* The objective of this ADR is to propose to define a Common API to access all types of optimized storage of Data Content Array Of Values.
* The Information required to specify the behaviour of this API should be available from the catalog ( in a shared context)
* This API should be implemented to access the optimized Storage Content Array of Values on the supports provided by the diverse DDMSs (e.g: parquet file, oVDS file Collection, PostGreSQL blobs) or from the Catalog itself.
* The overall objective is to allow a given DDMS-1 to link the Data Values previously provided by another DDMS-0 on its preferred "DDMS-0 native" support to a DDMS-1 Data Content schema entity, and to get the values directly from the DDMS-0 support without copying them onto a DDMS-1 support.
## Status
* [x] Proposed
* [ ] Approved
* [ ] Implementing (incl. documenting)
* [ ] Testing
* [ ] Released
## Context & Scope
Main objective : facilitate the delivery of optimized information gathered in the OSDU platform by Datasets and DDMSs
OSDU aims to be a cross-domain platform. Some core entities like Well and Wellbore are relevant to many domains, which may want to associate domain specific properties with the entities on different DDMSs.
today a solution is presented to associate specific DDMS information to OSDU core entities in https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/blob/master/Guides/Chapters/93-OSDU-Schemas.md#appendix-d34-x-osdu-side-car-type-to-side-car-relationship
Goal 1: "Slim Entities" It is OSDU's goal to keep the shared context relevant to everybody (=domain independent) unambiguous.
Goal 2: "Agile Domains" RDDMS Domains must be empowered to promote change for the benefit of the domain without impacting all other domains. As a consequence schema must be split into,a shared context for interoperability, and a bounded, domain specific context for the domain.
Side-Car Pattern for Schemas The shared context is captured by the 'main', domain-independent entities as schema definition - in the analogy the 'motorbike'. Bounded context by a domain is added by a side-car schema. The side-car entity extension refers to the shared context by id. This is illustrated in the following diagram: see OSDU_ARCHITECTURE/side_car_capture.png
The center column shows the shared context. Generic discoverability is provided via platform services like Search and GIS. Domain specific extensions are defined by the domains independently. Such extensions can use domain driven language, which may be ambiguous outside the bounded domain context. Often domains create their own Domain Data Management Services (DDMS). Such services understand the composition of shared and bounded contexts and can shield applications and users from the complexity of the side-car record implementations. This means that the DDMSs can return the combination of the bounded and shared contexts on queries.
But it would also be valuable to ensure that DDMS 2 can directly access the Data Array Values attached to Core Entities defined in the shared context and generated by a DDMS 1. If that is not the case (as today), an application faces two situations: either the Data Array Values are accessed through the API of DDMS 1 without any relationship to the DDMS 2 bounded context, and the application has to rebuild the relationships itself; or the Data Array Values are first accessed through the API of DDMS 1 within the bounded context of DDMS 2, and these Data Array Values, in another "shape", are copied into DDMS 2 and attached to its own bounded context.
Difficulty one: these two use cases are not satisfactory and are very common (Seismic DDMS \<-\> Reservoir DDMS, Well DDMS \<-\> Reservoir DDMS, Seismic DDMS \<-\> Well DDMS, RAFS DDMS -\> Reservoir DDMS).
Difficulty two is the fact that today the Core entities do not carry "mandatory" information when the Data Array Values properties are defined (value types (boolean, integer, float, double, string), number of columns, size of columns). If an application does not deliver this information in the Catalog, it is impossible for another application to read and use the values. Interoperability between applications becomes impossible because the DDMSs cannot deliver all of the content.
This ADR intends to address these two difficulties.
## Decision to be made
The shared context could take care of defining completely how all applications can access Data Arrays of Values in file Data Content and DDMS Data Content. Data Arrays of Values are not difficult to describe, and we could deliver a Data Array Values API abstraction level which could be used for all file Data Content and all DDMS Data Content. Using this abstract API level, all Data Array Values of file Data Content and DDMS Data Content could be accessible to a DDMS which was not at the origin of these Data Array Values.
Note: the information needed to authorize access to all Data Array Values should be mandatory (see item 3/ of the method description below).
Description of the proposed method to apply if the Data Array values are embedded into an external content (and if we add to a WPC (like WellLog) an abstract Column Based Table):
1/ The WPC designed to deliver the data content should have a link to a persistent support (e.g: mentioning a Datasetfile (could be a parquet file), DatasetfileCollection (could be an oVDS collection), uri of a DatasetETPdataspace, urn: etc..)
2/ Inside this dataset persistent support the Data Array of values concerning this WPC will be associated to the id of the WPC : (e.g: "id": "namespace:work-product-component--WellLog:c2c79f1c-90ca-5c92-b8df-04dbe438f414")
3/ Just after that, inside the WPC, the different information attached to the Data Content should also be accessible: "ColumnName", "ValueType" (double, number, string, boolean), "ValueCount" (number of columns or dimensions), "ColumnSize" (number of values in the column). The "ValueType" could be more detailed (see the Energistics ETP V1.2 data types listed below). For each "ColumnName" we should also have "UnitOfMeasureID", "UnitQuantityID" and "PropertyTypeID".
IMPORTANT WARNING: we should find a way to impose that this information MUST be present in the WPC (e.g., by enhancing the validation step during ingestion).
By default no more information is given but this looks enough to proceed.
Example: for a WellLog WPC, here is the information to deliver into the Catalog:
* "id": "namespace:work-product-component--WellLog:c2c79f1c-90ca-5c92-b8df-04dbe438f414"
* "DDMSDatasets": ["urn://wddms-3/uuid:20840361-adc0-4842-999b-5639bd07bb38"]
* "ColumnName": "CO2-SAT-Fraction-VP" (the "array metadata" in Energistics ETP V1.2)
* "ValueType": "double" (the "DataArrayType" in Energistics ETP V1.2)
* "ValueCount": 1
* "ColumnSize": 7 (the "dimension" in Energistics ETP V1.2)
* "UnitQuantityID": "namespace:reference-data--UnitQuantity:unitless:"
* "PropertyType": { "PropertyTypeID": "namespace:reference-data--PropertyType:8a9930de-6d50-4165-8bcd-8ddf2e6aa7fa:", "Name": "Co2 Volume Fraction" }
"Value Type" reference in Energistics ETP V1.2 documentation.
"Energistics.Etp.v12.Datatypes.ArrayOfBoolean", "Energistics.Etp.v12.Datatypes.ArrayOfNullableBoolean", "Energistics.Etp.v12.Datatypes.ArrayOfInt", "Energistics.Etp.v12.Datatypes.ArrayOfNullableInt", "Energistics.Etp.v12.Datatypes.ArrayOfLong", "Energistics.Etp.v12.Datatypes.ArrayOfNullableLong", "Energistics.Etp.v12.Datatypes.ArrayOfFloat", "Energistics.Etp.v12.Datatypes.ArrayOfDouble", "Energistics.Etp.v12.Datatypes.ArrayOfString", "Energistics.Etp.v12.Datatypes.ArrayOfBytes",
We can note that an existing method is available today when the Data Array values are small enough that they should preferably be embedded in the catalog; it is restricted in "value type" to the original list. In this case we should use a tag "ColumnValues" with "number", "double", "string" or "boolean".
Now, if this information is embedded into the shared context, we will be able to access the Data Content ARRAY of Values.
On the basis of this information we could provide the specification of an API to reference, write and read these Data Content value arrays.
This API could then be used "internally" by all DDMSs to associate these values with their own Data Content schema (bounded context). Depending on the context in which the Content Data Array of Values was stored, all DDMSs would be able to manage these data. As a first step, each "ingestion DAG" or DDMS could use this information to associate Data Content and shared context.
They could use DataArray specific services to transfer large, binary arrays of homogeneous data values. For example, with Energistics domain standards (see ETP V1.2 protocol 9 page 287 : https://docs.energistics.org/EO_Resources/ETP_Specification_v1.2_Doc_v1.1.pdf), this data is often stored as an HDF5 file.
This API could provide a DataArray transfer which (see the sketch after this list):
* Supports any array of values of different types (boolean, integer, float, doubles, string). This array data is typically associated with a data object (that is, it is the binary array data for the data object).
* Imposes no limits on the dimensions of the array. Multi-dimensional arrays have no limits to the number of dimensions.
* Was originally designed in the Energistics standard to support transfer of the data typically stored in HDF5 files, but can also be used to transfer this type of data when HDF5 files are not required or used (e.g., parquet files, oVDS file collections, PostgreSQL bulk data, time series DBs)
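As a purely illustrative sketch (names and signatures are hypothetical, not an agreed OSDU interface), such an abstract Data Array API could look like this:

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Sequence

class DataArrayStore(ABC):
    """Hypothetical abstraction over the different persistent supports
    (parquet file, oVDS collection, PostgreSQL blobs, HDF5, ...)."""

    @abstractmethod
    def describe(self, wpc_id: str, column_name: str) -> dict:
        """Return the mandatory metadata: ValueType, ValueCount, ColumnSize, unit and property type."""

    @abstractmethod
    def read(self, wpc_id: str, column_name: str, start: int = 0, count: Optional[int] = None) -> Sequence[Any]:
        """Read a (slice of a) column of homogeneous values, whatever the backing store."""

    @abstractmethod
    def write(self, wpc_id: str, column_name: str, values: Sequence[Any], metadata: dict) -> None:
        """Write a column of values together with its mandatory metadata."""
```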
# Rationale
This proposal is based on experience gathered by the Energistics standards teams: an effective separation between metadata on Arrays of Values (written in XML or JSON files) and the Arrays of Values themselves (written in a binary compressed format).
All DDMSs could talk to the Catalog at the metadata level about the Content Data Array Values. A DDMS can reference a Content Data Array of Values without copying it and can benefit from the optimized access to Arrays of Values developed by another DDMS.
## Consequences
From a first query on the Catalog, all Data Content Arrays of Values will be accessible directly or through a more sophisticated DDMS query.
This will not imply a lot of change on the shared context Data Definition side: e.g., update the abstract Column based Table and add it to all WPCs which must handle Data Content Array Values. It is possible that some more data definition effort will be necessary to cover all Data Content Arrays of Values handled by the diverse DDMSs. The APIs of the DDMSs themselves will not change, but the link between the shared context and each bounded context should be updated: all DDMSs should deliver a way to reference, write and read their specific Data Content Arrays of Values from the information contained in the Catalog (shared context).

---

**Return the file size as output of the conversion of segy-vds**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/221
Deepa Kumari (2023-12-14)

Linked issue: https://community.opengroup.org/osdu/platform/data-flow/ingestion/segy-to-vds-conversion/-/issues/17#note_261315
We need to be able to capture the file size of the VDS file which is the output of the segy-vds conversion.
Otherwise there will be an additional call to SDMS service.
However, we'd like to find out if the response could be tailored to add more information.

---

**[ADR] Group deletion validation requirement in entitlement service**
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/issues/141
Om Prakash Gupta (2024-03-07)
## Status
* [x] Proposed
* [x] Trialing
* [x] Under review
* [ ] Approved
* [ ] Retired
**Context & Scope**
This ADR is about the Validation of group's usage during its Group deletion API.
**Current Behavior**
- Currently, the only validation in place is on the access rights of the caller: the caller should be a member of Ops and Admin and should be an OWNER of the group.
- There is no validation of whether the group is in use or not. Because of this, it is possible to delete a group that is being used by records, which leaves those records in an unusable state.
- Clients have been reporting ghost records where the owner group of the record, and even the viewer group, was deleted from the entitlement service. After deletion of the groups (ACLs) associated with a record, there is no access left for viewing or modification.
**Proposed Requirements**
Delete Groups API should:
- Check which data records are referring to this Entitlement group
- It should fail when there is a record that is referring to this ACL as a **SINGLE OWNER**. The failure response should list the records referring to this ACL (good to have; a hedged sketch of such a check follows this list).
- This is required to not leave the record in a ghost state, where it cannot be accessed by anyone since the single OWNER group is deleted.
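One hedged way to implement the "is this group still referenced as a single owner" check is a Search query on the indexed ACL fields; the endpoint shape follows the OSDU Search API, but the exact query and fields below are an assumption:

```python
import requests

def records_owned_solely_by(search_url: str, token: str, partition: str, group_email: str) -> list:
    """Illustrative check: find records listing the group in acl.owners, then keep
    only those where it is the single owner (assumed queryable ACL fields)."""
    query = {
        "kind": "*:*:*:*",
        "query": f'acl.owners:"{group_email}"',
        "returnedFields": ["id", "acl.owners"],
        "limit": 100,
    }
    resp = requests.post(
        search_url,  # e.g. <osdu>/api/search/v2/query
        json=query,
        headers={"Authorization": f"Bearer {token}", "data-partition-id": partition},
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])
    return [r["id"] for r in results if r.get("acl", {}).get("owners") == [group_email]]
```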
Complementary feature to support cleanup of ACLs from records:
1. When it is safe to delete the group, i.e. there are no records with this group as SINGLE owner, the Entitlements service can delete the group and publish an event/notification for listener services to clean up storage records. This is handled separately via: #161.
**Trade-off Analysis**
These validations are necessary to stop ending up with records that have no owners and therefore cannot be accessed.
**Challenges**
- Checking for ACLs in so many records: we need a workable solution here, perhaps a search query by legal tag and number of owners.

---

**Issue with Legal tag hardcoded information**
https://community.opengroup.org/osdu/ui/data-loading/osdu-cli/-/issues/23
Durga Prasad Reddy Nadavaluri (2023-12-14)

I am attempting to ingest a reference record using the OSDU CLI and have noticed that it utilizes the following part of the schema in the Legal tag section:
```
"legal": {
"legaltags": [
"<Your-legaltag-name>"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
}
```
From the above, it is evident that "otherRelevantDataCountries" is hardcoded with "US." However, if we are using a specific legal tag, would it be possible to derive country information from the provided legal tag instead of manually changing it every time based on various legal tags?
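A hedged sketch of what deriving the countries from the legal tag could look like, assuming the usual OSDU Legal service endpoint and a `countryOfOrigin` property on the tag (treat both as assumptions):

```python
import requests

def countries_for_legaltag(base_url: str, token: str, partition: str, legaltag: str) -> list:
    """Look the legal tag up in the Legal service and reuse its countryOfOrigin
    instead of hardcoding "US" (assumed endpoint and property names)."""
    resp = requests.get(
        f"{base_url}/api/legal/v1/legaltags/{legaltag}",
        headers={"Authorization": f"Bearer {token}", "data-partition-id": partition},
    )
    resp.raise_for_status()
    return resp.json().get("properties", {}).get("countryOfOrigin", ["US"])
```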
Reference link to code (ingest.py): https://community.opengroup.org/osdu/ui/data-loading/osdu-cli/-/blob/main/src/osducli/commands/dataload/ingest.py?ref_type=heads#L631

---

**Build a self-contained binary for open-etp:ssl-client**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/reservoir/open-etp-server/-/issues/103
Yan Sushchynski (EPAM) (2023-12-12)

It would be nice to have self-contained (statically linked) binaries for different architectures (Windows, Linux, macOS), stored in the Package Registry of the repository, so that these binaries can be used without Docker.
This could be useful in a few scenarios:
1. Docker-free Environments: Running `open-etp:ssl-client` on a machine without Docker
2. Integration in CI/CD pipelines: Self-contained binaries would simplify the integration of open-etp:ssl-client in .gitlab-ci jobs or other CI/CD pipelines.
Thanks

---

**ADR: Delete API endpoint to delete index for all kinds**
https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/119
Akshat Joshi (2024-01-25)
## Status
- [X] Proposed
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
This ADR is centered around adding the capability to delete the Elasticsearch indices for all kinds in a single call to the existing delete index API in the Indexer service.
## Decision
Currently, the delete API introduced as part of this ADR - https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/54 - supports only the deletion of a single index for a given kind. As part of the Replay Design ADR - https://community.opengroup.org/osdu/platform/system/storage/-/issues/186 - users may require deleting all the indices in the reindex use case instead of overwriting the indices. As mentioned in this flow - <br>
![replayAll](/uploads/70dd44c84d985e56148ac84930fa3bd9/replayAll.png) <br><br>
## API Details <br>
**API Level Permission** - users.datalake.ops <br>
**Service** – Indexer
<b>delete API in indexer service</b>.
Sample request:
```bash
curl --request DELETE \
--url '/api/indexer/v2/index' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: opendes'
```
<br><br>
**Current Scenario vs New Scenario of Delete Index API in Indexer Service**
| | Existing Scenario | New Scenario |
|---|---|---|
| **API Method** | Delete | Delete |
| **Endpoint supported** | `indexer/v2/index?kind=tenant1:public:well:1.0.2` | `indexer/v2/index?kind=tenant1:public:well:1.0.2` - deletes a single kind<br>`indexer/v2/index` - deletes all kinds (new endpoint) |
| **Backward compatible** | NA | Yes |
| **New functionality** | NA | Allows deleting the indices for all kinds. |
| **API level change** | Currently kind must be a non-blank parameter | The non-blank requirement on kind will be removed. |
| **Code change required?** | NA | Yes, backend code change is required to support deletion of the indices for all kinds. |
| **API response** | Same | Same |
## Consequences
- This will provide users with the capability to delete the index for all kinds.

---

**ADR - Project & Workflow Services - Application Integration**
https://community.opengroup.org/osdu/platform/system/home/-/issues/110
Sushil Kumar Jha (2024-01-24)
# Decision Title
This ADR focuses on how applications would integrate with Project & Workflow Services
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
The completion of a workflow requires the sharing of data across a number of applications. At present, data sharing between applications can lead to the following issues:
- The latest data sets for an in-progress project are scattered across various unmanaged storage spaces.
- Data exported by users to an unmanaged storage space to work on is often left outside the managed data store, causing unmanaged growth of storage usage.
- Data saved in personal unmanaged storage space is not available to other users.
- The ability to add notes and annotations on interpreted data, so that it can be referenced in the future, is missing from the current solutions.
- Owner, lineage, audit trail and status of the data in an unmanaged storage space are often unknown.
As applications drive the creation of data, PWS must provide the methods for the application to interact directly with PWS functionality.
This ADR addresses how applications will integrate with PWS.
## Decision
### Collaboration Service, Collaboration Context, Core Services and DDMSs
- A Namespace (Collaboration Context) will have a 1:1 relationship with project.
- A collaboration context is composed of a Project ID, Namespace and a minimum set of ACLs and Legal Tags.
- A collaboration context will be generated by the Collaboration Service.
- Data Platform Core APIs services including Search, Storage, Index and Notifications will be extended to take into account the Collaboration Service and Collaboration Context (for example, the Storage API will be extended to leverage the Shadow Record pattern to write to the Collaboration Project Data Collection).
- DDMSs will need to be updated to take into account the Collaboration Service and Collaboration Context.
- Notifications will be triggered by the Collaboration Service and the logic for when and how the notifications will be triggered is TBC.
### Application interactions
- Applications will use the Collaboration Context to read/write data to a Collaboration Project Data Collection Project (CPDC).
- Applications are expected to write data back to a Collaboration Project Data Collection when WIP data is ready to share with project team members or when the data is ready to be published. The expectation is not for an application to write every change or update to the Collaboration Project Data Collection.
- If an application uses a data store outside of OSDU (i.e., offline), the data only needs to be sent to a Collaboration Project Data Collection when it is ready to be published or shared.
- When an application reads data from a Project Collaboration Data Collection, it must be able to read and maintain the metadata associated with the file.
- When an application writes data back to a Collaboration Project Data Collection, it must be able to 1/ maintain the existing metadata, 2/ add new metadata to the file, 3/ maintain the lineage and 4/ maintain the legal tags
## Business Process Overview - Block Diagram
![Business_process_view](/uploads/50f1c6bcc22b0529f5b5c92b4f83fadc/Business_process_view.png)
Before we get into the details of each block of the diagram, it is important to define Work in Progress (WIP) data as it is the core of the design.
## Definition of Work in Progress data
The OSDU P&WS service uses a different approach by clearly defining System of Record data and Work in Progress data and keeping it in the same storage system. Differentiation between SOR data and WIP data is through the addition and completeness of metadata for new generated data in a technical workflow. How is this supposed to work?
Referring to the numbering in figure 1 the following steps illustrate the proposed approach:
1. Every Collaboration Project (CP) starts with an Initiation phase (yellow box 1) that includes the generation of an empty Collaboration Project Data Collection (CPDC - 1.2)
2. A technical workflow starts with a selection of input data from the System of Record (2.1) that is added to the CPDC.
3. Newly generated data in a workflow is by default initially Work in Progress (WIP) data. *(In a technical workflow it should be very easy and frictionless to generate new data. Not all data generated, however, will be useful to keep and store as a record. Therefore, we should not automatically define new data as a record)*
4. Applications that generate this WIP data interact via the CP Data Management Service. Multiple versions can be created without inhibiting the creativity of the users. The CP Data Management Service offers the full CRUD (Create, Read, Update, Delete) functionality for WIP data (2.3). It is expected that WIP data gets lineage and other relevant metadata added by the applications automatically, but the service will offer functionality for this as well, as part of the CP Publication services (3).
5. The publication services are an essential part of the P&WS concept. When WIP data is ready to be passed on to other workflows or applications, the requirement will be that this can only be done after the selected WIP data is first declared a record (3.1.1). This is to ensure that every workflow uses the authoritative SOR.
6. Generated WIP data has to become authoritative by adding assurance metadata as the mechanism for this differentiation. Assurance meta data defines the trust level. OSDU allows this to be generic as well as being specific with additional assurance labels for what purpose the data can be used and for what purpose it cannot be used. (3.1.3)
This means that the P&WS must impose minimum metadata requirements before WIP data can be published to the SOR. With services provided to connect with the assurance metadata framework to label WIP data as SOR data the option exists for applications to use the P&WS as a mechanism to enhance data labeling before it is ingested into OSDU, provided they connect to a defined Collaboration Project.
After selected WIP data has been assured and published to the SOR there will still be other WIP data in the CPDC. At any time it will be possible to delete WIP data, but if that is not done by a user, a non-record disposal (NRD) mechanism could be developed as well. The P&WS will not allow any functionality to delete SOR data. That data is, in line with OSDU data principles, immutable by default.
Based on the initial definition of the duration of the project it will be possible to set up a time-bound notification for when WIP data can be deleted or purged. When the duration is extended, this notification system is adjusted as well. Alternatively, company policies can set this NRD time window.
# 1 Initialise Collaboration Project
A Project Admin triggers this process. There are mainly two sub-processes here, and OSDU needs to support this process by providing APIs that enable these sub-processes
## 1.1 Create Collaboration Project (CP)
The CP is a data type that comprises the top level config information about the Collaboration Project. The key config information is below:
* Default ACLs and Legal Tags (LTs). During the lifetime of the CP, 1000s of temporary datasets may be created. Assigning an ACL and LT individually to each of them would be tedious. Also, as most of this data is temporary, it may also be overkill to do so. Therefore, a Default ACL and LT is specified at the CP level. Then, when the temp data is created, these ACLs/LTs will be auto-assigned to that data unless an ACL/LT is explicitly supplied during data creation
* Scope, Objective, Timeline: these are standard characteristics of projects in general, and are also relevant to a CP, as they help define why the CP exists
* Status: Can be *Open* and *Closed*
* CP ACLs. In contrast to the above, this ACL contains the users that are allowed to access to the CP. For instance, there would be a certain set of users that are allowed to create data within the CP, another set of users that are allowed to manage the CP itself, such as triggering the Publish process or Delete data. Currently, we need at least a “Project Admin” ACL and a “Project Contributor” ACL.
In this process step (1.1), we need APIs that support the creation of this CP.
## 1.2 Create Collaboration Project Data Collection (CPDC)
This process creates the CPDC which is the key container for data within a CP. It consists of two sections:
* References to SoR: This is a list of record references to data from the SoR. The CP is not allowed to modify these records, though it can generate new versions of these records and store those within the WIP data section
* WIP Data: also known as temp data, this is the set of references to data that is created within the CP. Normally these data references should only be visible within the same CP and not outside
In this process step (1.2), APIs should support the creation of the CPDC along with the two sections mentioned. These sections will continuously be modified (i.e. records added into it) as the project goes on.
It is expected that a single CP will have only a single CPDC – however, that assumption should **not** be baked into the design/implementation as a limitation.
# 2 Execution
Most of the CP’s lifecycle will be spent in this process. There are three subprocesses, which may occur in any order through the course of the CP. To support this, the existing APIs in OSDU (Storage, Search, Notification) need to be modified.
## 2.1 Add SoR References
During the CP’s lifetime, Project Contributors may add new SoR References (including references to records in DDMS) at any time. To avoid additional complexity, removal of SoR References is not considered right now. It is assumed that a SoR reference, once added, remains in the CPDC until the end of the CP.
If another *Open* project has already added the same SoR, then the ID of those CP(s) should be added to the API response. Alternatively, a Notification can be raised to flag this fact. Project teams may then choose to coordinate as needed, though this part will (currently) lie outside the knowledge/control of OSDU.
This API is only callable while the CP is in the *Open* state. Calling it on a *Closed* CP throws an error
## 2.2 Search CPDC
It is important that functionality exists to catalog and search all the WIP data in a CP. The current approach is that the CP Data Management Service includes a cataloguing function for all this WIP data.
The standard Search API needs to be modified, to take the ID of the CPDC as an optional parameter. If this parameter is supplied, the Search API needs to narrow its search scope to include only the contents of the CPDC (both sections of it)
If the parameter is not supplied, then the search will include only the contents of the SoR. Any records that are in the WIP section of any CPDC, will be excluded from the search scope.
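Purely as an illustration of the intended behaviour (the `cpdc-id` parameter name below is hypothetical, not a defined OSDU field), a Python sketch of such a scoped search call:

```python
from typing import Optional
import requests

def search_records(search_url: str, token: str, partition: str, cpdc_id: Optional[str] = None) -> dict:
    """Sketch of the modified Search call: with a CPDC id the scope narrows to that
    collection; without it, only the System of Record is searched."""
    body = {"kind": "*:*:*:*", "query": "", "limit": 10}
    if cpdc_id is not None:
        body["cpdc-id"] = cpdc_id  # hypothetical optional parameter carrying the CPDC scope
    resp = requests.post(
        search_url,
        json=body,
        headers={"Authorization": f"Bearer {token}", "data-partition-id": partition},
    )
    resp.raise_for_status()
    return resp.json()
```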
## 2.3 Manage WIP Data
The standard Storage API needs to be modified, to take the ID of the CPDC as an optional parameter. If this parameter is supplied, the Storage API needs to do the following:
* For write API requests (only callable by CP Data Contributors), write record references into the CPDC, and populate ACLs and LTs. If ACL/LT was supplied in the API request, apply that, if not then pick the default ACL/LT from the CP config and apply it to the record. This API is only callable while the CP is in *Open* state. Calling it on a *Closed* CP throws an error
* For read API requests, the current status quo behaviour should still work, as it is simply a read of the record based on the ACL/LT mentioned on the record
* There should also be a Hard Deletion API that hard deletes a WIP record and associated data files from blob storage – as long as no descendant data records exist. If descendants exist, then the hard delete should throw an error. This API is only callable while the CP is in *Open* state. Calling it on a *Closed* CP throws an error
* Some operators may prefer to have an “archival” to a colder storage tier instead of hard-delete, this choice should be possible via the API.
If the CPDC parameter is not supplied, then the current status quo behaviour will apply.
# 3 Publish
Once all the needed work is accomplished in a workflow, we are left with artifacts in the WIP section of the CPDC which need to be published immediately, published at a later date, or discarded. The "Ref to SoR" part of the CPDC is already in the SoR and nothing needs to be done for those records.
The Publish process can occur several times during the life of a project.
## 3.1 Prepare to publish
3.1.1 Select WIP datasets which needs to be published.
3.1.2 The Publish Svc needs to recursively identify the predecessor records based on the selected WIP datasets.
3.1.3 Each of those datasets needs to go through assurance and QC process. Assurance will be done using the Assurance Model by a separate app. From a PWS perspective, we need to know if the record(s) are assured or not. Outcome of this process is a “Ready-To-Publish List”.
3.1.4 Since these datasets will end up in the SoR, we need to assign the right ACLs and legal tags to them. This is needed because, while they were worked on as part of the WIP, a default or simple ACL or legal tags may have been attached.
## 3.2 Detect conflicts
We have two kinds of datasets in WIP: 1. newly created datasets with no linkage to the SoR; 2. modifications of SoR datasets which created a new version of them, or WIP datasets derived from an SoR dataset.
For the first kind, as they are new datasets, there won't be any conflicts and they are ready to be published to the SoR.
For the second kind there can be a few implications:
* We started with a SoR dataset and made some modifications to it. We did this while working on it in a workflow which is part of our collaboration project. This created a new WIP item in the CPDC. But the parent SoR dataset may also have been picked up in some other collaboration project and modified there, thereby creating a new WIP item in their CPDC. This results in a conflict between these two WIP items, as they are derived from the same parent, and it needs to be resolved.
* There can also be a scenario where we started with a SoR dataset and made some modifications to it in our collaboration project, thereby creating a new WIP item in the CPDC. But, in the meanwhile, the parent SoR dataset was modified outside our collaboration project and a new version is available in the SoR. Again, there is a conflict for our WIP item that needs to be resolved before publishing it as a newer version of the parent SoR dataset.
So, as an outcome of this step, there will be a "Conflicts list" which contains the WIP items that need to be resolved before they can be published to the SoR.
## 3.3 Resolve Conflicts
There can be different ways of dealing with the conflict list:
* Ignore the conflict and publish. We need to see if this should be allowed, or whether a guardrail should be in place to block publishing datasets which have conflicts.
* Use notifications to inform about conflicts among involved parties and resolve them manually.
* Use tools/automations scripts to resolve conflicts automatically when possible. We do not foresee this for initial MVP releases. If this is really needed in future, we can work on it.
Eventually, the conflict needs to be resolved so that these items are also ready to be published in SoR
## 3.4 Update SoR
Once all the items are ready to be published, they are published to SoR by removing the namespace tags from those items.
# 4 Project Closure
This is triggered by the Project Admin when the CP is to be closed. Consists of 3 sub-processes, which need APIs:
## 4.1 Publish data to SoR
This is already covered in Section 3
## 4.2 Delete remaining WIP Data
After all required data has been published via sub-process 4.1, the next step is to hard delete all remaining WIP data. This should reuse the same hard deletion API mentioned in sub process 2.3
## 4.3 Update Project Config
Clean up the CP by updating the status of the Project to “Closed”. Also generate a CP Closed notification. Also empty the CPDC by either removing all WIP and SoR References or setting them to a status such as “Closed” so that other CPs no longer consider them for conflicts. Or, set the CPDC to be read-only.
---
# Appendix A: When to update PWS
Applications do not necessarily have to save every single record/data update into PWS. For reasons of efficiency, cost, and performance, applications or users may choose to ingest data to PWS only when the data has attained a certain level of maturity and/or is ready to be shared with other users/applications.
# Appendix B: Offline Applications
For the purposes of this section, an “Offline” App is one that participates in a business workflow and handles temporary data, but does not interact with the PWS or with OSDU as a SoR/System of Engagement. As a result, PWS has no knowledge of or control over the data or processing done by such applications.
This situation can exist for various reasons, such as the app not supporting integration with OSDU (due to technical or other limitations) or because OSDU does not have the necessary data types or APIs for this.
Such applications can and will function outside the purview of OSDU and PWS. It will be a decision of the operators whether and how such data can be pushed back to OSDU. If the operator chooses to do so, they would use the same APIs described above to manage that integration, and OSDU/PWS would only be aware of the workflows as long as the data is managed within PWS. To facilitate this process, operators can consider an Anti-Corruption Layer or Façade Layer which mediates between the application and OSDU; these in turn may be custom built or marketplace solution. This layer would need to adhere to OSDU standards (eg: REST API integration) to integrate with PWS.
Below is a diagram which depicts how an anti-corruption layer can facilitate the communication between an offline app and P&WS:
![Anti-Corr-Layer-PWS__1_](/uploads/c3fedf21120ad4c88ad34d2fb11371cb/Anti-Corr-Layer-PWS__1_.PNG)
# Anti-Corruption Layer components
## Facade
Offline applications can be of different architecture types and may have different way of storing and sharing their application data. So, we need a facade layer which wraps the functionality of offline apps. This helps the offline apps to connect to anti-corruption layer.
## Adapter
The adapter component is responsible for converting the offline app domain model to the OSDU domain model. This will include schema mapping. It can also have functionality to add missing metadata to the application datasets before they can be added to the CPDC. So, the adapter is both application and OSDU aware and helps convert application data to the form acceptable by P&WS. The adapter may need a translator component for schema mapping or other translations.
## Translator
The translator component can be used to translate from offline application domain model concepts to OSDU model concepts. This can include schema mapping and other needed translations.
## API Mapper
The API mapper layer is used to call the needed OSDU APIs from the anti-corruption layer. So, we do not implement any new API in the anti-corruption layer for OSDU.
## Sequence Diagram
---------------
TO_DO...
1. add next layer of details to the diagram e.g., API calls
2. draft diagrams for 1/ publish to SoR and 2/ deletion of data from the collaboration context
![ADR7_SD_-_1.0_Initialise_Collab_Project](/uploads/318bd453796ff5f0f113f62644b4c515/ADR7_SD_-_1.0_Initialise_Collab_Project.png)![ADR7_SD_-_2.1_Add_SoR_Reference](/uploads/b6f371bda7c66ce0188c9b42429c589d/ADR7_SD_-_2.1_Add_SoR_Reference.png)![ADR_7_SD_-_2.2_Search_CPDC](/uploads/338c8344d61798904e919442e22b7aa5/ADR_7_SD_-_2.2_Search_CPDC.png)![ADR_7_SD_-_2.3_Search_WIP_Data](/uploads/da5e9544a3f0db9a479eda25822ab236/ADR_7_SD_-_2.3_Search_WIP_Data.png)![ADR_7_SD_-_3.0_Execute](/uploads/0b8c207cb6a6a0ee0c4402855e233550/ADR_7_SD_-_3.0_Execute.png)
## Rationale
## Consequences
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline

---

**ADR - Project & Workflow Services - Core Services Integration - Copy Record references between namespaces**
https://community.opengroup.org/osdu/platform/system/home/-/issues/109
Sushil Kumar Jha (2024-03-19)
# This ADR focuses on copying record references between namespaces
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Background
In our previous [ADR](https://community.opengroup.org/osdu/platform/system/storage/-/issues/149), we looked at creating namespaces for storage records. This allowed us to separate out record changes in the system of record and system of engagement. However we do need to be able to move changes between the system of record and engagement.
This ADR builds on the previous one by proposing to add a new API into Storage service to enable this.
As outlined in the previous ADR, each namespace holds the references to the Record versions it contains for that Record ID. The API will move the Record references held in a specific namespace into a provided target namespace.
## Out of scope
For this ADR we are only looking at how we can copy references between namespaces.
We are not deciding on
- How conflicts will be handled when the destination namespace has a newer version that already exists.
- How collaborations will act on this or control this behavior or even what a collaboration entity looks like
## Solution
The below diagram shows how the reference system works for a specific Record ID that exists both in a collaboration and the system of record.
![collaboration ingestion.drawio (10).png](./media/Collaboration-Ingestion.png)
The metadata object of the Record holds the reference to the specific versions that exist to that context. So even though other versions may exist in different contexts they are not accessible outside the context(s) referenced by it.
This new API will take a source context, target context, Record ID and Record version. It will attempt to copy the given reference from the source into the target.
For example in the above diagram, I could request to copy the V4 data object from collaboration 1 into the system of Record metadata (promoted) using the following API call
```bash
curl -X 'PUT' \
  '<osdu>/api/storage/v2/records/copy' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --header 'x-collaboration: id=<source-collaboration-id>,application=<app-name>;' \
  --data '{
    "target": "",
    "records": [{
      "id": "<record-id>",
      "version": "<record-version>"
    }]
  }'
```
Finally you can copy data from the system of record to a specific collaboration by not providing an x-collaboration `id` directive value and supplying a target.
```bash
curl -X 'PUT' \
  '<osdu>/api/storage/v2/records/copy' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --header 'x-collaboration: application=<app-name>;' \
  --data '{
    "target": "<target-collaboration-id>",
    "records": [{
      "id": "<record-id>",
      "version": "<record-version>"
    }]
  }'
```
The new API should require the `services.storage.admin` permission to run as it is a privileged operation to move data between contexts.
Here is the full API specification of the new API.
```yaml
  /records/copy:
    put:
      tags:
        - Records
      summary: Copy a Record reference from one namespace to another
      description: This API attempts to copy all the Record references it is provided from the given source namespace to the target namespace. All references will be copied or all will fail, as a transaction. If the target namespace does not yet exist it will be created. It requires 'services.storage.admin' permission to call.
      operationId: copyReference
      requestBody:
        description: The references to copy
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/copyReferences'
        required: true
      responses:
        '200':
          description: Successful operation
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/copyReferences'
        '400':
          description: Invalid Record IDs provided
        '404':
          description: Records not found
        '409':
          description: One or more references already exist in the target namespace
```
## Context & Scope
## Decision
## Rationale
## Consequences
We will have a new API where the data access layer needs to be implemented by every CSP.
The hard delete API in storage service needs to add extra validation that the data blob being deleted from storage is not referenced in a different context. With this API the same version of data can be referenced in multiple contexts meaning a purge could delete a blob referenced in a different context and leave a dangling reference without this extra check.
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline

---

**ADR - Project & Workflow Services - Core Services Integration - Search Service Support**
https://community.opengroup.org/osdu/platform/system/home/-/issues/108
Sushil Kumar Jha (2024-03-19)
# This ADR focuses on search service support for enabling Project & Workflow Services
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
Following on from [this ADR](https://community.opengroup.org/osdu/platform/system/storage/-/issues/149), which brought the idea that the same record instances could live in multiple `namespaces` or `collaborations` at the same time, the Search service also needs to adopt the same feature.
Like the previous ADR, this needs to be implemented in a non-breaking way that can be release controlled from a feature flag.
## Index Solution
Record changed V2 is already being published from Storage service when the collaboration feature is enabled.
Using the same feature flag the Indexer service should start to consume the Record Changed V2 event if it is enabled **instead** of Record Changed V1. The ID of the generated document in Search backend (Elasticsearch) should then be a combination of both Record Id + Collaboration ID on the message.
We will also add a new collaboration property to indexed documents that has the ID of the collaboration the document belongs to (if any). This can be used for search queries, however it **should not be** returned to the user by default as it is not a part of the Storage Record in SoR.
Example Record change V2 message
```json
"message": {
"data": [
{
"id": "opendes:inttest:1674654754283",
"kind": "opendes:wks:inttest:1.0.1674654754283",
"op": "create",
"version": 1673284431169293,
"modifiedBy": "projnwrkflwssvs@osdu.org"
}
],
"account-id": "opendes",
"data-partition-id": "opendes",
"correlation-id": "2715a1b8-2ffb-406f-839c-6e6bfed27e5c",
"x-collaboration": "id=abcd-12345-efghij-67890-klmn"
}
```
Elasticsearch document
```
ID: <id + collaboration-id>
Collaboration: <collaboration-id>
```
This allows multiple instances of the same Record ID to exist in Search, 1 per collaboration.
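A tiny sketch of the document-id composition described above (the concrete separator/format is not specified in this ADR, so the one below is just an assumption):

```python
from typing import Optional

def search_document_id(record_id: str, collaboration_id: Optional[str]) -> str:
    """Compose the Elasticsearch document id so the same record can be indexed once per
    collaboration without clashing with its System of Record copy (separator is assumed)."""
    return record_id if collaboration_id is None else f"{record_id}:{collaboration_id}"
```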
## Search solution
Search service should support the new `x-collaboration` header defined in the original [Storage Service ADR](https://community.opengroup.org/osdu/platform/system/storage/-/issues/149#collaboration-context-header) when the feature flag is enabled.
If a collaboration `id` property is defined on a search request, Search service will automatically add that filter to the query meaning only documents in that collaboration can be returned to the user.
Example searching for all records not assigned to any collaboration (same as current behavior)
```bash
curl -X 'POST' \
  '<osdu>/api/search/v2/query' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --data-raw '{ "kind": "*:*:*:*", "query": "", "limit": 10 }'
```
Example search for all records assigned in collaboration `abcd-12345-efghij-67890-klmn`
```bash
curl -X 'POST' \
  '<osdu>/api/search/v2/query' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --header 'x-collaboration: id=abcd-12345-efghij-67890-klmn' \
  --data-raw '{ "kind": "*:*:*:*", "query": "", "limit": 10 }'
```
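For illustration, a minimal sketch of the server-side scoping rule described above, expressed as an Elasticsearch bool query; the `collaboration` field name follows this ADR, while the class and method names, and the assumption that records outside any collaboration simply lack the property, are illustrative.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: scope a user query to a collaboration when the
// x-collaboration header carries an id, otherwise exclude collaboration-tagged documents.
final class CollaborationQueryFilter {

    private CollaborationQueryFilter() {
    }

    /** Wraps the user's query with the collaboration scoping filter. */
    static Map<String, Object> scope(Map<String, Object> userQuery, String collaborationId) {
        Map<String, Object> filter;
        if (collaborationId == null || collaborationId.isEmpty()) {
            // No collaboration on the request: only documents without the property match.
            filter = Map.of("bool",
                Map.of("must_not", List.of(Map.of("exists", Map.of("field", "collaboration")))));
        } else {
            // Collaboration request: only documents tagged with that collaboration id match.
            filter = Map.of("term", Map.of("collaboration", collaborationId));
        }
        return Map.of("bool", Map.of(
            "must", List.of(userQuery),
            "filter", List.of(filter)));
    }
}
```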
## Decision
## Rationale
## Consequences
- This is a non-breaking change.
- All features are enabled via a feature flag.
- Search service will start to optionally use the `x-collaboration` header already defined to scope requests to specific collaborations.
- Indexer can store the same Record ID multiple times, once per collaboration.
- Indexer service's Re-index API needs to conform to the Record Changed V2 format when the feature is enabled.
- Indexer service's Index clean-up API should remove records when collaboration context is provided.
- Indexer-queue service's record change event processor should conform to Record Changed V2 format.
## Open Questions:
- The Indexer service should forward the `x-collaboration` header to all Storage service requests, as it needs to index records in a specific collaboration, but it also has a dependency on the Schema service. Do we expect the Schema service to honor `x-collaboration`? Do we expect different schemas in different collaborations?
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline
https://community.opengroup.org/osdu/platform/system/home/-/issues/107
ADR - Project & Workflow Services - Core Services Integration - E&O (Sushil Kumar Jha, 2024-01-24)
# This ADR focuses on E&O rules from Project & Workflow Services point of view
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
For Entitlements and Obligations, we have two options:
1. **Coarse-grained access** i.e. all users in a given Collaboration get the same level of access to all the data in the Collaboration. It is possible to have a set of "Viewer" users and "Owner" users who are allowed to Create/Update/Delete.
2. **Fine-grained access** i.e. full-fledged PBAC and RBAC on individual record level.
Option 1 will likely be the most appropriate for the PWS, as there are likely to be a relatively small number of users in a given Workspace. Also, during the course of the project, many temporary datasets will get created; managing the ACLs or Legal Tags (as in Option 2) of each will become cumbersome and superfluous.
### Ingestion
This requires a notion of "inheritance" of E&O from the parent Collaboration to its records. Authorizations are currently implemented using ACLs or Legal Tags in records. To support E&O for PWS, we propose the following:
- Each Collaboration must have Legal Tags, a Viewer ACL and an Owner ACL assigned to it. This will have to be done during the creation of these entities. This also implies the need for APIs to manage Collaborations
- To ingest data into a Collaboration, the `x-collaboration` header should be present in the request
- If the Record creation API sees the above header (implying the record belongs to a Collaboration), it must propagate the ACLs and LTs of the Collaboration to the new record
Alternatively, we could adopt a dynamic approach where the ACLs/LTs of the Collaboration are checked whenever the Record is accessed. However, this is a much more far-reaching change that would impact a very large number of services. The simpler approach described above may be sufficient in our case, as this data is of temporary value. The trade-off for this simplicity is that once the ACLs and LTs are assigned to the Collaboration, they cannot be modified.
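A minimal sketch of the propagation approach described in the bullets above, assuming hypothetical `CollaborationConfig` and record types; the real Storage and Collaboration service models may differ.

```java
import java.util.List;

// Hypothetical types used only for this sketch, not the real OSDU record model.
class CollaborationConfig {
    String viewerAcl;
    String ownerAcl;
    List<String> legalTags;
}

class NewRecord {
    List<String> aclViewers;
    List<String> aclOwners;
    List<String> legalTags;
}

class CollaborationInheritance {

    /**
     * When the request carries an x-collaboration header, copy the
     * Collaboration's ACLs and Legal Tags onto the record being created.
     */
    static void apply(NewRecord record, String xCollaborationHeader, CollaborationConfig collaboration) {
        if (xCollaborationHeader == null || xCollaborationHeader.isEmpty()) {
            return; // no collaboration context: normal SoR ingestion rules apply
        }
        record.aclViewers = List.of(collaboration.viewerAcl);
        record.aclOwners = List.of(collaboration.ownerAcl);
        record.legalTags = List.copyOf(collaboration.legalTags);
    }
}
```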
### Promotion
During Promotion or Publishing of Records to the SoR, the Ingestion process will need to assign ACLs and LTs afresh, because the ACLs and LTs assigned during the project lifecycle may not be appropriate in the SoR.
## Decision
## Rationale
## Consequences
- This is a non-breaking change
- As in the [original ADR](https://community.opengroup.org/osdu/platform/system/storage/-/issues/149) this functionality is enabled via the feature flag
- The ACL or LT of the Collaboration cannot be changed after creation
- Indexer-queue service's record change event processor should conform to Record Changed V2 format
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline
https://community.opengroup.org/osdu/platform/system/home/-/issues/106
ADR - Project & Workflow Services - Core Services Integration - Collaboration Service (Sushil Kumar Jha, 2024-01-24)
# This ADR focuses on collaboration service which is a key component for Project & Workflow Services.
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
The Collaboration Service is needed as the source of truth for which collaborations exist in a partition, and to hold specific configuration for each collaboration, e.g. which applications can access it, which frame of reference they want to use, which alerts they need, etc.
## APIs
The service will act as a key-value lookup service for a given collaboration ID. It can list the available collaborations, e.g.
```http
GET /api/collaborations-svc/v1/collaborations HTTP/1.1
```
```json
[
"44771c69-89c1-4552-b038-9f596071c23e",
"69771c69-89c1-4678-b038-9f596071c44e"
]
```
Add in new collaborations and the configuration they hold
```http
POST /api/collaborations-svc/v1/collaborations/69340rt6-89c1-4678-b038-9f596071c44e HTTP/1.1
{
"key1": { },
"key2": { },
}
```
Retrieve the collaboration configuration information
```http
GET /api/collaborations-svc/v1/collaborations/69340rt6-89c1-4678-b038-9f596071c44e HTTP/1.1
```
```json
{
"key1": { },
"key2": { },
}
```
And update the collaboration configuration information
```http
PATCH /api/collaborations-svc/v1/collaborations/69340rt6-89c1-4678-b038-9f596071c44e HTTP/1.1
{
"operation": "add"
"patch": "key3"
"value": "{ }"
}
```
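For illustration, a hedged sketch of how a service or test client might check that a collaboration ID exists using the GET endpoint above; the base URL, header names and the assumption that an unknown ID returns 404 are not confirmed by this ADR.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative client-side existence check against the Collaboration Service.
public class CollaborationLookup {

    private final HttpClient client = HttpClient.newHttpClient();
    private final String baseUrl; // e.g. https://<osdu>/api/collaborations-svc/v1 (assumed)

    public CollaborationLookup(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Returns true when the service knows the collaboration id. */
    public boolean exists(String collaborationId, String partitionId, String bearerToken) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/collaborations/" + collaborationId))
            .header("data-partition-id", partitionId)
            .header("authorization", "Bearer " + bearerToken)
            .GET()
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200; // 404 (assumed) would mean unknown collaboration
    }
}
```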
## Usage and Performance
One of the key use cases is to validate that a given collaboration ID exists when it is provided on a request.
As per the design, we want this to be done on behalf of the services in the OSDU® Data Platform at the infrastructure level, using Istio or equivalent.
To avoid significant overhead, the service mesh should cache the result for reuse on subsequent requests.
We should make use of HTTP cache semantics to improve performance and resiliency at this point of failure.
```
cache-control: private, max-age=300, stale-while-revalidate=600, stale-if-error=600
vary: data-partition-id
```
For example, if we apply the above response headers to the list collaborations API of this service
```http
GET /api/collaborations-svc/v1/collaborations HTTP/1.1
```
The service mesh's HTTP client should cache the response and reuse it for subsequent requests for 5 minutes. It should also use the cached response for up to 10 minutes if there is an error retrieving a refreshed result from the service.
This has the downside that a newly added collaboration cannot be used for up to 5 minutes in the OSDU® Data Platform. However, collaborations should be created rarely and are unlikely to be used straight away, so this should be acceptable.
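A minimal sketch of the caching behaviour implied by these headers: a response is fresh for `max-age`, a stale copy may still be served on error up to `stale-if-error`, and entries vary by `data-partition-id`. In the actual design this would live in the service mesh rather than application code; the class and types below are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative cache of the list-collaborations response, keyed by data-partition-id.
class CollaborationListCache {

    private static final Duration MAX_AGE = Duration.ofMinutes(5);
    private static final Duration STALE_IF_ERROR = Duration.ofMinutes(10);

    private record Entry(List<String> collaborations, Instant fetchedAt) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    /** fetcher calls the collaborations list API for the given partition. */
    List<String> list(String partitionId, Function<String, List<String>> fetcher) {
        Entry cached = cache.get(partitionId);
        Instant now = Instant.now();
        if (cached != null && cached.fetchedAt().plus(MAX_AGE).isAfter(now)) {
            return cached.collaborations(); // still fresh, no origin call
        }
        try {
            List<String> fresh = fetcher.apply(partitionId);
            cache.put(partitionId, new Entry(fresh, now));
            return fresh;
        } catch (RuntimeException e) {
            // Origin failed: fall back to a stale entry if it is within stale-if-error.
            if (cached != null && cached.fetchedAt().plus(STALE_IF_ERROR).isAfter(now)) {
                return cached.collaborations();
            }
            throw e;
        }
    }
}
```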
## Decision
## Rationale
## Consequences
We will have a new core service that needs to be implemented by every CSP.
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/issues/88
Option to provide TVD-based log data by using Trajectory data (Debasis Chatterjee, 2023-12-11)
Assume both trajectory station data and well log curves have been properly populated as "optimized content" (Parquet).
Do you think it is reasonable to expect a new API end-point to present TVD-converted log data?
It would be important to provide conversion algorithm information too.
Similar to how there is a provision for this in CRS conversion (Spatial block).
cc @deny (Please add more details as required)
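For illustration only, a minimal sketch of one possible MD-to-TVD interpolation from trajectory stations; the class is hypothetical, and a real endpoint would likely use the trajectory's own computation method (e.g. minimum curvature) and report the algorithm alongside the data, as requested above.

```java
import java.util.Arrays;

// Illustrative linear interpolation of TVD at a log sample's measured depth,
// assuming the trajectory stations expose MD and TVD arrays sorted by MD.
final class TvdConversion {

    private TvdConversion() {
    }

    /** Interpolates TVD at the given MD between the bracketing trajectory stations. */
    static double tvdAt(double md, double[] stationMd, double[] stationTvd) {
        int idx = Arrays.binarySearch(stationMd, md);
        if (idx >= 0) {
            return stationTvd[idx]; // exact station match
        }
        int upper = -idx - 1; // insertion point
        if (upper == 0 || upper == stationMd.length) {
            throw new IllegalArgumentException("MD " + md + " is outside the trajectory coverage");
        }
        double fraction = (md - stationMd[upper - 1]) / (stationMd[upper] - stationMd[upper - 1]);
        return stationTvd[upper - 1] + fraction * (stationTvd[upper] - stationTvd[upper - 1]);
    }
}
```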