OSDU Software issues — https://community.opengroup.org/groups/osdu/-/issues (2021-02-22)

---
https://community.opengroup.org/osdu/platform/system/partition/-/issues/6
**Partition Service to Support Multiple Data Partitions** (An Ngo, 2021-02-22)

**Overview**
As part of the effort to enable multiple data partitions in OSDU, there is a need for a Partition Service that is responsible for creating and retrieving the partition specific properties on behalf of calling services. The service will encapsulate the data currently held in the secrets store and the "tenantinfo" datastore. Encapsulation in this case is referring to isolation via the service interface vs. any implication of where or how the secrets data, in particular, is physically stored.<br><br>
The Partition Service will be the means of decoupling services from the logic of partition creation/info retrieval/deletion and will be fairly generic, i.e. essentially a means to store and retrieve key/value pair information relevant to data partitions. As a service rather than a client library, the Partition Service also provides a logical point to implement features related to performance and scalability. Additionally, the Partition Service will be language independent and available to all services without a separate implementation for each language family.
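The key/value retrieval described above, together with the caching behaviour called out in the Azure details (a 5-minute TTL on GET responses, with dirty reads allowed when a refresh fails), can be sketched as a small client-side cache. This is illustrative only; the class name, method names, and fetch callback are assumptions, not part of the actual service interface.

```python
import time

class PartitionInfoCache:
    """Illustrative client-side TTL cache for partition key/value properties.

    Sketch only: 'fetch' stands in for a real call to the Partition Service;
    stale entries are served if a refresh fails, mirroring the dirty-read
    behaviour allowed once the TTL has expired.
    """

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch        # partition_id -> dict of partition properties
        self._ttl = ttl_seconds    # 5 minutes, per the Azure implementation notes
        self._clock = clock
        self._entries = {}         # partition_id -> (expires_at, properties)

    def get(self, partition_id):
        now = self._clock()
        entry = self._entries.get(partition_id)
        if entry and now < entry[0]:
            return entry[1]        # fresh cache hit
        try:
            props = self._fetch(partition_id)
        except Exception:
            if entry:
                return entry[1]    # TTL expired but refresh failed: serve stale (dirty read)
            raise
        self._entries[partition_id] = (now + self._ttl, props)
        return props
```

A calling service would construct this once per process and route all partition-property lookups through `get`, so repeated lookups within the TTL never hit the Partition Service.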
**Details**
For the Azure implementation of Partition Service, the following should hold:
+ This will be a service that can only be accessed by other services within the (multi-cluster) service mesh.
+ It will have a 5-minute TTL on GET data partition info responses
+ It will allow dirty reads if TTL has expired but can’t be updated by the client
+ This should be implemented to the same standards as other OSDU services (technology stack, SPIs etc.) but with an Azure implementation first. This forces the interface of the API (partitionInfo) to be more generic.
+ All data will be stored in Azure Key Vault whether this is secret or not. This means no partition data in CosmosDb.
+ Service principal credentials are needed to access the create and delete APIs. The get API should only be accessible within the cluster and has no public access outside the cluster.

M1 - Release 0.1 · 2021-02-26

---
https://community.opengroup.org/osdu/platform/security-and-compliance/home/-/issues/48
**Policy Service Based E&O Testing Guide** (Ash Sathyaseelan, 2022-08-23) · Hrvoje Markovic · 2021-06-09

---
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/issues/135
**Tenant deprovisioning API** (Himanshu Kumrawat, 2023-11-02)

# Problem Statement
Currently, Entitlements APIs in OSDU lack the direct functionality of clearing all the entitlements for a partition. There is no dedicated API that performs an exhaustive clean-up, i.e. removing all the entitlements. There is a gap for the use case in which all the groups, including bootstrapped groups, and all their members can be removed in one go: no API does the opposite of what the tenant-provisioning API does. Some existing endpoints can achieve the desired result iteratively, deleting each group and its members in turn, but there is no way to clear out all the entitlements for a partition using a single endpoint. The delete group API only clears one group at a time; the delete member API removes all of a user's associations with groups and then removes the member.
Therefore, users and CSPs cannot, when required, clear out all the entitlements of a partition. To bridge this gap, the set of APIs currently exposed needs to be extended with one capable of cleaning the entitlements end to end for a given partition.
In addition, the availability of this new functionality will help all CSPs to efficiently manage entitlements at the data-partition level.
# Rationale behind the proposal
If, in the future, OSDU wants to add support for deleting a data partition, this feature will be very helpful in its implementation: while cleaning up the resources corresponding to a data partition being deleted, all the entitlements can be removed with a single API call.
# Proposals
## API Design:
A new entitlements API is proposed to provide the tenant de-provisioning functionality. This API shall remove all the entitlement groups of the partition internally by removing the data from the CSP's respective databases. It should delete all the groups, including bootstrap groups, and delete all the members of those groups, completely disassociating the members belonging to the data partition given as input. On completion of the deletion activity, the REST API must respond with an appropriate status code for success or failure.
Only a user with a valid access token is permitted to call the API. The token must be a valid admin app token; no other token shall be allowed to access the API, restricting the cleaning of entitlements to admin users. This restriction mirrors the authentication for tenant provisioning, which allows only the OSDU admin to perform critical operations. The new endpoint can be the same as the tenant-provisioning endpoint, with the DELETE method.
### API Signature:
```plaintext
HOST URL: {{endpoint}}/api/entitlements/v2/tenant-provisioning
method = 'DELETE'
headers = {
    'data-partition-id': <data-partition-name>,
    'Content-Type': 'application/json',
    'Authorization': 'Bearer <token>'
}
```
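The signature above can be exercised with any plain HTTP client. A minimal sketch in Python (the endpoint host, partition id, and token are placeholders; the helper name is an assumption for illustration):

```python
from urllib.request import Request

def build_deprovision_request(endpoint, data_partition_id, token):
    """Construct (but do not send) the tenant de-provisioning call described above.

    Per the proposal, 'token' must be a valid admin app token.
    """
    return Request(
        url=f"{endpoint}/api/entitlements/v2/tenant-provisioning",
        method="DELETE",
        headers={
            "data-partition-id": data_partition_id,
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )

# Placeholder values only; a real caller would pass its own endpoint and token.
req = build_deprovision_request("https://osdu.example.com", "opendes", "<token>")
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) should then yield the success or failure status code described above.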
The above API can be implemented by providing a definition for the following interface, associated with the controller for tenant de-provisioning. The deprovisionTenant method must clear all the entitlement groups and all members before returning.
```java
package org.opengroup.osdu.entitlements.v2.service;

public interface TenantDeprovisioningService {
    /**
     * In case of an unexpected error, all changes made are reverted.
     */
    void deprovisionTenant();
}
```
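The revert-on-error contract in the interface comment can be illustrated with a small sketch (Python here, for brevity; the in-memory group store is a stand-in for the CSP database, not the actual entitlements persistence layer):

```python
def deprovision_tenant(groups):
    """Delete every group (bootstrap groups included) and all their memberships.

    Sketch of the interface contract: if any deletion fails, all changes made
    so far are reverted, so the operation is all-or-nothing.
    'groups' maps group email -> set of member emails.
    """
    removed = []  # (group_email, members) in deletion order, kept for rollback
    try:
        for group_email in list(groups):
            members = groups.pop(group_email)   # removing the group disassociates its members
            removed.append((group_email, members))
    except Exception:
        for group_email, members in reversed(removed):
            groups[group_email] = members       # revert all changes on unexpected error
        raise
    return [g for g, _ in removed]
```

A real implementation would additionally remove the member-to-group association records, but the all-or-nothing shape is the same.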
### API Controller
The following controller change needs to be added to make use of the definition provided above:
```java
@RestController
public class InitApi {

    @Autowired
    private TenantInitService tenantInitService;

    @Autowired
    private TenantDeprovisioningService tenantDeprovisioningService;

    @PostMapping("/tenant-provisioning")
    @PreAuthorize("@authorizationFilter.hasAnyPermission()")
    public ResponseEntity<InitServiceDto> initiateTenant(@RequestBody(required = false) InitServiceDto initServiceDto) {
        tenantInitService.createDefaultGroups();
        tenantInitService.bootstrapInitialAccounts(initServiceDto);
        return new ResponseEntity<>(initServiceDto, HttpStatus.OK);
    }

    @DeleteMapping("/tenant-provisioning")
    @PreAuthorize("@authorizationFilter.hasAnyPermission()")
    public ResponseEntity<Void> deleteTenant(@RequestBody(required = false) DeleteServiceDto deleteServiceDto) {
        tenantDeprovisioningService.deprovisionTenant();
        return new ResponseEntity<>(HttpStatus.NO_CONTENT);
    }
}
```
## Flow Diagram
![image.png](/uploads/7bf2f2df216a19377fb2a862889676a5/image.png)
## FAQ
1. Will the delete API be a synchronous call, and can it complete within a few seconds?
A. Yes, it will be a synchronous call and will complete within seconds.

M21 - Release 0.24 · Himanshu Kumrawat · 2023-09-11

---
https://community.opengroup.org/osdu/platform/system/project-and-workflow/-/issues/82
**Draft F2F Edinburgh Presentation Work** (Rikesh Chauhan, 2024-03-25)

Description
---
Draft for the F2F Edinburgh presentation.
Acceptance criteria
---
Make sure the draft of the presentation is complete, then get input from others involved. The deadline for uploading the draft presentation to GitLab is Monday, 25th March. The final presentation deadline is Monday, 8th April.
Testing scenarios
---
N/A
Technical notes
---
N/A

Hugh Patrick · Rikesh Chauhan

---
https://community.opengroup.org/osdu/platform/system/project-and-workflow/-/issues/79
**F1 (Java) Story 19: Cover Project API with integration tests** (Dmitrii Novikov (EPAM), 2024-03-29)

Description
---
Cover story with integration tests:
F1 (Java) Story 6: Implement /v1/projects CRUD API for collaboration projects #46
Acceptance criteria
---
All tests passed
Testing scenarios
---
Tests cover API layer
- /project
Technical notes
---
Follow best practices.

Pavel Barzou

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/349
**CSP Support - IBM GCZ Deployment** (Levi Remington, 2024-03-18) · Ankita Srivastava

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/348
**CSP Support - Cognizant GCZ Deployment** (Levi Remington, 2024-03-12)

Mevin Mathew [Cognizant] reached out to the GCZ team on February 28th, 2024 to request support for GCZ installation. Ankita will be evaluating initial support requirements.

Ankita Srivastava

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/347
**Data - Load Trajectory Parquet files into OSDU** (Levi Remington, 2024-03-11)

As a GCZ Data Loader, I want to prepare NZ Trajectory Parquet files and load them into the Azure GLAB OSDU instance so that they may be ingested by the GCZ Transformer.
### Data Stats
* ~400 Trajectory files - totaling 5mb.
* ~2700 Log files - totaling 3.08GB
### Steps
Prerequisite: Generate ~6 SeismicAcquisitionSurvey and LiveTrace records from SegY
1. Identify whether trajectories can be loaded easily
2. List a small set of Trajectories + Logs + related Wells, and send it to Brian so that we can request that SLB load these parquet files into GLAB.
Do we have confirmation to load this size of data into GLAB?

Michael Wilhite

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/346
**Data - Load Horizon/Interpretation extent polygons into GLAB** (Levi Remington, 2024-03-12)

As a GCZ Developer, I require access to Horizon/Interpretation Extent Polygons in the GLAB OSDU instance. However, loading attempts have resulted in errors due to issues with GLAB's CRSTransformation service, which prevent transforming points into WGS84.
Because the same data loading techniques succeed in the Azure Preship environment but fail in GLAB, this indicates an issue with the GLAB environment that should first be addressed by the GLAB team.

Valentin Gauthier · Michael Wilhite · Levi Remington

---
https://community.opengroup.org/osdu/platform/system/project-and-workflow/-/issues/54
**F4: Read data from SoR or collaboration namespace** (Mateusz Ruszczyk, 2024-03-12)

Description
---
The second part of the notebook deals with apps interacting with the project, so we need to be able to read data from the SoR or WIP namespace.
Acceptance criteria
---
Jupyter notebook is able to successfully retrieve necessary data from DP
Testing scenarios
---
* The instructions in Notebook are easy to follow
* We are able to read SoR data
* We are able to read WIP data
Technical notes
---
Used services: Search, Storage, Dataset

Mateusz Ruszczyk

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/340
**Ambassador - Service that generates a GCZ Transformer Configuration** (Levi Remington, 2024-03-26)

As a GCZ Developer, I want to develop a service which automates the construction of a GCZ Transformer's configuration (`application.yml`).
This will improve GCZ deployment accessibility and contribute to AdminUI later on.
### Acceptance Criteria
* Ambassador Service developed which dynamically generates an `application.yml` based on a configured OSDU instance
* Endpoint documented.
* Included in Postman Collection

Ankita Srivastava

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/333
**Documentation - Captures AGO/Enterprise Compatibility and Configuration Steps** (Levi Remington, 2024-03-11)

Including a potential pattern for proxy setup, linking the video.
The proxy setup had IIS Extensions as prerequisites; these are available for free but need to be enabled.

David Jacob

---
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/issues/139
**[ADR] API end point to get members count for a particular group** (Om Prakash Gupta, 2024-01-17)

## Status
This ADR is about getting the members count of a particular group. Though we can count the number of members once we get a lis...## Status
* [x] Proposed
* [x] Trialing
* [x] Under review
* [x] Approved
* [ ] Retired
**Context & Scope**
This ADR is about getting the members count of a particular group. Although we can count the number of members once we have the member list, we still need a GetCount API because we don't want the UI to count members for so many groups all the time; that would result in a sluggish experience.
```plaintext
GET https://{{HOST}}/api/entitlements/v2/groups/:group_email/membersCount
Path variable - group_email.
Header - partition ID
Optional Parameter - Role
response
{
"groupEmail": "abc@xyz.com",
"membersCount": 12345
}
The requester must be a member of group service.entitlements.user. All response codes would be as per the current standard in OSDU
validation - validate single partition ID and validate group email belonging to partition.
```
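A sketch of the counting logic and response shape described above (Python for illustration; the membership data structure and helper name are assumptions, not the actual entitlements implementation):

```python
def members_count_response(group_email, members, role=None):
    """Build the membersCount payload for a group.

    'members' maps member email -> role ("MEMBER" or "OWNER"); the optional
    'role' argument scopes the count, mirroring the optional Role parameter
    in the proposed endpoint.
    """
    if role is None:
        count = len(members)                                   # total count
    else:
        count = sum(1 for r in members.values() if r == role)  # role-scoped count
    return {"groupEmail": group_email, "membersCount": count}
```

The real service would resolve `members` from the partition's entitlements store after validating the partition ID and the requester's membership.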
**Trade-off Analysis**
This endpoint would directly respond with the members/owners count for a particular group. Currently, it requires additional coding on the UI side to count from the list and then produce a total.
**Decision**
Just implement a simple API endpoint that would return members/owners count along with group email to identify the group.
**Definition of Done:**
1. The API exists and the member count can be retrieved by a member of the group as well as a member of entitlements users.
2. An additional optional parameter, Role, should scope the count returned: Member, Owner, or total count.
3. If neither membership is present, the API should return 401.
4. The access validations should be consistent with the List Member API.
5. Unit tests are added for \~100% code coverage of the new code.
6. Integration tests are added and are running in the pipeline.
7. The pipeline is running/passing in the same state as master.
8. Tutorial section updated here: https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/docs/tutorial/Entitlements-Service.md?ref_type=heads
9. API doc updated here: https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/docs/api/entitlements_openapi.yaml?ref_type=heads

M23 - Release 0.26 · Deepa Kumari · Om Prakash Gupta

---
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/135
**ADR Provide suggestions for auto-complete of input** (Mark Chance, 2024-01-15)
Shell application developer stakeholders want to provide to their users the functi...# ADR: Autocomplete
<a name="TOC"></a>
[[_TOC_]]
# Status
- [x] Proposed
- [x] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
# Background
Shell application developer stakeholders want to provide their users with auto-complete suggestions based on partial input.
# Context & Scope
Based on words occurring in OSDU platform records, a comparison is made to all text tokens occurring in all fields of a record. For this case we propose using the bagOfWords approach described in the indexer [ADR](https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/113).
[Back to TOC](#TOC)
## Requirements
The partial input is passed to the search service and a list of suggestions is returned.
To be useful, the response time must be under 2 seconds.
[Back to TOC](#TOC)
# Tradeoff Analysis
[Back to TOC](#TOC)
# Proposed solution
The search query JSON will support this syntax:
```json
{
"suggestPhrase": "united"
}
```
This would return something of the form:
```json
{
"phraseSuggestions": [
"United States",
"United States therm",
"United Kingdom",
"United Kingdom British thermal unit",
"United Kingdom term",
    "United Kingdom nautical mile"
  ]
}
```
[Back to TOC](#TOC)
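The intended behaviour can be sketched independently of the underlying search implementation: a suggestion is any known phrase whose leading characters match the partial input. This is illustrative only, not the actual search-service code.

```python
def suggest_phrases(partial, phrases, limit=10):
    """Case-insensitive prefix match of a partial input against known phrases.

    'phrases' stands in for the bag-of-words phrase inventory built at index
    time; the real service would delegate this matching to the search index.
    """
    prefix = partial.lower()
    return [p for p in phrases if p.lower().startswith(prefix)][:limit]
```

A request body of `{"suggestPhrase": "united"}` would thus map to `suggest_phrases("united", ...)`, with the result returned as the `phraseSuggestions` array.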
# Change Management
* Operators may need to execute reindex with force_clean=true action on indices to enable this feature.
# Decision
# Consequences
* The search code changes will not impact any existing queries or functionality since this is a new field.
[Back to TOC](#TOC)
#EOF.

M23 - Release 0.26 · Mark Chance

---
https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/issues/104
**Azure Monitor - Policy service logs not found in Azure App Insights** (Kelly Zhou, 2024-01-03)

Hi,
We found that we can't find any policy service logs in Azure App Insights. Is that by design, or are we missing some configuration? We wonder whether the monitoring of the policy service in the Azure deployment is going well and how the OSDU community manages it. Any response will be much appreciated.
@Srinivasan_Narayanan @nursheikh
Thank you!

Shane Hutchins

---
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/276
**Event Tracking/Timelines** (Noel Okanya, 2024-02-13)

As a GCZ Product Owner, I want to prepare for the below events, so that we can provide the stakeholders with GCZ updates:
1. EAGE Digital in March 24, 2024 (no OSDU topic)
2. ERGIS event in April 24th - 25th, 2024 - Esri
3. OSDU F2F in Europe April 24th week (no action from the team)

---
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/issues/73
**ADR: Worker Service for Wellbore Bulk Data Access** (Kin Jin Ng, 2024-01-10)

## Status
Currently, as of M16, Wellbore DDMS is experiencing performance challenges involving WellLogs operations with large bulk data (>...## Status
- [X] Proposed
- [X] Trialing
- [X] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
Currently, as of M16, Wellbore DDMS is experiencing performance challenges involving WellLog operations with large bulk data (>1 GB), especially on data reading.
It was also observed that Wellbore DDMS requires a significant amount of memory in comparison to the amount of data manipulated to serve incoming requests.
See issues #21 and #27.
Wellbore DDMS is composed of a general main service, which is responsible for handling both client facing API requests, and data access operations to underlying bulk data store.
In turn, the bulk data management implementation in WDDMS is highly based on [Dask](https://www.dask.org/).
For instance, for a large WellLog dataset stored in Wellbore DDMS, the associated data will not be located in an individual parquet file, but rather distributed across several distinct parquet files.
When a request is received to retrieve the bulk data associated with a specific subset of WellLog curves, optionally restricted to a reference range, Dask is used to process all of the parquet files across which the queried data is stored and to extract the cropped data corresponding to the selected curves and range from the WellLog dataset.
All operations in this workflow are executed end to end in the same container for a given request.
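The read path just described — select a subset of curves, optionally crop to a reference range, reconcile rows across several file chunks — can be sketched without Dask. This is a pure-Python illustration with a deliberately simplified chunk layout, not the actual WDDMS bulk-persistence code.

```python
def read_welllog(chunks, curves, ref_range=None):
    """Reassemble selected curves from partitioned storage.

    'chunks' is a list of dicts mapping curve name -> list of (reference, value)
    pairs, standing in for the per-parquet-file column chunks. Rows are cropped
    to ref_range = (start, stop), inclusive, when a range is given.
    """
    result = {c: [] for c in curves}
    for chunk in chunks:
        for curve in curves:
            for ref, value in chunk.get(curve, []):
                if ref_range is None or ref_range[0] <= ref <= ref_range[1]:
                    result[curve].append((ref, value))
    for curve in result:
        result[curve].sort()  # restore reference order across chunks
    return result
```

In the target design, this kind of selection and cropping is exactly the work that would move into the dedicated bulk worker, leaving the main service to handle metadata and access control.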
Though the main-service approach and Dask's capabilities provide a simple and straightforward deployment, previous analysis identified that this pairing places considerable limitations on Wellbore DDMS's performance and scalability.
## Trade-off Analysis
The standard Python framework already offers good support for I/O-bound operations (see [asyncio](https://docs.python.org/3/library/asyncio.html)); it is CPU-bound work and data transformation that are harder to handle, and Dask brings a first answer to that.
For instance, when reading and writing large WellLog datasets, Dask provides a concise and straightforward way to reconcile data from multiple parquet files.
Nevertheless, while Dask appears to be a good solution for heavy computation, in most of WDDMS's supported data query/filter scenarios Wellbore DDMS is primarily constrained by I/O operations rather than by data transformation.
Additionally, Dask proved inefficient when handling many queries involving smaller amounts of data, as its minimum required memory footprint does not scale down with the smaller data volumes.
The Dask cluster is implemented as a process-based local cluster, which also brings several issues:
- Dask workers are internal to the pods and therefore cannot be shared with other WDDMS service instances.
- Scaling/resource requests are made indirectly through WDDMS, not on the Dask workers themselves.
- Dask workers are actually process forks of WDDMS, which leads to unnecessary memory usage even at startup or when idle.
Finally, we spotted several memory leaks within Dask, and there are [several open memory-management issues in Dask's GitHub](https://github.com/dask/distributed/issues?q=is%3Aissue+is%3Aopen+label%3Amemory+).
## Decision
Dask remains a great tool, but it does not fit the needs of WDDMS. Therefore Dask will be removed and replaced by a new dedicated service responsible for bulk data access only, called the _wddms bulk data worker service_.
The _wddms bulk data worker service_ will be specialized in bulk I/O and bulk data manipulation (transformation, filtering), while the WDDMS main service will keep all domain knowledge and responsibility, such as metadata manipulation and consistency rules, but will delegate bulk data operations to the _wddms bulk worker service_.
The _wddms bulk worker service_ will not use Dask at all. This means the [current bulk data access layer](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/wellbore/wellbore-domain-services/-/tree/master/app/bulk_persistence) in WDDMS will not be moved as-is into the new dedicated service, but reworked and tailored to WDDMS's specific needs.
The image below illustrates side by side how scaling and workload distribution occur in the current and the target designs.
In the current implementation, an incoming request to retrieve a large amount of data is limited to the Dask worker resources of a single WDDMS pod, even though Dask workers from other WDDMS instances might be available.
In the target design, unlike the current architecture, all processing capacity of the _wddms bulk worker_ instances is available to any WDDMS instance. That arrangement unlocks better scaling, as it is applied directly to the bulk data workers as needed.
![scaling_view_worker_next](/uploads/921e4f3f506570bafabf38a917dbc3c7/scaling_view_worker_next.jpg){: width="60%"}
### Security Implications
In the current design, the authorization (ACL/policy) checks and the bulk data access operations in WDDMS are performed in the same service instance. Bulk data will only be served to valid users entitled to access the associated work product component record.
The changes proposed in this ADR separate the data access control layer, located in the main WDDMS service, from the bulk data access itself, located in the new _wddm bulk worker service_. See below, the changes in the communication patterns in the current vs target design diagrams.
Allowing users or other services to directly access the _wddms bulk worker service_ endpoint would permit bypassing the data access control checks in the main WDDMS service.
Therefore, with the new topology, additional deployment configuration settings will be required to preserve compliant and secure data access control in WDDMS:
- the _wddms bulk worker service_ must not be accessible from the external network
- the _wddms bulk worker service_ will only accept requests from WDDMS main service instances
#### Current
![threat_model_current](/uploads/8ef5bf06976e23ad45d2a243064c3e8c/threat_model_current.jpg){: width="60%"}
#### Target
![threat_model_target](/uploads/de4da75c833d44e8503eba6647d0ec98/threat_model_target.jpg){: width="60%"}

Chad Leong

---
https://community.opengroup.org/osdu/platform/system/home/-/issues/103
**ADR: Persisting / Querying Status messages** (devesh bajpai, 2024-03-26)

# Decision Title
-...# Decision Title
Persisting / Querying Status messages to do a post query or analysis on status of a workflow that is either executing or was executed in past.
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Overview
With ADR #80, only real-time status updates were possible, via subscribing to events. However, there was no mechanism to run a post-hoc query or analysis on the status of a workflow that is either executing or was executed in the past. This is the problem statement the current ADR is trying to address, and it can be done by persisting the status events and exposing a type-safe API on top of this persisted content.
## Context & Scope
ADR #80 describes the Global Status Monitoring framework, which provides a mechanism to track the status of data journeys/dataflows on the data platform. More details can be found here: https://community.opengroup.org/osdu/platform/system/home/-/issues/80#
Data journey/Dataflows - A typical Dataflow can be expressed as shown
![Untitled_Diagram.drawio](/uploads/d524fc5901b602ff2e19bcfa45478124/Untitled_Diagram.drawio.png)
A dataflow could have millions of records spread across multiple datasets to ingest into the Data Platform. A consumer of this status model should have a way to decide:
* whether the dataflow has finished or not
* whether it succeeded or failed
* if it failed, why? what's the reason?
* what stage it is currently in
The most important aspect of ADR #80 was agreeing on the contract of the status message, to ensure that every OSDU service emitting a status message abides by the contract and there is a standard data model. Please view the raised MR for the same.
This MR defines two types of GSM status messages:
* Dataset Details - a dataset pertains to any data, such as a file, a collection of files, etc. The DatasetId is a metadata record id returned in the response by the Metadata API when the metadata record is created. Users can use the Dataset Id to find the correlation id of initiated workflows and track their progress using status detail messages.
* Status Details - holds the status of the multiple stages in an initiated dataflow.
As the decision of that ADR, it was agreed that every OSDU service will publish its status to a message queue, against a CorrelationId. Consumers can simply subscribe to that message queue or notification service to get status-change events for that CorrelationId. All OSDU services make use of a CorrelationId and propagate it on further REST calls; we leverage the CorrelationId to tie together all related status-change notifications.
The same ADR also mentioned that a status data processor service could be built to listen to these status-changed events and put them into a persistent store, making them accessible for future reference through querying capabilities.
## Solution
This ADR proposes contributing two services to the OSDU community.
[Status Collector Service](https://community.opengroup.org/osdu/platform/system/status-collector)
Status Collector is an internal service (not exposed for external calls) that reacts to status messages published to the status message queue. It picks up messages published from every stage (OSDU service) of the process and normalizes them for storage in persistent storage for future reference.
[Status Processor Service](https://community.opengroup.org/osdu/platform/system/status-processor)
Status Processor Service provides APIs that allow users to query persisted status with multiple filters like correlationId, recordId, stage and status.
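The Status Processor's filtered query can be sketched as follows (Python for illustration; the record shape simply mirrors the filters named above — correlationId, recordId, stage, status — and is not the actual persisted schema):

```python
def query_status(records, **filters):
    """Return persisted status records matching all provided filters.

    Supported filter keys mirror the API: correlationId, recordId, stage, status.
    Filters set to None are ignored, so callers can combine any subset.
    """
    return [
        r for r in records
        if all(r.get(key) == value for key, value in filters.items() if value is not None)
    ]
```

A real implementation would push these predicates down into the persistent store's query engine rather than filtering in memory.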
![collector-Page-2.drawio](/uploads/cb271b507c45834220b9d7ba32fb43ab/collector-Page-2.drawio.png)
Status Processor endpoints
The Status Processor service will support the operations listed below via different endpoints:
Query Dataset details
![image-2023-5-28_17-10-5](/uploads/a4adf3a935c1c1439bafb5f51958fab8/image-2023-5-28_17-10-5.png)
![image-2023-5-28_17-10-53](/uploads/fe17ce5edcaf27ac61456b984bfeb1bf/image-2023-5-28_17-10-53.png)
Query Status
![worddavf79f5c218b10163e8f8a12f5087a2ac1](/uploads/fd3fd8e4e608fcff39d53354a16933ef/worddavf79f5c218b10163e8f8a12f5087a2ac1.png)
![worddava18d304169b02aa8fca8394959add2f3](/uploads/ead6c6245f9bbd047f462d789c58190d/worddava18d304169b02aa8fca8394959add2f3.png)
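A client-side sketch of calling these query endpoints follows. The `/status` path and parameter names mirror the filters described above, but the exact API contract should be confirmed against the service's OpenAPI definition; the base URL is a placeholder.

```python
from urllib.parse import urlencode

def build_status_query(base_url: str, **filters) -> str:
    """Build a query URL for the Status Processor.

    Parameter names (correlationId, recordId, stage, status) follow
    the filters described in this ADR; the '/status' path is an
    assumption to be checked against the service's OpenAPI spec.
    """
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{base_url}/status?{urlencode(params)}"

# Example: query all status entries for one workflow's CorrelationId
url = build_status_query(
    "https://osdu.example.com/api/status-processor/v1",  # placeholder base URL
    correlationId="9e1c4e74-3b9b-4b7e-9d2a-1f0c2a6d8e5f",
    stage="INDEXER",
)
```

The resulting URL can be issued with any HTTP client, passing the usual OSDU `Authorization` and `data-partition-id` headers.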
## Decision
Implement services to persist and query GSM messages.
## Rationale
At present, only real-time status updates are possible, by subscribing to events. There is no mechanism to query or analyze the status of a workflow after the fact, whether it is currently executing or was executed in the past.
## Consequences
## When to revisit
---
# Tradeoff Analysis - Input to decision
## Alternatives and implications
## Decision criteria and tradeoffs
## Decision timeline

---

# ADR: Make OPA configuration dynamic updatable
https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/issues/96

## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [x] Approved
- [ ] Retired
## Context
OSDU has adopted Rego as the language for defining policies and [Open Policy Agent](https://www.openpolicyagent.org/docs/latest/) as the internal solution for managing and enforcing them. To enforce a policy, OSDU services call the Policy Service, which internally calls the OPA API. Some services (e.g. Storage) bypass the Policy Service and make low-level calls to OPA directly.
Today the OPA configuration is strictly managed by CSPs, generally via a [kubernetes config map](https://kubernetes.io/docs/concepts/configuration/configmap/). Because this configuration is static and can only be updated from the backend, it breaks the ability to add a partition through the [partition](https://community.opengroup.org/osdu/platform/system/partition) create API.
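For context, a static OPA ConfigMap of the kind CSPs manage today looks roughly like the fragment below. The ConfigMap name, namespace, and bundle layout are illustrative assumptions; real deployments vary by CSP.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-agent          # name assumed; varies by deployment
  namespace: osdu
data:
  config.yaml: |
    services:
      osdu:
        url: https://bundle-server.example.com   # placeholder bundle server
    bundles:
      osdu/partition1:                 # one bundle entry per data partition
        resource: bundle-partition1.tar.gz
        polling:
          min_delay_seconds: 60
          max_delay_seconds: 120
```

Adding a new data partition means hand-editing this file and redeploying, which is exactly what the partition create API cannot do today.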
As a result, once a new partition is created, the following services become impacted:
- Storage
- Search
Any service that depends on the above is also impacted, including but not limited to:
- Indexer
- [seismic-dms-suite seismic-store-service v4](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/tree/master/app/sdms-v4)
For additional context see the following issues and links:
- https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/issues/94
- [Support Multi Partition Policies in OPA](https://community.opengroup.org/osdu/platform/security-and-compliance/policy/-/wikis/Support-Multi-Partition-Policies-in-OPA)
The workaround:
- Requires backend access and manual updates to the OPA configuration. See [workaround](https://osdu.pages.opengroup.org/platform/security-and-compliance/policy/bundles/#adding-a-new-partition-to-osdu)
## Scope
Implement APIs to manage OPA configuration.
## Solution
Update the Policy Service /bootstrap API to also create, update and manage the configmap for OPA.
![image](/uploads/d7b7a0791ef1afb1897a067abdc0996f/image.png)
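The /bootstrap update could be sketched as a pure transformation of the OPA config dict, which the service would then serialize back into the ConfigMap via the Kubernetes API. The bundle key and resource naming convention below are assumptions for illustration, not the service's actual layout.

```python
import copy

def add_partition_bundle(opa_config: dict, partition_id: str) -> dict:
    """Return a new OPA config dict with a bundle entry for partition_id.

    Bundle naming and polling intervals are illustrative assumptions;
    real deployments should follow their bundle server's layout.
    """
    cfg = copy.deepcopy(opa_config)        # never mutate the live config
    bundles = cfg.setdefault("bundles", {})
    bundles[f"osdu/{partition_id}"] = {
        "resource": f"bundle-{partition_id}.tar.gz",
        "polling": {"min_delay_seconds": 60, "max_delay_seconds": 120},
    }
    return cfg

# The bootstrap handler would write the result back to the ConfigMap,
# e.g. via CoreV1Api().patch_namespaced_config_map in the k8s client.
updated = add_partition_bundle(
    {"services": {"osdu": {"url": "https://bundles.example.com"}}},
    "partition2",
)
```

Keeping the transformation pure makes it easy to test the config change independently of cluster access, with the Kubernetes write isolated in one place.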
## Consequences
- Kubernetes permissions to allow read and update of OPA config map (opa-agent) will be required.
- CSPs will need to refrain from updating the config map manually once it is created and managed by the service.
## Futures
- At a later date the partition service could be configured to call the policy bootstrap API to remove the burden of having to call an additional API.

(Milestone: M23 - Release 0.26)

---

# Data - Load SLB New Zealand data into OSDU
https://community.opengroup.org/osdu/platform/consumption/geospatial/-/issues/254

As a GCZ Product Owner, I want the SLB New Zealand test data set to be loaded, so that the data is available for GCZ to develop with.
Note:
- Suggest Data Manager to load the data on IBM pre-ship (OSDU instance)
- Reach out to Operators for Data Managers who have this expertise
Acceptance Criteria:
- Data loaded to IBM pre-ship
- GCZ ran on loaded data and possible issues captured in the backlog