OSDU Software issues
https://community.opengroup.org/groups/osdu/-/issues (feed updated 2021-11-17T17:08:56Z)

---
**Airflow 2.0 Performance Improvements**
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/213
2021-11-17T17:08:56Z · Kishore Battula

**Topic:** `Airflow 2.0 Performance Improvements`
**Tasks**
- [ ] Airflow to support 10000 parallel DAG runs at any point in time
- [ ] Airflow autoscaling shouldn't disrupt running workflows as part of scale in.
- [ ] Airflow to support 8M queuing capacity
- [ ] Documentation with the necessary configuration to achieve the above performance targets

---
**Azure - 3 issues with pre-ship environment**
https://community.opengroup.org/osdu/platform/pre-shipping/-/issues/128
2022-08-23T11:28:53Z · Kamlesh Todai

Issues with the Azure pre-ship environment.
1) Azure is now using Entitlements v2 and because of this, the requests in the postman collection (from the Platform Validation project) need to be modified.
When Entitlements v1 was being used, the ACL being formed for **Azure had group names owner and viewer instead of owners and viewers (singular vs. plural)**. **So special logic was added** while creating the ACL, e.g.:
```javascript
cp = pm.environment.get("cloud_platform");
if (cp == "Azure") {
    pm.environment.set("New_OwnerDataGroup", "data.default.owner");
    pm.environment.set("New_ViewerDataGroup", "data.default.viewer");
} else {
    pm.environment.set("New_OwnerDataGroup", "data.default.owners");
    pm.environment.set("New_ViewerDataGroup", "data.default.viewers");
}
```
Now that Entitlements v2 is being used, the special logic is no longer needed. The above logic just needs to be replaced with:
pm.environment.set("New_OwnerDataGroup", "data.default.owners");
pm.environment.set("New_ViewerDataGroup", "data.default.viewers");
2) **The second issue is with Schema API**.
In the previous version, and still for other CSPs at this point in time, the following request succeeds:
GET https://{{SCHEMA_HOST}}/api/schema-service/v1/schema/{{data-partition-id}}:wks:master-data--Well:1.0.0
whereas in the Azure pre-ship environment it fails. The request has to be modified to:
GET https://{{SCHEMA_HOST}}/api/schema-service/v1/**schema?id=**{{data-partition-id}}:wks:master-data--Well:1.0.0
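For illustration, a minimal sketch of the two request forms in Python (host, token and partition values are placeholders, not taken from the collection's environment):

```python
import requests

SCHEMA_HOST = "https://schema.example.com"  # placeholder
HEADERS = {
    "Authorization": "Bearer <token>",  # placeholder token
    "data-partition-id": "opendes",
}
KIND = "opendes:wks:master-data--Well:1.0.0"

# Path-style request: still works on other CSPs.
r1 = requests.get(f"{SCHEMA_HOST}/api/schema-service/v1/schema/{KIND}", headers=HEADERS)

# Query-parameter form required by the Azure pre-ship environment.
r2 = requests.get(f"{SCHEMA_HOST}/api/schema-service/v1/schema",
                  params={"id": KIND}, headers=HEADERS)

print(r1.status_code, r2.status_code)
```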
3) **The third issue is with manifest ingestion**.
Manifest ingestion (Osdu_ingest) is failing; it is not able to find the schema definition (**unclear whether this is caused by issue 2**).
Airflow Log:
```
[2021-11-18 17:40:32,490] {taskinstance.py:901} INFO - Executing <Task(ValidateManifestSchemaOperator): validate_manifest_schema_task> on 2021-11-18T17:40:18.417189+00:00
[2021-11-18 17:40:32,492] {standard_task_runner.py:54} INFO - Started process 936 to run task
[2021-11-18 17:40:32,514] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'Osdu_ingest', 'validate_manifest_schema_task', '2021-11-18T17:40:18.417189+00:00', '--job_id', '43250', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/osdu-ingest-r3.py', '--cfg_path', '/tmp/tmpb80lgskv']
[2021-11-18 17:40:32,515] {standard_task_runner.py:78} INFO - Job 43250: Subtask validate_manifest_schema_task
[2021-11-18 17:40:32,516] {cli_action_loggers.py:68} DEBUG - Calling callbacks: [<function default_action_log at 0x7fb91bcae730>]
[2021-11-18 17:40:32,548] {settings.py:233} DEBUG - Setting up DB connection pool (PID 936)
[2021-11-18 17:40:32,548] {settings.py:241} DEBUG - settings.configure_orm(): Using NullPool
[2021-11-18 17:40:32,592] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: Osdu_ingest.validate_manifest_schema_task 2021-11-18T17:40:18.417189+00:00 [running]> airflow-worker-3.airflow-worker.airflow.svc.cluster.local
[2021-11-18 17:40:32,646] {__init__.py:101} DEBUG - Preparing lineage inlets and outlets
[2021-11-18 17:40:32,646] {__init__.py:137} DEBUG - inlets: [], outlets: []
[2021-11-18 17:40:32,745] {validate_manifest_schema.py:74} DEBUG - Manifest data: {'ReferenceData': [{'data': {'AttributionAuthority': 'OSDU', 'Descrption': 'Auto Test', 'Code': 'SpudCode_999207272833', 'Source': 'Auto Test Published/FacilityEventType.1.0.0.xlsx; commit SHA 38615b34.', 'Name': 'Spud_999207272833'}, 'kind': 'opendes:wks:reference-data--FacilityEventType:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:reference-data--FacilityEventType:SPUD_DATE_999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}, {'data': {'AttributionAuthority': 'OSDU', 'Descrption': 'Auto Test', 'Code': 'DEPTH_DATUM_ELEVCode_999207272833', 'Source': 'Auto test Published/VerticalMeasurementPath.1.0.0.xlsx; commit SHA 38615b34.', 'Name': 'DEPTH_DATUM_ELEV_999207272833'}, 'kind': 'opendes:wks:reference-data--VerticalMeasurementPath:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:reference-data--VerticalMeasurementPath:DEPTH_DATUM_ELEV_999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}, {'data': {'AttributionAuthority': 'OSDU', 'Descrption': 'Auto Test', 'Code': 'AliasForWell_999207272833', 'Source': 'Auto Test Published/FacilityEventType.1.0.0.xlsx; commit SHA 38615b34.', 'Name': 'Alias_Auto_Test_999207272833'}, 'kind': 'opendes:wks:reference-data--AliasNameType:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:reference-data--AliasNameType:WELL_NAME_999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}, {'data': {'AttributionAuthority': 'OSDU', 'Descrption': 'Auto Test', 'Code': 'FacilityNameForWell_999207272833', 'Source': 'Auto Test Published/FacilityEventType.1.0.0.xlsx; commit SHA 38615b34.', 'Name': 'FacilityName_Auto_Test_999207272833'}, 'kind': 'opendes:wks:reference-data--FacilityType:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:reference-data--FacilityType:WELL_999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}], 'MasterData': [{'data': {'OrganisationName': 'Auto_Test_999207272833', 'Source': 'AUTO test'}, 'kind': 'opendes:wks:master-data--Organisation:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:master-data--Organisation:Auto_Test_999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}, {'data': {'FacilityID': 'FaciltyIdAutoTest_999207272833', 'NameAliases': [], 'SpatialLocation': {'Wgs84Coordinates': {'features': [{'geometry': {'coordinates': [3.51906683, 55.68101428], 'type': 'Point'}, 'type': 'Feature', 'properties': {}}], 'type': 'FeatureCollection'}}, 'FacilityNameAlias': [{'AliasName': 'Alias_Auto_Test_999207272833', 'AliasNameTypeID': 'opendes:reference-data--AliasNameType:WELL_NAME_999207272833:'}], 'VerticalMeasurements': [{'VerticalMeasurementID': 'Kelly Bushing', 'VerticalMeasurementPathID': 'opendes:reference-data--VerticalMeasurementPath:DEPTH_DATUM_ELEV_999207272833:', 'VerticalMeasurement': 
36.6}], 'GeoContexts': [], 'FacilityEvent': [{'EffectiveDateTime': '1999-06-03T00:00:00', 'FacilityEventTypeID': 'opendes:reference-data--FacilityEventType:SPUD_DATE_999207272833:'}], 'FacilityOperator': [{'FacilityOperatorID': 'FacilityOperatorIdAutoTest_999207272833', 'FacilityOperatorOrganisationID': 'opendes:master-data--Organisation:Auto_Test_999207272833:'}], 'Source': 'AUTO test NL_TNO', 'FacilityName': 'FacilityNameAutoTest_999207272833', 'FacilityTypeID': 'opendes:reference-data--FacilityType:WELL_999207272833:'}, 'kind': 'opendes:wks:master-data--Well:1.0.0', 'legal': {'legaltags': ['opendes-Well-Legal-Tag-Test3460168'], 'otherRelevantDataCountries': ['US']}, 'id': 'opendes:master-data--Well:999207272833', 'acl': {'viewers': ['data.default.viewers@opendes.contoso.com'], 'owners': ['data.default.owners@opendes.contoso.com']}}], 'kind': 'opendes:wks:Manifest:1.0.0'}
[2021-11-18 17:40:32,790] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): schema.osdu-azure.svc.cluster.local:80
[2021-11-18 17:40:33,173] {connectionpool.py:442} DEBUG - http://schema.osdu-azure.svc.cluster.local:80 "GET /api/schema-service/v1/schema/opendes:wks:Manifest:1.0.0 HTTP/1.1" 404 None
[2021-11-18 17:40:33,175] {authorization.py:137} ERROR - {"error":{"code":404,"message":"Schema is not present","errors":[{"domain":"global","reason":"notFound","message":"Schema is not present"}]}}
[2021-11-18 17:40:34,179] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): schema.osdu-azure.svc.cluster.local:80
[2021-11-18 17:40:34,574] {connectionpool.py:442} DEBUG - http://schema.osdu-azure.svc.cluster.local:80 "GET /api/schema-service/v1/schema/opendes:wks:Manifest:1.0.0 HTTP/1.1" 404 None
[2021-11-18 17:40:34,575] {authorization.py:137} ERROR - {"error":{"code":404,"message":"Schema is not present","errors":[{"domain":"global","reason":"notFound","message":"Schema is not present"}]}}
[2021-11-18 17:40:35,580] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): schema.osdu-azure.svc.cluster.local:80
[2021-11-18 17:40:35,830] {connectionpool.py:442} DEBUG - http://schema.osdu-azure.svc.cluster.local:80 "GET /api/schema-service/v1/schema/opendes:wks:Manifest:1.0.0 HTTP/1.1" 404 None
[2021-11-18 17:40:35,831] {authorization.py:137} ERROR - {"error":{"code":404,"message":"Schema is not present","errors":[{"domain":"global","reason":"notFound","message":"Schema is not present"}]}}
[2021-11-18 17:40:35,831] {validate_schema.py:175} ERROR - Error on getting schema of kind 'opendes:wks:Manifest:1.0.0'
[2021-11-18 17:40:35,832] {validate_schema.py:176} ERROR - 404 Client Error: Not Found for url: http://schema.osdu-azure.svc.cluster.local/api/schema-service/v1/schema/opendes:wks:Manifest:1.0.0
[2021-11-18 17:40:35,832] {taskinstance.py:1150} ERROR - There is no schema for Manifest kind opendes:wks:Manifest:1.0.0
```

Assignee: MANISH KUMAR

---
**[GCP] Seismic store doesn't use Partition Service to get a GCP project-id of Google Cloud Project**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/42
2023-03-27T19:16:22Z · Yan Sushchynski (EPAM)

The main problems are the following:
- There are no signs that SSDMS uses the Partition Service at all; it accepts requests with no data-partition-id header.
- When we create an SSDMS tenant, we have to specify `gcpid`, the project where data will be stored if we use this tenant in our `sd-path`.
This causes two problems:
- users have to know the actual `gcpid`
- users can specify a `gcpid` that doesn't correspond to the `data-partition-id`
Example of a create-tenant request:
```json
{
  "gcpid": "{{gcp_project_id}}",
  "esd": "{{data-partition-id}}.osdu-gcp.go3-nrg.projects.epam.com",
  "default_acl": "data.default.owners@{{data-partition-id}}.osdu-gcp.go3-nrg.projects.epam.com"
}
```
The solution is to use the Partition Service to get the GCP project id; then users don't need to specify `gcpid` manually, and the GCP project id is always chosen correctly.
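A minimal sketch of the proposed lookup, assuming the standard Partition Service API; the `projectId` property key is illustrative and depends on the deployment:

```python
import requests

PARTITION_HOST = "https://partition.example.com"  # placeholder

def resolve_gcp_project_id(data_partition_id: str, token: str) -> str:
    """Resolve the GCP project id for a partition instead of trusting a user-supplied gcpid."""
    resp = requests.get(
        f"{PARTITION_HOST}/api/partition/v1/partitions/{data_partition_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    properties = resp.json()
    # The property key "projectId" is an assumption; partition properties are
    # returned as {"sensitive": ..., "value": ...} objects.
    return properties["projectId"]["value"]
```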
cc:
@Kateryna_Kurach @Siarhei_Khaletski

Milestone: M13 - Release 0.16

---
**Add documentation - explain whether all fields from Document Store are replicated to Index Store**
https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/44
2022-09-29T13:41:06Z · Debasis Chatterjee

It would be very useful to have a section providing some background on this subject.
There are times when we see a list of fields from the Storage service (GET), but some fields are missing from the Search service (Query).
I recently experienced this when working with a custom schema (CSV Ingestion test case).
**Recent thread** with @nthakur
> Hi Debasis,
>
> In general, if there are no indexing errors and a field is defined in the Schema (the Schema from the Schema service for the kind), then the Indexer will index it. Fields included in Storage service records may or may not have a definition in the Schema, so you may see a lot more fields on Storage records. If there are errors, then the index.trace block in the Search service response for the record will tell you which properties were skipped over.
>
> https://community.opengroup.org/osdu/platform/system/indexer-service/-/blob/master/docs/tutorial/IndexerService.md#get-indexing-status
>
> Please let me know if you have any additional question.
>
> Regards,
> Neelesh
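For reference, the indexing-status check described in the linked tutorial boils down to asking Search for the record's index block; a rough sketch (host, token and partition values are placeholders):

```python
import requests

SEARCH_HOST = "https://search.example.com"  # placeholder

def get_index_trace(kind: str, record_id: str, token: str) -> dict:
    """Fetch a record's index block; index.trace lists properties skipped during indexing."""
    body = {
        "kind": kind,
        "query": f'id:"{record_id}"',
        "returnedFields": ["index"],
    }
    resp = requests.post(
        f"{SEARCH_HOST}/api/search/v2/query",
        json=body,
        headers={"Authorization": f"Bearer {token}", "data-partition-id": "opendes"},
    )
    resp.raise_for_status()
    return resp.json()
```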
cc - @ChrisZhang, @ethiraj and @sehuboy for information

---
**Provide list of successfully created IDs in XCom summary from Airflow console**
https://community.opengroup.org/osdu/platform/data-flow/ingestion/csv-parser/csv-parser/-/issues/63
2021-12-26T19:05:29Z · Debasis Chatterjee

In my test case, it shows me 3 records created successfully.
But it does not show these IDs in the XCom summary (Saved IDs). The grid is there, but it is empty.
![IBM-Airflow-Console-CSV-Xcom-summary](/uploads/1c7d350e673afab58cdf274f69dcd3ab/IBM-Airflow-Console-CSV-Xcom-summary.PNG)
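For context, the XCom grid is populated from values a task pushes explicitly; a minimal Airflow sketch of the expected behavior (DAG, task and key names are illustrative, not the CSV parser's actual code):

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def save_records(ti):
    # In the real parser these IDs would come from the Storage service response.
    saved_ids = ["opendes:wks:record-1", "opendes:wks:record-2", "opendes:wks:record-3"]
    # Without an explicit xcom_push (or a returned value), the "Saved IDs"
    # grid in the XCom view stays empty even though records were created.
    ti.xcom_push(key="saved_record_ids", value=saved_ids)

with DAG("csv_xcom_demo", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    PythonOperator(task_id="save_records", python_callable=save_records)
```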
Let me know if you need any supporting information.
Thank you

---
**ENI requirement - search from External Data source without persisting into local Data Platform**
https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/core-external-data-workflow/-/issues/1
2022-01-10T17:52:52Z · Debasis Chatterjee

Source - Marco Piantanida (ENI) @marco.piantanida
This sequence diagram describes our desired way of performing the search: the user is connected to OSDU and invokes the OSDU search syntax, but the search itself is delegated to the external data source through a wrapper that translates the OSDU query syntax into the specific query syntax of the proprietary data platform. The search is then performed on the fly by the proprietary data platform (thereby applying all the complex entitlement rules implemented within that platform), which returns the results to the wrapper, and the wrapper translates the results into an OSDU WKS.
This differs from the currently implemented EDS behavior only in the search mechanism; the data retrieval mechanism currently implemented in EDS fits our needs.
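A rough sketch of the wrapper role described above (all class, field and endpoint names are hypothetical; this is not the EDS implementation):

```python
import requests

class ExternalSearchWrapper:
    """Hypothetical wrapper: translates an OSDU query into the proprietary
    platform's syntax, runs it there, and maps the results back to OSDU WKS."""

    def __init__(self, external_url: str):
        self.external_url = external_url

    def translate_query(self, osdu_query: dict) -> dict:
        # A real translation would map OSDU's query syntax to the external
        # platform's query language; this just forwards the raw query string.
        return {"q": osdu_query.get("query", ""), "limit": osdu_query.get("limit", 10)}

    def search(self, osdu_query: dict) -> list:
        native = self.translate_query(osdu_query)
        # Entitlement rules are enforced by the external platform itself.
        results = requests.post(self.external_url, json=native).json()
        return [self.to_wks(hit) for hit in results.get("hits", [])]

    def to_wks(self, hit: dict) -> dict:
        # Map a native result into an OSDU WKS-shaped record (illustrative).
        return {"kind": "osdu:wks:master-data--Well:1.0.0", "data": hit}
```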
Enclosed is the actual email thread from Marco on 18-Nov-2021
[2021_11_18-Marco-ENI-EDS-requirement.docx](/uploads/4447ffb62b410fa362e6a9be5043fcbe/2021_11_18-Marco-ENI-EDS-requirement.docx)
cc - @AshishSaxenaAccenture and @jrougeau for information

---
**Error Azure Infra setup on step "Deploy Monitoring Resources"**
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/214
2021-11-23T08:53:37Z · Sergey Zemskov

I have successfully completed all the required steps before:
- the `common_prepare.sh` script executed without errors or warnings
- the `.envrc` file contains all the necessary parameters
I get this error while executing the deployment with `terraform apply -var-file custom.tfvars`:
```
Error: Error creating or updating Scheduled Query Rule "airflow-import-errors-alert-osdu-mvp-mrdemo-e5sm" (resource group "osdu-mvp-mrdemo-e5sm-rg"): insights.ScheduledQueryRulesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="Scope 'osdu-mvp-crdemo-dz5-ai' does not exists"
on main.tf line 239, in resource "azurerm_monitor_scheduled_query_rules_alert" "alerts":
239: resource "azurerm_monitor_scheduled_query_rules_alert" "alerts" {
```

---
**Error installing Helm Chart for OSDU on Azure Airflow**
https://community.opengroup.org/osdu/platform/deployment-and-operations/helm-charts-azure/-/issues/7
2021-11-24T08:30:02Z · Sergey Zemskov

It looks like some commands in the `yaml` are deprecated.
After running the command `helm install airflow osdu-airflow -n $NAMESPACE -f osdu_airflow_custom_values.yaml` I get this error:
```
After running this command `helm install airflow osdu-airflow -n $NAMESPACE -f osdu_airflow_custom_values.yaml` I get error:
```
W1123 16:50:24.601955 9797 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role
W1123 16:50:24.769256 9797 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding
W1123 16:50:26.277249 9797 warnings.go:70] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W1123 16:50:28.803284 9797 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role
W1123 16:50:28.971781 9797 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding
W1123 16:50:29.706011 9797 warnings.go:70] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Error: failed post-install: timed out waiting for the condition
```

---
**One should be able to delete the created schema**
https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/78
2022-05-05T08:30:39Z · Kamlesh Todai

At present, the API seems to be missing delete functionality. Before the Schema Service, when the Storage API was used, there was an option to delete a schema, but in the Schema Service that option is not present. So when custom schemas are created for testing, it is difficult to clean up after the test.

---
**Integration strategy with osdu-cli**
https://community.opengroup.org/osdu/ui/data-loading/wellbore-ddms-data-loader/-/issues/29
2022-08-23T15:55:55Z · Chad Leong
As a user, I want to be able to use a single utility to load all my data (reference, master, well logs, etc.). This single utility is presently available at https://community.opengroup.org/osdu/platform/data-flow/data-loading/osdu-cli.
We want to be able to merge this project into osdu-cli. What is the best/least-friction way to achieve this?

---
**ADR: Registering Schema Extensions**
https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/79
2022-08-31T15:33:29Z · Paresh Behede

**Context & Scope**
Here we are proposing a new set of APIs on the OSDU Schema service that allow for the creation, update, retrieval and deletion of extensions on existing schemas.
This ADR carries on from the ADR [here](https://community.opengroup.org/osdu/platform/system/search-service/-/issues/69), allowing the x-osdu-virtual-property entries defined in schemas to also be applied as extensions to schemas by services and applications running on top of OSDU.
**Trade-off Analysis**
An important assumption we make is that the provider of an extension is the only concrete consumer of that extension. We shall call the provider the schema extension authority. This authority defines the scope its extension is governed at.
The alternative is to assign versioning to extensions. However, this faces potential problems of schema bloat if many versions of an extension can exist for a schema, and it also increases the friction of adoption for client applications, when the whole point of extensions is to give them the freedom and flexibility to use OSDU schemas in their own context.
By allowing virtual properties to be assigned as extensions, services like indexer/search stay loosely coupled to these extensions: they only need to use them in a generic way (e.g. to index an extension property), and no consumer has a hard coupling to the specific properties provided unless it chooses to. This is important because it allows the extension authority to change its extensions when needed without worrying about breaking changes outside its own scope.
This also gives the extension authority flexibility, since the versioning semantics of the schema are not enforced on its extensions.
**Decision**
A new set of APIs will be developed in the schema service that allows clients to register their own extensions on top of existing schemas.
For now this will only allow users to add their own _"x-osdu-virtual-properties"_ extensions to existing schemas; theoretically, however, it could be extended to any extension use case.
The extension APIs can append new virtual properties only; they cannot override or change existing virtual properties defined in the schema.
When clients retrieve schemas from the existing schema service, the service appends the extensions into the schema transparently to the client.
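A sketch of that append behavior (illustrative only, not the service implementation):

```python
def apply_extensions(schema: dict, extensions: list) -> dict:
    """Append registered extensions to a schema, grouped by authority,
    without touching the schema's own x-osdu-virtual-properties."""
    merged = dict(schema)
    grouped = merged.setdefault("x-osdu-extensions", {})
    for ext in extensions:
        grouped[ext["authority"]] = {
            "x-osdu-virtual-properties": ext["x-osdu-virtual-properties"]
        }
    return merged
```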
For example, imagine the schema osdu:wks:master-data--Well:1.0.0 contains the following virtual property definition:
```json
{
  "x-osdu-virtual-properties": {
    "data.VirtualProperties.DefaultLocation": {
      "type": "object",
      "priority": [
        { "path": "data.ProjectedBottomHoleLocation" },
        { "path": "data.GeographicBottomHoleLocation" },
        { "path": "data.SpatialLocation" }
      ]
    }
  }
}
```
Then suppose we provide an extension like the one below via the new extensions _POST_ API:
```json
"kind": "osdu:wks:master-data--Well:1.0.0",
"authority": "MyApplication"
"x-osdu-virtual-properties":{
"data.VirtualProperties.MyDefaultName": {
"type": "string",
"priority": [
{ "path": "data.FacilityName", "isType": "string" }
]}
}
```
A client requesting the schema _"osdu:wks:master-data--Well:1.0.0"_ would then get a result that contained the following.
```json
"schema" {
...
...
{
"x-osdu-virtual-properties":{
"data.VirtualProperties.DefaultLocation": {
"type": "object",
"priority": [
{ "path": "data.ProjectedBottomHoleLocation" },
{ "path": "data.GeographicBottomHoleLocation" },
{ "path": "data.SpatialLocation" }
]}
}
...
...
"x-osdu-extensions": {
"MyApplication": {
"x-osdu-virtual-properties": {
"data.VirtualProperties.MyDefaultName": {
"type": "string",
"priority": [{
"path": "data.FacilityName",
"isType": "string"
}]
}
}
}
}
}
```
Here an extensions object is added to the schema, grouped by the `authority`.
However, if someone tried to register the following extension:
```json
"kind": osdu:wks:master-data--Well:1.0.0
"x-osdu-virtual-properties":{
"data.VirtualProperties.DefaultLocation": {
"type": "object",
"priority": [
{ "path": "data.ProjectedBottomHoleLocation" }
]}
}
```
it would fail, as 'DefaultLocation' is already declared as a virtual property in the schema.
The storage of supplied extensions is always system-wide. This means any extensions registered apply to all partitions; however, they are scoped by the schema they are applied to.
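A sketch of the two registration rules above, uniqueness of (kind, authority) and append-only virtual properties (illustrative, assuming the request shape from the POST example):

```python
def validate_extension_request(request: dict, schema: dict, registry: set) -> None:
    """Reject duplicates of (kind, authority) and any attempt to override
    a virtual property already defined in the schema itself."""
    key = (request["kind"], request["authority"])
    if key in registry:
        # Corresponds to the 409 response in the API spec below.
        raise ValueError("Extension with the same 'kind' and 'authority' already created")
    existing = schema.get("x-osdu-virtual-properties", {})
    for prop in request["x-osdu-virtual-properties"]:
        if prop in existing:
            raise ValueError(f"'{prop}' is already declared as a virtual property in the schema")
```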
**API Spec**
Below is the API spec for the schema extensions
```yaml
paths:
"/schema/extensions":
post:
tags:
- Schema
summary: Adds a schema extension to the schema repository.
description: Adds a schema extension to the schema repository. The extension is identified by a combination of the 'kind' and 'authority' properties assigned on the request and must be unique. Scope of an extension is always SHARED. Requires the 'users.datalake.editors' or 'users.datalake.admins' role to create a schema extension.
operationId: Create Schema extension
parameters:
- $ref: "#/components/parameters/data-partition-id"
requestBody:
content:
application/json:
schema:
$ref: "#/components/schemas/SchemaExtensionRequest"
required: true
responses:
"201":
description: "Schema extension created"
headers:
location:
description: "Path of newly created schema extension."
schema:
type: "string"
content:
application/json:
schema:
$ref: "#/components/schemas/SchemaExtensionResponse"
"400":
description: "Bad request"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"401":
description: "Unauthorized"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"403":
description: "Forbidden"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"409":
description: "Extension with the same 'kind' and 'authority' already created"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
security:
- bearer: []
- appkey: []
deprecated: false
put:
tags:
- Schema
summary: Updates a schema extension in the schema repository.
description: Updates a schema extension. The extension is identified by a combination of the 'kind' and 'authority' properties assigned on the request and must be unique. Scope of an extension is always SHARED. Requires the 'users.datalake.editors' or 'users.datalake.admins' role to update a schema extension.
operationId: Update Schema extension
parameters:
- $ref: "#/components/parameters/data-partition-id"
requestBody:
content:
application/json:
schema:
$ref: "#/components/schemas/SchemaExtensionRequest"
required: true
responses:
"200":
description: "Schema extension updated"
headers:
location:
description: "Path of updated created schema extension."
schema:
type: "string"
content:
application/json:
schema:
$ref: "#/components/schemas/SchemaExtensionResponse"
"400":
description: "Bad request"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"401":
description: "Unauthorized"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"403":
description: "Forbidden"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
security:
- bearer: []
- appkey: []
deprecated: false
get:
tags:
- Schema
summary: "Gets schema extensions from the schema repository."
description: "Gets schema extensions. Required roles 'users.datalake.viewers' or 'users.datalake.editors' or 'users.datalake.admins' groups."
operationId: Gets Schema extensions
parameters:
- in: query
name: kind
schema:
type: string
required: false
description: "The kind to retrieve extensions for"
- in: query
name: authority
schema:
type: string
required: false
description: "The authority to retrieve extensions for"
- in: query
name: id
schema:
type: string
required: false
description: "The ID to retrieve the extension for"
responses:
"200":
description: "OK"
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/SchemaExtensionResponse"
"400":
description: "Bad request"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"401":
description: "Unauthorized"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"403":
description: "Forbidden"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
security:
- bearer: []
- appkey: []
deprecated: false
delete:
tags:
- Schema
summary: "Deletes a schema extensions"
description: "Deletes a schema extensions. Required roles 'users.datalake.admins' groups."
operationId: Deletes a schema extension
parameters:
- in: query
name: id
schema:
type: string
required: true
description: "The ID to delete the extension for"
responses:
"204":
description: "No content"
"400":
description: "Bad request"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"401":
description: "Unauthorized"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
"403":
description: "Forbidden"
content:
application/json:
schema:
$ref: '#/components/schemas/ErrorResponseFormat'
security:
- bearer: []
- appkey: []
deprecated: false
schemas:
SchemaExtensionRequest:
type: object
title: SchemaExtensionRequest
properties:
kind:
type: string
example: "osdu:wks:master-data--Well:1.0.0"
description: "The schema kind(s) the extension applies to. Can use explicit 'authority','source','type','version' values for one or more parts."
authority:
type: string
example: "OSDU"
description: "The authority who supplied the extension. Can be used to identify who should use the extension. A authority can supply one or more extensions but only one for for each unique kind."
x-osdu-extensions:
type: object
example: {}
description: "The extensions to apply to the given kind"
example:
kind: "osdu:wks:master-data--Well:1.0.0"
authority: "GIS"
x-osdu-extensions:
x-osdu-virtual-properties:
data.VirtualProperties.DefaultPosition:
type: "object"
priority: [{ "path": "data.ProjectedBottomHoleLocation" },{"path": "data.GeographicBottomHoleLocation", "type": "GeoJson" },{"path": "data.SpatialLocation"} ]
SchemaExtensionResponse:
type: object
title: SchemaExtensionResponse
properties:
id:
type: string
description: "The unique identifier for the extension"
example:
id: "e4erg55677hfhrrbe5erveer4=="
kind: "osdu:wks:master-data--Well:1.0.0"
authority: "GIS"
x-osdu-extensions:
x-osdu-virtual-properties:
data.VirtualProperties.DefaultPosition:
type: "object"
priority: [{ "path": "data.ProjectedBottomHoleLocation" },{"path": "data.GeographicBottomHoleLocation", "type": "GeoJson" },{"path": "data.SpatialLocation"} ]
```
**Example Use case: Adding and consuming an extensions data property**
Up to this point we have looked at how we can map multiple properties to a virtual property to enable the use case of discovery across different kinds.
However another use case for schema extensions is for different consumers to add new properties into schemas and have them also become discoverable.
To enable this we can make use of the new `x-osdu-virtual-properties`. For example, when an application ingests data, it includes the properties from the OSDU schema that the data relates to; however, it could also choose to add its own properties that aren't mentioned in the schema.
This data is then kept in the storage records but is not indexed or known by other consumption services as it is not represented in the schema.
So if I ingest a record from my application like below
```json
{
  "id": "p1:wks:master-data--Well:1234",
  "kind": "osdu:wks:master-data--Well:1.0.0",
  ...
  ...
  "data": {
    "Name": "1234-abc",
    "ExtensionProperties": {
      "petrel-project": "sim1.pet"
    }
    ...
  }
}
```
The property `Name` is indexed as it is part of the referenced schema, but the property `petrel-project` is not part of the schema and so is not indexed.
However, I could use the new `x-osdu-virtual-properties` to assign it as a virtual property so that it is indexed.
```json
{
  "kind": "osdu:wks:master-data--Well:1.0.0",
  "x-osdu-extensions": {
    "authority": "SLB",
    "x-osdu-virtual-properties": {
      "data.ExtensionProperties.petrel-project": {
        "type": "object",
        "priority": [
          {
            "path": "data.ExtensionProperties.petrel-project",
            "isType": "string"
          }
        ]
      }
    }
  }
}
```
As I am using the same virtual property key `data.ExtensionProperties.petrel-project` as the path to the property in the storage record, it is indexed on the same path.
e.g., to search for this property:
```json
{
  "kind": "osdu:wks:master-data--Well:1.0.0",
  "query": "data.ExtensionProperties.petrel-project:'sim1.pet'"
}
```
**Consequences**
- Schema service needs to be extended to support creating, updating and deleting extensions
- Schema service needs to be updated to add an extensions object to schemas returned to clients
- Schema service needs to send notifications when extensions change
- Indexer needs to support schemas with extensions containing `"x-osdu-virtual-properties"`

Assignee: Paresh Behede

---
**Loading WITSML Log data into Wellbore DDMS**
https://community.opengroup.org/osdu/ui/data-loading/wellbore-ddms-data-loader/-/issues/30
2022-08-23T13:29:48Z · Chad Leong

# Introduction
WITSML Well Log data is another logging data format, apart from LAS, developed by Energistics. We want to be able to load this data into the Wellbore DDMS.
# Objective
As a user, I want to be able to load a [WITSML log data](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics-osdu-integration/-/blob/master/energistics/witsml_data/Log.xml) into the Wellbore DDMS.
- The data loading workflow for loading a WITSML file should be identical to loading a LAS file.
1. Parse WITSML - we can integrate this [parser](https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics-osdu-integration/-/blob/master/energistics/src/witsml_parser/energistics/libs/energistics_parsers/witsml_2_0/witsml_2_0_xsd_log.py) available under Energistics.
2. Create the Wellbore record if it does not exist
3. Create the WellLog record
4. Write bulk data to the WellLog record id

---
**Download files from WPC**
https://community.opengroup.org/osdu/ui/data-loading/osdu-cli/-/issues/9
2021-11-25T13:21:12Z · Mark Hewitt

Starting from a query for an OSDU work product component WellLog and a specific Well ID, I would like to download the well logs for that well ID. Support for downloading from the file service is added; it should also support downloading directly by specifying a WPC ID.

---
**Retrieving WITSML Log data from Wellbore DDMS**
https://community.opengroup.org/osdu/ui/data-loading/wellbore-ddms-data-loader/-/issues/31
2022-08-23T10:50:58Z · Chad Leong

---
**Investigate WITSML parser capability**
https://community.opengroup.org/osdu/ui/data-loading/wellbore-ddms-data-loader/-/issues/35
2022-11-29T20:55:12Z · Chad Leong
Investigate the extensibility of the current WITSML parser vs. a new parser for WITSML files that supports Wellbore DDMS.
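For a sense of scope, a minimal sketch of reading curve mnemonics from a WITSML 2.0 Log with the standard library (the element names and namespace reflect my understanding of the 2.0 schema and may need adjusting):

```python
import xml.etree.ElementTree as ET

# WITSML 2.0 namespace; verify against the target documents.
NS = {"witsml": "http://www.energistics.org/energyml/data/witsmlv2"}

def read_log_mnemonics(path: str) -> list:
    """List channel mnemonics from a WITSML 2.0 Log (Log -> ChannelSet -> Channel)."""
    root = ET.parse(path).getroot()
    return [el.text for el in root.findall(".//witsml:Channel/witsml:Mnemonic", NS)]
```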
The parser is here - https://community.opengroup.org/osdu/platform/data-flow/ingestion/energistics-osdu-integration/-/tree/master/energistics

Assignee: Niall McDaid

---
**Ingest of multi-object, cloud-optimized formats into Seismic DDMS**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/13
2023-03-30T16:52:43Z · Greg
a. The sdutil utility can be used to create a seismic dataset within an associated, already created seismic project, but only if the seismic dataset includes exactly one object. It isn’t currently possible to use sdutil to create a seismic dataset comprising an object-store optimized dataset--FileCollection.Bluware.OpenVDS:1.0.0, even if the FileCollection already exists in another object store location or on a local file system.
b. An existing ingest flow provided in the R3M7 release can be used to generate and create a seismic dataset that comprises a dataset--FileCollection.Bluware.OpenVDS:1.0.0, but only if:
i. the ingested source is in segy format
ii. the conversion from segy to OpenVDS S3 format occurs in a container hosted within the data platform

Assignee: Chris Zhang

---
**Sources other than segy require an alternate ingest flow**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/14
2023-03-30T16:53:14Z · Greg

Sources other than segy require an alternate ingest flow. Conversion within data platform containers may lead to scalability issues when very large seismic datasets must be converted, or large numbers of volumes must be converted in parallel, potentially also requiring an alternate ingest flow.
a. A seismic dataset that comprises a dataset--FileCollection.Bluware.OpenVDS:1.0.0 can be created by using the underlying Seismic DMS APIs, together with client-side use of Bluware libraries and utilities. This approach requires orchestration of the required steps by the client of the Seismic DMS, for which an orchestration utility is likely helpful.
b. The current data definition for dataset--FileCollection.Bluware.OpenVDS:1.0.0 provides for inclusion of each object within the FileCollection within the FileCollection’s metadata in the DatasetProperties.FileSourceInfos array. However this approach leads to scalability issues for large FileCollections, which could include tens or hundreds of millions of objects, resulting in resource constraints when attempting to serialize and/or parse a single json object that lists them all.
Related to #13.

Assignee: Chris Zhang

---
**Consumption of seismic datasets**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/15
2023-03-30T16:50:03Z · Greg

As described in this seismic store sdutil issue (#12), the sdutil utility can be used to retrieve a local copy of a seismic dataset, but only if the seismic dataset includes exactly one object. It isn't currently possible to use sdutil to retrieve a local copy of a seismic dataset that comprises an object-store optimized dataset--FileCollection.Bluware.OpenVDS:1.0.0.
A primary consumption use case for such object-store optimized seismic datasets is parallel, streaming-oriented access that avoids local copies of the dataset.
Related to #12.

Assignee: Chris Zhang

---
**SegyImport and OpenVDS DAG**
https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/46
2023-03-27T19:15:37Z · Greg

The OpenVDS DAG should allow header parameters to be passed for the conversion, which can override header information in the Segy. The DAG uses SegyImport, which can accept header parameters (see http://osdu.pages.community.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/tools/SEGYImport/README.html); however, there is no mechanism to pass these header fields to the DAG.

Milestone: M10 - Release 0.13
Assignee: Chris Zhang

---
**Storage API /query/kinds is broken and breaks reindex functionality**
https://community.opengroup.org/osdu/platform/system/storage/-/issues/100
2023-03-13T10:16:44Z · Gary Murphy

**_Takeaway_**
The /query/kinds API has been broken in OSDU Storage for quite a while, and fixing it was not a priority, as Schema Service endpoints were thought to be the successor solution. This is not the case, and /query/kinds needs to work as designed.

**_Summary_**

The context here is the issue to change the Indexer to use Schema Service schemas instead of the original Storage schemas (https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/7). This has been done successfully; however, the original plan to retire the Storage Schema endpoints plus /query/kinds entirely exposed a hole in functionality that needs to be addressed. Essentially, it was thought that fixing /query/kinds would not be needed with the Schema Service, but the use cases where Storage is the source of truth for *in use* kinds were not caught.

**Key Use Case** -- reindexing all kinds
Reindexing all kinds in an Elasticsearch cluster (Reindex All) is an infrequent but vital operation. Cases where it is required include: disaster recovery after Storage records are restored, application of changes to Elasticsearch analyzers, and correction of indices after changes to base OSDU schemas or client schemas.

Disaster Recovery Scenario:
1. All records in Storage (including underlying CosmosDB or FireStore or whatever) are brought back to RPO state.
2. The Search index is not in sync yet with the restored Storage records, so Reindex All is executed.
3. Reindex All should *not* use the Schema service's get-all-schemas endpoint, as that would retrieve every schema ever defined in the installation, including unused and obsolete schemas, which may number in the thousands. Instead, Reindex All needs to use /query/kinds from Storage, which retrieves only those kinds actually in use in Storage.
4. As Reindex All executes, the list of kinds is retrieved from Storage /query/kinds and iterated over, triggering a reindex on each individual kind known to Storage (see the sketch below).

Milestone: M10 - Release 0.13
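A sketch of steps 3-4 above, assuming the Storage /query/kinds endpoint and a per-kind Indexer reindex endpoint (hosts, paths and paging details are placeholders):

```python
import requests

STORAGE_HOST = "https://storage.example.com"  # placeholder
INDEXER_HOST = "https://indexer.example.com"  # placeholder

def reindex_all(headers: dict) -> None:
    """Iterate only the kinds actually in use in Storage and reindex each one."""
    cursor = None
    while True:
        params = {"limit": 100}
        if cursor:
            params["cursor"] = cursor
        page = requests.get(f"{STORAGE_HOST}/api/storage/v2/query/kinds",
                            params=params, headers=headers).json()
        for kind in page.get("results", []):
            # Endpoint shape is illustrative; the point is one reindex call per in-use kind.
            requests.post(f"{INDEXER_HOST}/api/indexer/v2/reindex",
                          json={"kind": kind}, headers=headers)
        cursor = page.get("cursor")
        if not cursor:
            break
```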