Seismic issues
https://community.opengroup.org/groups/osdu/platform/domain-data-mgmt-services/seismic/-/issues
Last updated: 2024-01-24

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/217
Reading concurrently (Vasilii Sinkevich, 2024-01-24)

Hi,
Not an issue, rather a question.
I am getting familiar with VDS and I am trying to read a slice of data from a VDS file. I split the slice (e.g., [1,0:1000,0:500]) into several portions along one axis (e.g., [1,0:200,0:500], [1,200:400,0:500], [1,400:600,0:500], ...) and try to read them with requestVolumeSubset concurrently using the multiprocessing module, but even though the reading in each thread starts simultaneously (confirmed by text output), it looks like the actual reading happens consecutively, one portion after another.
I tried opening the VDS file in the main thread and using the handle in the worker threads (concurrent.futures allows it), and also opening the file separately in each thread. In the first case, reading of each portion starts only after the previous one has finished, as if in a single thread; in the second case, the reads start simultaneously, but each portion takes much longer than normal, so the overall time is the same as in the first case.
So the question is: is there some sort of queueing system for reads in the openvds library, or is it just a limitation of the free version?
Can reading data by pages resolve it?
Sorry, no code snippet as I am not sure if I am allowed to post the code
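For illustration only (not the original code), a minimal sketch of the pattern described above, assuming each worker opens its own VDS handle; the exact `requestVolumeSubset` arguments and the way the result is accessed are simplified and may differ between OpenVDS versions.

```python
# Minimal sketch of the approach described above (illustrative, not the
# reporter's code). Assumes `url` and `connection` are valid; the
# requestVolumeSubset arguments and result access are simplified and may
# need adjusting for your OpenVDS version.
import concurrent.futures
import openvds

def read_portion(url, connection, voxel_min, voxel_max):
    handle = openvds.open(url, connection)   # each worker opens its own handle
    try:
        access_manager = openvds.getAccessManager(handle)
        request = access_manager.requestVolumeSubset(min=voxel_min, max=voxel_max)
        return request.data                  # blocks until this portion has been read
    finally:
        openvds.close(handle)

def read_slice_concurrently(url, connection, portions, workers=4):
    # `portions` is a list of (voxel_min, voxel_max) tuples covering the slice.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_portion, url, connection, lo, hi)
                   for lo, hi in portions]
        return [f.result() for f in futures]
```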
Platform: Windows
API: Python
Thank you,
Vasilii

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/125
Patch dataset name issue (Yan Sushchynski (EPAM), 2024-01-22)

We ran the [collection](https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M22/GC-M22/GC_OSDU_Smoke_Tests.postman_collection.json?ref_type=heads), and this request:
```bash
curl --location --request PATCH 'https://<host>/api/seismic-store/v3/dataset/tenant/m19/subproject/subprojectodi374308/dataset/AutoTest_dsetodi831125?path=autotest_path' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: m19' \
--header 'Authorization: Bearer token' \
--data '{
"dataset_new_name": "autotest_new",
"metadata": {
"f1": "v1",
"f2": "v2",
"f3": "v3"
},
"filemetadata": {
"f1": "v1",
"f2": "v2",
"f3": "v3"
},
"last_modified_date": "Thu Jul 16 2020 04:37:41 GMT+0000 (Coordinated Universal Time)",
"gtags": [
"tag01",
"tag02",
"tag03"
],
"ltag": "m19-SeismicDMS-Legal-Tag-Test7649172",
"readonly": false,
"seismicmeta": {
"kind": "m19:seistore:seismic2d:1.0.0",
"legal": {
"legaltags": [
"m19-SeismicDMS-Legal-Tag-Test7649172"
],
"otherRelevantDataCountries": [
"US"
]
},
"data": {
"msg": "Auto Test sample data patched"
}
}
}'
```
And we get the following error:
```bash
[seismic-store-service] The dataset sd://m19/subprojectodi374308/autotest_path/autotest_new already exists
```
even though there is no such dataset in Seismic at the moment.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/122
V4 API and Postman Collection showcasing the steps/sequence (Debasis Chatterjee, 2024-01-11)

We started to look at the collection provided by AWS:
https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M22/AWS-M22/DDMS%20Seismic/AWS_OSDUR3M22_Seismic_v4_Automated.postman_collection.json
This was apparently created from an initial example from the Dev team (Seismic DDMS).
We are a little unclear about the logical sequence and naming of the folder/requests.
Folder "Schema" is really to create some catalog record (Dataset FileCollection.SegY).
Folder "Connection" is apparently to upload some data files". Should this not be before we can create Dataset record?
Something similar to what we see here, as the sequence of steps.
![image](/uploads/e1579cc87851b5e8995c0892dde824f7/image.png)
Is the need for sdutil completely eliminated? Earlier, we had to upload the data file (SegY) using sdutil to a suitable tenant and sub-project.
Perhaps a **companion document** with the **Postman Collection** would help.
@chad earlier mentioned that the DEV team would probably provide a video showing the steps?
Thank you
cc @spoddar, @kimjiman and @ydzeng

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/32
Implement resumable file transfer (upload and download) (Sacha Brants, 2024-01-11)

Given the size of data in Seismic DMS, users want to be able to resume a file transfer (upload/download).
It should make sure that there are no integrity issues.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/107
[ADR] Hierarchical deletion of datasets (Maggie Salak, 2024-01-05)

# Introduction
We need a way to delete millions of datasets (including metadata and files in blob storage) in Seismic DMS. A single delete operation can include up to 50 million datasets.
The purpose of this ADR is to define the approach to implementing a hierarchical delete feature in SDMS.
# Status
* [x] Initiated
* [x] Proposed
* [x] Under Review
* [ ] Approved
* [ ] Rejected
# Problem statement
SDMS API currently exposes the following endpoints for deleting datasets:
- `DELETE /dataset/tenant/{tenantid}/subproject/{subprojectid}/dataset/{datasetid}`
Deletes a single dataset.
- `DELETE /subproject/tenant/{tenantid}/subproject/{subprojectid}`
Deletes a subproject.
The endpoint deleting a subproject currently does not scale to the required number of datasets. The current implementation also leaves a possibility of an inconsistent state between the metadata and files in blob storage - in case some of the files fail to be deleted, the deletion of metadata associated with these datasets is not reverted.
SDMS currently does not have the functionality of deleting only selected datasets in a subproject, filtered by path, tags, labels, etc.
# Proposed Solution
In short:
- Create new API endpoints to support starting and tracking progress of the asynchronous deletion operation.
- Deploy a new service on k8s that would asynchronously delete datasets.
## Overview
We will introduce the bulk-delete feature as follows:
1. Implement and deploy a separate application to the same K8s cluster: the _deletion service_.
This service will accept the bulk deletion requests from SDMS API, perform the deletion and keep track of the progress of this long-running operation.
2. Add the new endpoint to SDMS API to delete all datasets in a specified path:
`PUT /operations/bulk-delete?sdpath={sdpath}`
Status: 202 Accepted
`sdpath` in the format `sd://tenant/subproject/path`
Response schema:
```json
{
"operationId": "{string}"
}
```
3. Add the new endpoint to SDMS API to view the status and progress of the delete operation:
`GET /operations/bulk-delete/status/{operationid}`
Status: 200 OK
Response schema:
```json
{
"OperationId": "{string}",
"CreatedAt": "{string}",
"CreatedBy": "{string}",
"LastUpdatedAt": "{string}",
"Status": "{string}",
"DatasetsCnt": "{int}",
"DeletedCnt": "{int}",
"FailedCnt": "{int}"
}
```
Headers will contain `data-partition-id` information to check if the user is registered in the partition before retrieving the operation status.
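For illustration, a hedged usage sketch of the two proposed endpoints (they do not exist yet; the host, token and partition values are placeholders):

```python
# Hedged usage sketch of the endpoints proposed in this ADR (not yet
# implemented); host, token and data-partition-id values are placeholders.
import time
import requests

base_url = "https://<host>/api/seismic-store/v3"
headers = {"Authorization": "Bearer <token>", "data-partition-id": "<partition>"}

# Start the bulk delete for every dataset under the given sdpath (expect 202).
resp = requests.put(
    f"{base_url}/operations/bulk-delete",
    params={"sdpath": "sd://tenant/subproject/path"},
    headers=headers,
)
resp.raise_for_status()
operation_id = resp.json()["operationId"]

# Poll the status endpoint until the operation finishes.
while True:
    status = requests.get(
        f"{base_url}/operations/bulk-delete/status/{operation_id}",
        headers=headers,
    ).json()
    if status["Status"] in ("Completed", "Completed with errors"):
        break
    time.sleep(30)
```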
## Details
### Initiating a delete operation
- The new `PUT` endpoint will support the following cases for the dataset path, provided in the `sdpath` parameter:
- `path = /<path>/` - all datasets under the specified path should be deleted.
- path not specified - all datasets in the subproject should be deleted.
If the deletion of the subproject itself (metadata and container) is desired as well, clients should call the delete-subproject endpoint after the bulk delete of datasets completes, to ensure non-blocking deletion of the subproject in case it contains many datasets.
- The endpoint triggers the deletion job and returns the ID of the initiated operation.
- The delete operation is initiated in SDMS by pushing a message onto a queue (Azure Storage queue in case of Azure implementation; a different queuing mechanism can be used by other CSPs); the message contains the `operationId` and the parameters from the original request (tenant, subproject, path).
### Deletion service
Deletion service is a separate component from SDMS API, deployed to the same K8s cluster. The implementation details of the service can be decided by the individual CSPs. This section describes the proposed implementation for Azure.
The source code of the new component will be contributed to the Sidecar solution in the `seismic-store-service` repository.
The logic of the deletion service will work as follows (a rough sketch of this loop follows the list):
- The service consumes the message from the Azure Storage queue and initiates the deletion process.
- All items (dataset IDs and `gcsurl` which determines the location in blob storage) matching the provided subproject and path are retrieved from Cosmos database.
- For each dataset, the deletion service checks if it is locked.
- If yes, the item is discarded from the delete operation.
- If not, the deletion service locks the dataset. The lock value in this case will contain a string indicating that the dataset is locked for deletion (e.g. WDELETE). This will allow another delete operation to delete the dataset if the deletion failed previously. However, it will prevent deletion of datasets locked with a regular write lock which would indicate that it is being actively used.
- The retrieved items are added to storage which allows the deletion service to keep track of the datasets to delete. In the first version of the implementation, the deletion service will store the retrieved datasets in memory.
In a later phase we are planning to use a persistent storage (e.g. Service Bus queue) to store the items to be deleted. This will allow the service to resume deletion after a restart as well as retry deletion for the datasets where it failed.
- The deletion service leverages existing Redis queues to keep track of the overall deletion operation status and progress.
- The deletion service retrieves and deletes the datasets by checking the store containing items to be deleted. In the first version of the implementation it simply iterates over items stored in memory.
- The datasets are processed in batches; for each batch we retrieve the associated blobs from the storage account using the `gcsurl` property of the metadata.
- The blobs from the current batch are deleted.
- We then delete the metadata documents from Cosmos DB, leaving the ones for which the blob deletion was unsuccessful. We consider that the deletion was successful if the blobs were not found as we assume they were deleted earlier.
- The deletion status is updated in Redis after processing every dataset.
- At the end, the status of a completed operation (with errors or without) is saved in Redis.
- The deletion status should not be deleted at this point so that users can query the operation status after completion.
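A rough, illustrative sketch of the loop described above; every helper used here (fetch_datasets, try_lock_for_delete, delete_blobs, delete_metadata, update_status) is hypothetical and only illustrates the flow, not the service proposed for the seismic-store-service repository:

```python
# Rough sketch of the deletion loop described above; all helpers are hypothetical.
BATCH_SIZE = 100  # assumption; the ADR does not fix a batch size

def run_bulk_delete(operation_id, tenant, subproject, path):
    # Dataset ids and gcsurl values matching the subproject/path, from the metadata store.
    datasets = fetch_datasets(tenant, subproject, path)
    # Skip datasets that are already locked; lock the rest for deletion (e.g. WDELETE).
    to_delete = [d for d in datasets if try_lock_for_delete(d)]
    update_status(operation_id, datasets_cnt=len(to_delete), status="In progress")

    deleted = failed = 0
    for start in range(0, len(to_delete), BATCH_SIZE):
        for dataset in to_delete[start:start + BATCH_SIZE]:
            try:
                delete_blobs(dataset["gcsurl"])   # missing blobs count as success
                delete_metadata(dataset["id"])    # only after blob deletion succeeded
                deleted += 1
            except Exception:
                failed += 1                       # metadata kept if blob deletion failed
            update_status(operation_id, deleted_cnt=deleted, failed_cnt=failed)

    update_status(operation_id,
                  status="Completed" if failed == 0 else "Completed with errors")
```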
### Sequence diagram for the deletion operation
![deletion_diagram_osdu](/uploads/b097c46896644e19a7374df96560aabd/deletion_diagram_updated.png)
### Deletion status
The status of delete operations will be saved in Redis.
It will be written by the deletion service (updated with the current progress) and read by SDMS API
(when users request the deletion status).
SDMS API and the deletion service will agree on the naming convention for the key in Redis,
e.g. `deletequeue:status:{operationId}`.
The new `GET` endpoint allowing users to query the status of a delete operation will return the following information:
- **`OperationId`** - ID of the delete operation.
- **`Status`** - Current status of the delete operation; possible values are: 'Not started', 'Started', 'In progress', 'Completed', 'Completed with errors'.
- `CreatedAt` - Timestamp of the creation of the delete operation.
- `CreatedBy` - Entity initiating the delete operation.
- `LastUpdatedAt` - Timestamp of the last status update of the delete operation.
- `DatasetsCnt` - Total number of datasets to be deleted; initially not set, until the enumeration of datasets for deletion is completed.
- `DeletedCnt` - Number of deleted datasets; updated after each dataset processed by the deletion service, after both blobs and metadata are deleted.
- `FailedCnt` - Number of datasets for which the deletion failed; updated after each dataset processed by the deletion service if a failure occurs.
_(only the fields in **bold** are required)_
_(dataset counts will be empty if the status is "not started")_
### Sequence diagram for the deletion status
![deletion_status_diagram](/uploads/52b27cfb56a9942cf7628e81aeb41eec/deletion_status_diagram.png)
# Out of scope / limitations
- Detailed statistics about datasets which failed to be deleted. In the first phase of implementation the deletion status endpoint will provide aggregated statistics as mentioned in the `Deletion status` section. Users will need to refer to logs to find out which datasets failed to be deleted.
- The bulk-delete feature does not guarantee the operation can continue after a restart of the deletion service. It will be up to the different CSPs to determine if there is retry logic for failed datasets or recovery support built into the service.
- Deleting 'orphan' blobs with missing metadata. Files without metadata containing a matching `gcsurl` will not be deleted as part of the delete operation as metadata is the source of truth for which blobs need to be deleted.
- Identifying blobs belonging to a different dataset but located in the same virtual folder as files of another dataset. Since `gcsurl` carries information about the location of files to be deleted, the delete operation will not be able to detect 'unrelated' files erroneously uploaded with the same virtual folder.
# Consequences
The same bulk deletion API endpoints can be implemented by any CSPs besides Azure.
The status endpoint is not CSP-specific. As long as the bulk delete implementation saves
the job status with the same schema to Redis, the status endpoint will work for any other CSP out of the box.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-zgy/-/issues/31
Build fix to allow building the native library with clang 16 (Jon Jenssen, 2024-01-04)

Clang 16 requires an additional `#include <cstdint>` to be added to structaccess.h for the build to work.
See the attached patch.
[clang16_patch_for_structaccess.diff](/uploads/a2dfaf3f64562fe857db97ef344f199e/clang16_patch_for_structaccess.diff)

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-zgy/-/issues/30
reading zgy directly from cloud (Qiang Fu, 2023-12-27)

Is there any example of reading ZGY files on the cloud from providers with OpenZGY, if that is supported?

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/108
[ADR] Hierarchical data distribution statistics based on path - API endpoint (Izabela Kulakowska, 2023-12-21)

# Introduction
We need a solution for retrieving dataset statistics currently consisting of only dataset sizes.
The purpose of this ADR is to define the approach for retrieving the hierarchical data distribution statistics based on a path.
# Status
* [x] Initiated
* [x] Proposed
* [x] Under Review
* [ ] Approved
* [ ] Rejected
# Problem statement
The SDMS API currently exposes the following endpoints for managing the datasets sizes:
- `POST /dataset/tenant/{tenantid}/subproject/{subprojectid}/dataset/{datasetid}/size` - computes the actual dataset size and updates the dataset metadata `computed_size` field.
- (deprecated) `GET /dataset/tenant/{tenantid}/subproject/{subprojectid}/sizes` - fetches the sizes of the datasets based on the metadata field `filemetadata.size`.
# Proposed solution
Create new API endpoint for retrieving the total size value for a dataset, a subfolder and a subproject. The new endpoint would require _viewer_ or _admin_ roles.
## Overview
```bash
GET /dataset/tenant/{tenant}/subproject/{subproject}/size?path={path}&datasetid={datasetname}
```
Path parameters:
- **tenant** - tenant
- **subproject** - subproject
Query parameters:
- **path** - folder path for which the analytics are going to be retrieved [mandatory if query parameter `{datasetid}` is specified]
- **datasetid** - dataset name for which the analytics are going to be retrieved
Response:
HTTP 200
```json
{
"dataset_count": 9999,
"size_bytes": 1024
}
```
- **dataset_count** - number of datasets under a specific subproject/folder
- **size_bytes** - sum of sizes [B] of all datasets under a specific subproject/folder or for a specific dataset
### Examples:
- `GET /dataset/tenant/tenant1/subproject/subproject1/size` - fetch and sum sizes of all datasets in the `subproject1`
- `GET /dataset/tenant/tenant1/subproject/subproject1/size?path=folderA/folderB` - fetch and sum sizes of all datasets under the folder path `folderA/folderB` in subproject `subproject1`
- `GET /dataset/tenant/tenant1/subproject/subproject1/size?path=folderA/folderB&datasetid=file.txt` - fetch the size of the dataset named `file.txt` that resides under the folder path `folderA/folderB` in subproject `subproject1`
## Details
Currently, two fields in the dataset metadata record can store information about the dataset size: `filemetadata.size` and `computed_size`. `filemetadata.size` is being used by the SDK on the client side, `computed_size` is intended to be computed and ingested on the server side.
To make sure the chosen field can be a reliable source of truth, the API endpoint implementation will calculate the sum of dataset sizes based on the `computed_size` field.
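A hedged sketch of the aggregation this endpoint would perform, shown over an in-memory list of metadata records for clarity (the real implementation would aggregate in the database):

```python
# Hedged sketch of the aggregation described above, over an in-memory list of
# dataset metadata records; the real endpoint would aggregate in the database.
def folder_size(records, path_prefix="/"):
    matching = [r for r in records if r.get("path", "").startswith(path_prefix)]
    return {
        "dataset_count": len(matching),
        "size_bytes": sum(r.get("computed_size") or 0 for r in matching),
    }

# Example: datasets under /folderA/folderB in a subproject.
records = [
    {"path": "/folderA/folderB/", "computed_size": 1024},
    {"path": "/folderA/", "computed_size": 2048},
]
print(folder_size(records, "/folderA/folderB/"))  # {'dataset_count': 1, 'size_bytes': 1024}
```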
# Out of scope / limitations
A challenge with using `computed_size` field as a source of truth is that some datasets may not have this property calculated, as currently the only way to update this value is by manually calling the `Compute Size` POST endpoint.
The solution to ensure the reliability of the value of the `computed_size` field will be the subject of a separate ADR.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/221
Return the file size as output of the conversion of segy-vds (Deepa Kumari, 2023-12-14)

Linked issue: https://community.opengroup.org/osdu/platform/data-flow/ingestion/segy-to-vds-conversion/-/issues/17#note_261315
We need to be able to capture the file size of the VDS file which is the output of the segy-vds conversion.
Otherwise there will be an additional call to SDMS service.
However, we'd like to find out if the response could be tailored to add more information.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/117
[ADR] Advanced filters for dataset search (Alexandre Gattiker, 2023-12-04)

# Introduction
We need additional filtering support to be able to filter the `POST /dataset/tenant/{tenantid}/subproject/{subprojectid}` and `PUT /operation/bulk-delete` (added in [!891](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/merge_requests/891/diffs#fafb01a8314993d61fca390beef912c7813278eb)) operations by metadata fields with more complex expressions than a single key-value match.
# Status
* [x] Initiated
* [x] Proposed
* [x] Under Review
* [ ] Approved
* [ ] Rejected
# Problem statement
The SDMS API `POST /dataset/tenant/{tenantid}/subproject/{subprojectid}` currently accepts the following body parameters, among others:
* `search`, a single SQL-like search parameter, for example: `search=name=file%`
* `gtags`, an array of strings matching tags associated with dataset metadata.
The `search` field does not support more than one field, or more than one possible value for a field.
The SDMS API `PUT /operation/bulk-delete` (added in [!891](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/merge_requests/891/diffs#fafb01a8314993d61fca390beef912c7813278eb)) requires a `path` parameter containing `tenantid`, `subprojectid` and `path` but does not support filtering by metadata fields or tags.
For both search and delete, we need to be able to filter by more than one field, or more than one possible value for a field.
Furthermore, we expect a need for more complex filter solutions, such as combining `AND`, `OR` and `NOT` operators. The proposed solution should ideally be extensible to support additional expressions and operators in the future if needed.
# Proposed solution
Add an optional `filter` parameter to the `POST /dataset/tenant/{tenantid}/subproject/{subprojectid}` and `PUT /operation/bulk-delete` API endpoints.
The `search` and `gtags` parameters are to be deprecated.
## Overview
The `filter` parameter can take a payload with a variable format, allowing expressing a simple filter on a single field, as well as logical combinations of filters with arbitrary complexity.
The `POST /dataset/tenant/{tenantid}/subproject/{subprojectid}` operation has been selected for extension because:
* Advanced metadata filtering, encompassing select and search functionalities, has already been incorporated into that operation.
* The SDMS API also accepts the `GET` method for the operation with parameters provided in the query string, as a legacy endpoint. The `POST` version of the endpoints has been introduced to address issues related to handling large request parameters, where sending the cursor as a query parameter can lead to oversized requests and subsequent failures.
## Examples
Example value for the `filter` parameter:
```json
{
"and": [
{
"not": {
"property": "gtags",
"operator": "CONTAINS",
"value": "tagA"
}
},
{
"or": [
{
"property": "name",
"operator": "LIKE",
"value": "test.%"
},
{
"property": "name",
"operator": "=",
"value": "dataset.sgy"
}
]
}
]
}
```
This is equivalent to the following pseudo-SQL statement:
```sql
SELECT * FROM datasets d WHERE
NOT (EXISTS (SELECT VALUE 1 FROM t IN d.data.gtags WHERE t = 'tagA')
OR (IS_STRING(d.data.gtags) AND STRINGEQUALS(d.data.gtags, 'tagA')))
AND (
d.name LIKE 'test.%'
OR d.name = 'dataset.sgy'
)
```
## Details
The `filter` parameter can be one of the following (an evaluation sketch follows this list):
* A **property match filter**:
```json
{
"property": "...",
"operator": "...",
"value": "..."
}
```
The implementation will be extensible with additional keys if needed in the future, e.g. to specify case sensitivity.
* An **`and` or `or` filter**, i.e. an object containing only the key `and` or `or`, of which the value is an array of one or more filters (i.e. a property match filter or an `and`, `or` or `not` filter)
```json
{
"and": [...]
}
```
* A **`not` filter**, i.e. an object containing only the key `not`, of which the value is a filter (i.e. a property match filter or an `and`, `or` or `not` filter)
```json
{
"not": ...
}
```
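To make the recursive structure concrete, here is a hedged, illustrative sketch of evaluating such a filter against a single in-memory record; it is not the proposed server-side implementation, which would translate the filter into a database query as in the pseudo-SQL above:

```python
# Illustrative sketch only: evaluates the proposed filter structure against one
# in-memory record; a real implementation would build a database query instead.
import fnmatch

def matches(flt, record):
    if "and" in flt:
        return all(matches(f, record) for f in flt["and"])
    if "or" in flt:
        return any(matches(f, record) for f in flt["or"])
    if "not" in flt:
        return not matches(flt["not"], record)

    actual = record.get(flt["property"])
    op, value = flt["operator"], flt["value"]
    if op == "=":
        return actual == value
    if op == "CONTAINS":
        return value in (actual or [])
    if op == "LIKE":  # '%' treated as a wildcard, as in SQL LIKE
        return isinstance(actual, str) and fnmatch.fnmatch(actual, value.replace("%", "*"))
    raise ValueError(f"unsupported operator: {op}")

# The example filter above matches this record:
# matches(example_filter, {"name": "test.sgy", "gtags": ["tagB"]}) -> True
```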
# Out of scope / limitations
The operations at `GET /utility/ls` and `POST /utility/ls` can also be used for retrieving datasets, but will not be extended with advanced filtering at the moment. That functionality can be added later if required.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/220
Add CRS database lookup to SEGYImport tool (Morten Ofstad, 2023-11-16)

It would be so much easier if it were possible to specify the CRS by using the EPSG ID or UTM zone to look up the CRSWkt automatically, e.g. like this: --crs EPSG:23031 or --crs UTM-31N to get the WKT
```
PROJCS["ED50 / UTM zone 31N",GEOGCS["ED50",DATUM["European_Datum_1950",SPHEROID["International 1924",6378388,297],TOWGS84[-87,-98,-121,0,0,0,0]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4230"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",3],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","23031"]]
```
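For reference, such an EPSG-to-WKT lookup is readily available in CRS libraries like pyproj, which could serve as a reference point for the feature; a minimal sketch, assuming pyproj is installed (SEGYImport does not do this today):

```python
# Minimal sketch of an EPSG-to-WKT lookup with pyproj (an illustration of the
# requested behaviour, not something SEGYImport currently does).
from pyproj import CRS

wkt = CRS.from_epsg(23031).to_wkt()  # ED50 / UTM zone 31N
print(wkt)
```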
I looked up this at [https://epsg.io/23031](https://epsg.io/23031)

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/209
Adding CRS to a VDS generated by Openvds+ (Juliana Fernandes, 2023-11-16)

Hello,
I was taking a look at the documentation in order to add a CRS to the VDS I'm generating with Openvds+.
In the documentation I saw the option "--crs-wkt <string>". The WKT is Well-Known Text and seems to be a geographical coordinate system. Is there a way to add a UTM coordinate system to the data?
Regards,
Juliana

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/218
OpenVDS cuts one char from folder name in GC path (Dzmitry Malkevich (EPAM), 2023-11-15)

We've found an issue with OpenVDS 3.2.7 (and possibly with all 3.2.*) and 3.3.1 versions in GC: the first character of the folder name is lost in the path to the SEGY file.
We have Seismic dataset:
```json
{
"sbit_count": 0,
"last_modified_date": "Wed Sep 27 2023 18:04:33 GMT+0000 (Coordinated Universal Time)",
"created_by": "109239448567816450362",
"sbit": null,
"subproject": "fgx",
"path": "/",
"gcsurl": "osdu-data-prod-m19-ss-seismic/5a3a7d7b-3a94-4d7a-9bcb-68c133d19e77/96bd9293-3358-4716-8a04-23806f63053e",
"readonly": false,
"filemetadata": {
"md5Checksum": null,
"nobjects": 1,
"size": 277427976,
"type": "GENERIC",
"tier_class": null
},
"name": "ST0202R08_PS_PSDM_RAW_PP_TIME.MIG_RAW.POST_STACK.3D.JS-017534.segy",
"ctag": "l7nnwm4Onkcjg79Dm19;m19",
"created_date": "Wed Sep 27 2023 18:04:03 GMT+0000 (Coordinated Universal Time)",
"ltag": "m19-seismic-DDMS-Legal-Tag-PRFC",
"tenant": "m19",
"access_policy": "uniform"
}
```
and Seismic path is `sd://m19/fgx/ST0202R08_PS_PSDM_RAW_PP_TIME.MIG_RAW.POST_STACK.3D.JS-017534.segy`.
When we run the SEGY-to-VDS conversion, OpenVDS tries to download this file from `https://storage.googleapis.com/osdu-data-prod-m19-ss-seismic/a3a7d7b-3a94-4d7a-9bcb-68c133d19e77/96bd9293-3358-4716-8a04-23806f63053e/0` and fails, as the URL is not correct: the first character of the folder name is missing. The correct path should be `https://storage.googleapis.com/osdu-data-prod-m19-ss-seismic/5a3a7d7b-3a94-4d7a-9bcb-68c133d19e77/96bd9293-3358-4716-8a04-23806f63053e/0`.
As a result, the conversion fails:
```text
[2023-10-27, 09:36:52 UTC] {pod_launcher.py:198} INFO - Event: segy-vds-conversion.1ff9faad21db4204a03ac62025f93454 had an event of type Running
[2023-10-27, 09:36:52 UTC] {pod_launcher.py:149} INFO - [Could not open input file] sd://m19/fgx/ST0202R08_PS_PSDM_RAW_PP_TIME.MIG_RAW.POST_STACK.3D.JS-017534.segy: Http error response: 403 -> https://storage.googleapis.com/osdu-data-prod-m19-ss-seismic/a3a7d7b-3a94-4d7a-9bcb-68c133d19e77/96bd9293-3358-4716-8a04-23806f63053e/0
[2023-10-27, 09:36:53 UTC] {pod_launcher.py:198} INFO - Event: segy-vds-conversion.1ff9faad21db4204a03ac62025f93454 had an event of type Running
```
Conversion works with the image community.opengroup.org:5555/osdu/platform/domain-data-mgmt-services/seismic/open-vds/openvds-ingestion:latest, which seems to be version 3.1.41.
Please check and advise as this affects M21 Pre-shipping testing.
cc: @Yan_Sushchynski, @Yauhen_Shaliou

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/210
Error uploading VDS into SD Path using OpenVDS+ (Juliana Fernandes, 2023-11-14)

Hello,
I'm trying to upload a local VDS into an SD path in AWS M20 pre-shipping and I get an SDMS error (wrong location).
Some values were provided by e-mail, so I will not paste them here, but if you need them to test, just let me know.
The command I'm using is:
```
VDSCopy.exe -d "SdAuthorityUrl=https://prsh.testing.preshiptesting.osdu.aws/api/seismic-store/v3;SdApiKey=ABC;AuthTokenUrl={{received_by_email}};client_id={{received_by_email}};client_secret={{received_by_email}};grant_type=refresh_token;refresh_token={{generated_in_the_login}};LegalTag=osdu-public-usa-dataset-1;scopes=openid email;Region=us-east-2" E:\Juliana\osdu\osdu_test\ST0202R08_PS_PSDM_RAW_PP_TIME_MIG_RAW_POST_STACK_3D_JS_017534_tol1_JFA.vds sd://osdu/vdstestsjfa/ST0202R08_PS_PSDM_RAW_PP_TIME_MIG_RAW_POST_STACK_3D_JS_017534_tol1_JFA.vds
```
And the error I'm getting is:
```
[Could not create VDS sd://osdu/vdstestsjfa/test/ST0202R08_PS_PSDM_RAW_PP_TIME_MIG_RAW_POST_STACK_3D_JS_017534_tol1_JFA.vds] Error on uploading VolumeDataLayout object: Http error response: 301 -> https://psosdu-shared-seismicddms-20230814174725984500000004.s3.us-east-1.amazonaws.com/3o7c5j88s1ko0oyg/2b9b212b-21b5-4ccf-aaed-67485c113ae4/VolumeDataLayout: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
```
It seems to be accessing the wrong location, since the instance is located in us-east-2.
Regards,
Juliana.

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/121
[SAST] Client_Privacy_Violation in file queue.ts (Yauhen Shaliou [EPAM/GCP], 2023-11-13)

**Description**
Method setup at line 42 of \\seismic-store-service\\app\\sdms\\src\\cloud\\shared\\queue.ts sends user information outside the application. This may constitute a Privacy Violation.
<table>
<tr>
<th> </th>
<th>Source</th>
<th>Destination</th>
</tr>
<tr>
<th>File</th>
<td>seismic-store-service/app/sdms/src/cloud/shared/queue.ts</td>
<td>seismic-store-service/app/sdms/src/cloud/providers/azure/insights.ts</td>
</tr>
<tr>
<th>Line number</th>
<td>42</td>
<td>129</td>
</tr>
<tr>
<th>Object</th>
<td>password</td>
<td>log</td>
</tr>
<tr>
<th>Code line</th>
<td>redisOptions.password = cacheParams.KEY;</td>
<td>console.log(data);</td>
</tr>
</table>

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/120
[SAST] SSL_Verification_Bypass in file cosmosdb.ts (Yauhen Shaliou [EPAM/GCP], 2023-11-13)

# **Location:**
<table>
<tr>
<th> </th>
<th>
</th>
<th>Destination</th>
</tr>
<tr>
<th>File</th>
<td>
</td>
<td>seismic-store-service/app/sdms/src/cloud/providers/azure/cosmosdb.ts</td>
</tr>
<tr>
<th>Line number</th>
<td>
</td>
<td>67</td>
</tr>
<tr>
<th>Object</th>
<td>
</td>
<td>rejectUnauthorized</td>
</tr>
<tr>
<th>Code line</th>
<td>
</td>
<td>rejectUnauthorized: false</td>
</tr>
</table>
**Description**
\\seismic-store-service\\app\\sdms\\src\\cloud\\providers\\azure\\cosmosdb.ts relies on HTTPS requests in its constructor. The rejectUnauthorized parameter, at line 67, effectively disables verification of the SSL certificate trust chain.
JavaScript example of explicitly disabling certificate verification:

    var https = require('https');
    var options = { hostname: 'domain.com', port: 443, path: '/', method: 'GET', rejectUnauthorized: false };
    options.agent = new https.Agent(options);
    var req = https.request(options, function(res) {
      res.on('data', function(d) { handleRequest(d); });
    });
    req.end();

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-vds/-/issues/215
The sd protocol is failing for IBM (Anuj Gupta, 2023-10-27)

The sd protocol is failing for IBM: calling `vds = openvds.open(url, con)` results in a 404 error, and it seems the characters after `/` are getting escaped/skipped.
If path is `ss-dev-seismic-dh2cqj2dwyr3tsz9/f013db48-47f5-430b-a10e-c5f6622712d2 `
the bucket name is: ss-dev-seismic-dh2cqj2dwyr3tsz9
and the subpath/key should be: f013db48-47f5-430b-a10e-c5f6622712d2
whereas the subpath/key actually used is:
`013db48-47f5-430b-a10e-c5f6622712d2` (~~f~~013db48-47f5-430b-a10e-c5f6622712d2)

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/67
Security vulnerability in transient dependency (Async) of Cloudant dependency (Walter D, 2023-10-26)

High severity vulnerability: Prototype Pollution in async - https://github.com/advisories/GHSA-fwr7-v2mv-hh25
Audit report:
async 2.0.0 - 2.6.3
Severity: high
Prototype Pollution in async - https://github.com/advisories/GHSA-fwr7-v2mv-hh25
No fix available
node_modules/@cloudant/cloudant/node_modules/async
@cloudant/cloudant *
Depends on vulnerable versions of async
node_modules/@cloudant/cloudant

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-sdutil/-/issues/29
Add verification that the Seismic's cloud provider matches the one from the config.yaml (Yan Sushchynski (EPAM), 2023-10-26)

Hello,
Recently we introduced a new implementation of Seismic for Google Cloud. It mostly follows the same workflow as the previous Google implementation, but there are some crucial differences, and we got unexpected results. As far as I remember, the service's responses contain information about cloud providers.
What if some extra checks are added?
Thanks

https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/issues/119
Rename "IStorage" methods for v4 (Yan Sushchynski (EPAM), 2023-10-24)

Hello,
I noticed that the cloud-storage [interface](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/blob/master/app/sdms-v4/src/cloud/storage.ts?ref_type=heads#L19) has the following methods:
```
createBucket(bucketName: string): Promise<void>;
bucketExists(bucketName: string): Promise<boolean>;
deleteBucket(bucketName: string): Promise<void>;
```
These method names suggest that new buckets are getting created, checked for existence, or deleted within a single data-partition. However, the GC and Baremetal implementations are different -- a data-partition is expected to work with its own pre-created bucket instead of creating new ones. This discrepancy between the method names and their actual functionality could lead to confusion and misunderstanding.
A similar situation exists in the AWS implementation, where comments had to be added to clarify that 'bucketNames' are actually BLOBs, which can be seen [here](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/seismic-dms-suite/seismic-store-service/-/blob/master/app/sdms-v4/src/cloud/providers/aws/storage.ts?ref_type=heads#L45).
I propose that we consider renaming these methods to more accurately reflect their functionality and create a better alignment with the actual implementation.
Thank you.