Allow VDSCopy to overwrite existing SDMS dataset: `--allow-overwrite`
Hello,
first of all, thank you very much for the great 3.1 release and for the re-implementation of IOManagers for SDMS; it's been very helpful, as sdapi was troublesome.
While examining the new IOManagers with OPENVDS_DMS_CURL=1, I have noticed a change in behavior that prevents us from writing bulk trace data (VDS content) into a previously created dataset (carrying the metadata we require) that doesn't yet have any data loaded:
OPENVDS_DMS_CURL=1 AWS_REGION=us-east-1 VDSCopy ./data/syntethic_data.wavelet.vds sd://osdu/<test-project>/demo-1
[CURL http respons error 409. Automatic rety https://<REDACTED>/api/seismic-store/v3/dataset/tenant/osdu/subproject/<REDACTED>/dataset/demo-1?path=%2F]
...
[Could not create VDS sd://osdu/<REDACTED>/gsi-demo-1] Seismic dms lock failed: Http error respons: 409 -> https://<REDACTED>/api/seismic-store/v3/dataset/tenant/osdu/subproject/<REDACTED>/dataset/demo-1?path=%2F
- [seismic-store-service] The dataset sd://osdu/<REDACTED>/demo-1 already exists[seismic-store-service]
When run through the old DMS flow (using sdapi), it happily ignores the 409 and proceeds to write the data. The command is included below for reference, but it results in a seg-fault at the end (reported here: #123 (closed)):
AWS_REGION=us-east-1 VDSCopy ./data/syntethic_data.wavelet.vds sd://osdu/<test-project>/demo-1
Would it be possible to add an --allow-overwrite flag, similar to the one available in VDSUploader.sh in the HueSpace SDK, that ignores the 409 when the dataset was created previously?
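To make the requested semantics concrete, here is a minimal sketch of the intended flag behavior. This is not the OpenVDS implementation; HttpError, ensure_dataset, and create_conflict are illustrative names. The idea is simply that dataset creation treats an HTTP 409 ("dataset already exists") as "reuse the existing dataset" when --allow-overwrite is set, and as a fatal error otherwise (today's behavior).

```python
class HttpError(Exception):
    """Illustrative stand-in for an HTTP error raised by the DMS client."""
    def __init__(self, status):
        super().__init__(f"HTTP error {status}")
        self.status = status

def ensure_dataset(create, allow_overwrite):
    """Try to create the SDMS dataset via create().

    With allow_overwrite=True, a 409 (dataset already exists) is
    tolerated and the existing dataset is reused; any other error,
    or a 409 without the flag, is propagated as it is today.
    """
    try:
        create()
        return "created"
    except HttpError as e:
        if e.status == 409 and allow_overwrite:
            return "reused"  # proceed to write bulk data into the existing dataset
        raise

# Example: a create() that always reports the dataset already exists,
# mimicking the 409 seen in the logs above.
def create_conflict():
    raise HttpError(409)

print(ensure_dataset(create_conflict, allow_overwrite=True))  # prints: reused
```

Without the flag, the same call would re-raise the 409 and abort, matching the current "Seismic dms lock failed" behavior.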
Rationale: we want control over how the dataset is created in SDMS for data-lineage reasons, which requires us to create it ourselves and then populate the sd:// location with VDS content.
Regards, Filip
PS. Is there a way to submit a feature request for VDSUploader.sh as well? If so, what's the best channel? We've been exploring both tools for loading bulk data into OSDU, and we're seeing some gaps, e.g. in S3 auth (only the profile/dedicated-role auth flow is available) and no control over the dataset name when loading to SDMS; it generates a random name, which prevents targeting a previously created dataset.