Updating information stored in a VDS (Are VDS datasets mutable?)
I was playing around with OpenVDS to figure out whether, and to what extent, VDS datasets are mutable. My main question is: which parts of a VDS dataset are mutable? If any parts are mutable, what is the correct way to change them?
Either answer has implications for some of my workflows. This concerns metadata as well as channel data stored within the VDS.
- If a VDS dataset is always immutable, I can be sure that nobody will accidentally change/break a VDS dataset.
- If a VDS dataset is mutable, I could update some fields if, e.g., SEGYImport does not accept certain names/units during ingestion, and/or update data within the VDS (e.g. add a fast slice or an additional channel, or update data within a channel) without recreating the VDS dataset from scratch.
I may be doing some "stupid" things here, but I am also trying to model a worst-case scenario: what is the worst thing somebody could do wrong?
I found an older issue which mentions the addition of LOD levels to a VDS, but it does not specify whether this would happen in-place or would create a new VDS dataset.
Observations / Experiments
- I am able to add additional channel data to an existing channel of a VDS dataset. The additional data seems to hide the initially available channel data.
For testing I created a small Python script: create_and_change_vds_inplace_small.py. The script creates a small artificial VDS dataset with one channel and carries out the following steps:
- Create a VDS dataset with all values in the channel set to 1.
- Close the file handle so that the dataset is written to disk.
- Open the file and extract a slice, copy the slice data and close the file.
- Open the file and get an AccessManager, write the value 2*old_value (2 in this case) to the VDS dataset and close the file.
- Open the file and extract a slice, copy the slice data and close the file.
- Plot the slice data.
When I run the script, the data extracted from the VDS does indeed change. The first time I extract a slice I get constant 1 values, and the second time I get constant 2 values. I am not sure whether one can still access the "old" data. The file size increases, so it appears to me that the old data is still stored in the VDS dataset.
Is this behavior intended? If so, how can I access the "old" data? Would it be possible to actually update the data without increasing the file size?
- I tried to update the metadata. For that I wrote a small C++ program, as C++ seemed to have (more?) direct access to the MetadataWriteAccess object than Python. The code basically opens a specified VDS dataset and replaces the ImportTimeStamp values with some unrealistic time stamp.
The code executes, but ends with a segmentation fault (Segmentation fault: 11). From debugging I concluded that the segmentation fault occurs when the VDS dataset is being closed. Calling the SetMetadataString function seems to be fine.
Is this behavior intended? I guess the segmentation fault should never happen, but it is not clear to me whether it is an error in updating the VDS dataset or a side effect of illegally writing the metadata.
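To make the suspicion from the first experiment concrete, here is a minimal sketch of an append-only store in which rewriting a chunk hides the old values while the storage keeps growing. This is purely a toy model under the assumption that writes are appended rather than done in place; it does not use the OpenVDS API and says nothing definitive about the real VDS file layout. The class AppendOnlyChunkStore and all of its methods are hypothetical names made up for this illustration.

```python
import io
import struct


class AppendOnlyChunkStore:
    """Toy append-only store. NOT the OpenVDS format, only an illustration."""

    def __init__(self):
        self._blob = io.BytesIO()  # stands in for the file on disk
        self._index = {}           # chunk id -> (offset, length) of the latest version
        self._history = {}         # chunk id -> every (offset, length) ever written

    def write_chunk(self, chunk_id, values):
        """Append the values as 32-bit floats and point the index at them."""
        payload = struct.pack(f"<{len(values)}f", *values)
        offset = self._blob.seek(0, io.SEEK_END)  # always write at the end
        self._blob.write(payload)
        entry = (offset, len(payload))
        self._index[chunk_id] = entry
        self._history.setdefault(chunk_id, []).append(entry)

    def _read(self, entry):
        offset, length = entry
        self._blob.seek(offset)
        raw = self._blob.read(length)
        return list(struct.unpack(f"<{length // 4}f", raw))

    def read_latest(self, chunk_id):
        """What a normal reader sees: only the most recent version."""
        return self._read(self._index[chunk_id])

    def read_version(self, chunk_id, version):
        """Old versions are reachable only if the store keeps a history."""
        return self._read(self._history[chunk_id][version])

    def size_in_bytes(self):
        return self._blob.seek(0, io.SEEK_END)


# Mirror the steps of the experiment: write 1s, read, write 2*old_value, read.
store = AppendOnlyChunkStore()
store.write_chunk(0, [1.0] * 4)
size_before = store.size_in_bytes()
first = store.read_latest(0)

store.write_chunk(0, [2.0 * v for v in first])
size_after = store.size_in_bytes()
second = store.read_latest(0)

print(first, second, size_before, size_after)
```

In this model the second read returns only the new values, the storage doubles in size, and the old values are still physically present but unreachable unless the store exposes something like read_version. Whether OpenVDS behaves like this, and whether it offers any way to reach or reclaim the old data, is exactly the open question.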
Platform
- Apple Arm M1 Max
- macOS 13.1
- OpenVDS 3.0.3 (compiled from source)