Commit 82a20db2 authored by Jørgen Lind's avatar Jørgen Lind

Merge branch feature/jorgen.lind/SEGYToolsReadme with refs/heads/master into refs/merge-requests/80/train
parents 5facdea6 4db0d9f3
Pipeline #940 passed with stages
in 8 minutes and 2 seconds
## SEGYExport
A tool to export a volume data store (VDS) to a SEG-Y file.
Usage:
```
SEGYExport [OPTION...] <output file>
```
| Option | Description |
|------------------------------|------------|
| --bucket \<string> | AWS S3 bucket to export from. |
| --region \<string> | AWS region of bucket to export from. |
| --connection-string \<string> | Azure Blob Storage connection string. |
| --container \<string> | Azure Blob Storage container to export from. |
| --parallelism-factor \<value> | Azure parallelism factor. |
| --prefix \<string> | Top-level prefix to prepend to all object-keys. |
| --persistentID \<ID> | A globally unique ID for the VDS, usually an 8-digit hexadecimal number. |
| --output \<arg> | The output SEG-Y file. |
SEGYExport is used to export from a VDS to a SEG-Y file. A VDS that was imported from a SEG-Y file stores
the SEG-Y binary and text headers from the original SEG-Y file (in the VDS metadata) as well as the original trace headers and live trace flags (in separate data channels), and these are used to re-create the original SEG-Y file. The output file will only be identical to the original if the VDS was compressed with a lossless algorithm (or uncompressed) and all traces fit in the array defined by the VDS (i.e. no duplicate traces).
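For readers using the OpenVDS library directly, the stored headers can be inspected through the metadata API. Below is a minimal, hypothetical C++ sketch; the bucket, region, persistent ID, and the "SEGY" metadata category are placeholder assumptions, not SEGYExport's actual implementation:
```
#include <OpenVDS/OpenVDS.h>
#include <OpenVDS/VolumeDataLayout.h>
#include <cstdio>
#include <vector>

int main()
{
  OpenVDS::Error error;

  // Placeholder connection values; substitute your own bucket/region/ID.
  OpenVDS::AWSOpenOptions options;
  options.bucket = "openvds-test";
  options.region = "eu-north-1";
  options.key = "7068247E9CA6EA05";

  OpenVDS::VDSHandle handle = OpenVDS::Open(options, error);
  if (error.code != 0)
    return 1;

  OpenVDS::VolumeDataLayout *layout = OpenVDS::GetLayout(handle);

  // Read the stored SEG-Y text header BLOB; the "SEGY" category is an assumption.
  std::vector<uint8_t> textHeader;
  layout->GetMetadataBLOB("SEGY", "TextHeader", textHeader);
  printf("Text header is %zu bytes\n", textHeader.size());

  OpenVDS::Close(handle);
  return 0;
}
```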
Either a `--container` (for Azure) or a `--bucket` (for AWS) argument, a `--persistentID` argument and an output SEG-Y file must be specified.
Example usage:
```
SEGYExport --bucket openvds-test --persistentID 7068247E9CA6EA05 D:\\Datasets\\Australia\\shakespeare3d_pstm_Time_export.segy
```
NOTE:
If the VDS does not contain SEG-Y headers, SEGYExport cannot currently export to a SEG-Y file; this will be fixed in a later release.
......@@ -54,7 +54,7 @@ main(int argc, char *argv[])
options.add_option("", "", "container", "Azure Blob Storage container to export from .", cxxopts::value<std::string>(container), "<string>");
options.add_option("", "", "parallelism-factor", "Azure parallelism factor.", cxxopts::value<int>(azureParallelismFactor), "<value>");
options.add_option("", "", "prefix", "Top-level prefix to prepend to all object-keys.", cxxopts::value<std::string>(prefix), "<string>");
options.add_option("", "", "persistentID", "persistentID", cxxopts::value<std::string>(persistentID), "<ID>");
options.add_option("", "", "persistentID", "A globally unique ID for the VDS, usually an 8-digit hexadecimal number.", cxxopts::value<std::string>(persistentID), "<ID>");
options.add_option("", "", "output", "", cxxopts::value<std::string>(fileName), "");
options.parse_positional("output");
......
## SEGYImport
A tool to scan and import a SEG-Y file to a volume data store (VDS).
Usage:
```
SEGYImport [OPTION...] <input file>
```
| Option | Description |
|---------|------------|
| -h, --header-format \<file> | A JSON file defining the header format for the input SEG-Y file. The expected format is a dictionary of strings (field names) to pairs (byte position, field width) where field width can be "TwoByte" or "FourByte". Additionally, an "Endianness" key can be specified as "BigEndian" or "LittleEndian". |
| -p, --primary-key \<field> | The name of the trace header field to use as the primary key. (default: Inline) |
| -s, --secondary-key \<field> | The name of the trace header field to use as the secondary key. (default: Crossline) |
| --scale \<value> | If a scale override (floating point) is given, it is used to scale the coordinates in the header instead of determining the scale factor from the coordinate scale trace header field. |
| -l, --little-endian | Force little-endian header fields. |
| --scan | Generate a JSON file containing information about the input SEG-Y file. |
| -i, --file-info \<file> | A JSON file (generated by the --scan option) containing information about the input SEG-Y file. |
| -b, --brick-size \<value> | The brick size for the volume data store. (default: 64) |
| -f, --force | Continue on upload error. |
| --ignore-warnings | Ignore warnings about import parameters. |
| --bucket \<string> | AWS S3 bucket to upload to. |
| --source-bucket \<string> | AWS S3 bucket to download from. |
| --region \<string> | AWS region of bucket to upload to. |
| --connection-string \<string> | Azure Blob Storage connection string. |
| --container \<string> | Azure Blob Storage container to upload to. |
| --parallelism-factor \<value> | Azure parallelism factor. |
| --prefix \<string> | Top-level prefix to prepend to all object-keys. |
| --source-prefix \<string> | Top-level prefix to prepend to all source object-keys. |
| --persistentID \<ID> | A globally unique ID for the VDS, usually an 8-digit hexadecimal number. |
| --uniqueID | Generate a new globally unique ID when scanning the input SEG-Y file. |
To create a valid VDS from a SEG-Y file, SEGYImport needs to scan the file to determine the extents of the dataset (e.g. number of samples, number of crosslines, number of inlines). The scanning process reads a number of fields from the trace headers, most importantly the primary and secondary keys, which are used as the axes of the VDS.
For inline-sorted poststack data, the inline number is the primary key and the crossline number is the secondary key (this is the default setting). If these fields are not in the 'standard' byte locations in the header, you can override the trace header format with the --header-format command line option, which takes a JSON file defining the SEG-Y header fields that deviate from the standard locations. The header field endianness can also be specified in this file. This is an example of such a JSON file:
```
{
"Endianness": "BigEndian",
"InlineNumber": [ 5, "FourByte"],
"CrosslineNumber": [ 9, "FourByte"]
}
```
For other data types (or crossline-sorted poststack data), you can specify which trace header fields the file is sorted on using the `--primary-key` and `--secondary-key` options.
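For example, a crossline-sorted poststack file might be imported by swapping the keys (the bucket name and file name below are placeholders):
```
SEGYImport --primary-key Crossline --secondary-key Inline --bucket openvds-test crossline_sorted.segy
```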
The result of the scanning process is the 'file info', which can optionally be saved to a separate file using the `--scan` option. Such a file can be reused on a later import with the `--file-info` command line option.
If `--scan` is specified, the `--file-info` argument can be used to specify the output file. If no output file is given, the file info is printed to stdout.
Once SEGYImport has generated the file info (or has been supplied with one), it starts generating VDS chunks that are uploaded to the destination VDS using the given connection parameters.
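Put together, a hypothetical two-step workflow that scans first and imports afterwards could look like this (file names are placeholders):
```
SEGYImport --scan --file-info shakespeare3d.fileinfo.json shakespeare3d_pstm_Time.segy
SEGYImport --file-info shakespeare3d.fileinfo.json --bucket openvds-test shakespeare3d_pstm_Time.segy
```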
During the scanning stage, SEGYImport also reads the binary header of the SEG-Y file and extracts a number of fields at predefined positions. These cannot be overridden, since it is not common practice to store them in non-standard locations.
| Name | Offset | Width |
|--------------------------------|--------|-------|
| TracesPerEnsemble | 13 | 2 |
| AuxiliaryTracesPerEnsemble | 15 | 2 |
| SampleInterval | 17 | 2 |
| NumSamples | 21 | 2 |
| DataSampleFormatCode | 25 | 2 |
| EnsembleFold | 27 | 2 |
| TraceSortingCode | 29 | 2 |
| MeasurementSystem | 55 | 2 |
| SEGYFormatRevisionNumber | 301 | 2 |
| FixedLengthTraceFlag | 303 | 2 |
| ExtendedTextualFileHeaderCount | 305 | 2 |
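To make the table concrete: the offsets are 1-based byte positions within the 400-byte binary header, and the fields are big-endian by default. A minimal sketch of reading such a field (not SEGYImport's actual code) might look like this:
```
#include <cstdint>

// Read a two-byte big-endian field from the 400-byte SEG-Y binary header.
// 'offset' is the 1-based byte position from the table above.
static uint16_t ReadBinaryHeaderField(const uint8_t *binaryHeader, int offset)
{
  const uint8_t *p = binaryHeader + (offset - 1); // table offsets are 1-based
  return uint16_t((p[0] << 8) | p[1]);            // big-endian to host order
}

// Example usage:
//   uint16_t sampleInterval = ReadBinaryHeaderField(header, 17);
//   uint16_t numSamples     = ReadBinaryHeaderField(header, 21);
```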
The default trace header fields (that can be overridden with a header format JSON file) are:
| Header Field Name | Aliases | Offset | Width |
|-----------------------|------------------------------------------|--------|-------|
| InlineNumber | Inline | 189 | 4 |
| CrosslineNumber | Crossline | 193 | 4 |
| EnsembleXCoordinate | CDPXCoordinate, CDP-X, Easting | 181 | 4 |
| EnsembleYCoordinate | CDPYCoordinate, CDP-Y, Northing | 185 | 4 |
| SourceXCoordinate | Source-X | 73 | 4 |
| SourceYCoordinate | Source-Y | 77 | 4 |
| GroupXCoordinate | Group-X, ReceiverXCoordinate, Receiver-X | 81 | 4 |
| GroupYCoordinate | Group-Y, ReceiverYCoordinate, Receiver-Y | 85 | 4 |
| CoordinateScale | Scalar | 71 | 2 |
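For instance, if the ensemble coordinates and coordinate scale were stored at non-standard positions, a header-format JSON like the following (byte positions invented purely for illustration) would override the defaults from the table above:
```
{
  "CoordinateScale": [ 95, "TwoByte"],
  "EnsembleXCoordinate": [ 97, "FourByte"],
  "EnsembleYCoordinate": [ 101, "FourByte"]
}
```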
Either a `--container` (for Azure) or a `--bucket` (for AWS) argument and an input SEG-Y file must be specified.
Example usage:
```
SEGYImport --bucket openvds-test --header-format D:\\Datasets\\Australia\\HeaderFormat.json D:\\Datasets\\Australia\\shakespeare3d_pstm_Time.segy
```
......@@ -1079,7 +1079,7 @@ main(int argc, char* argv[])
options.add_option("", "p", "primary-key", "The name of the trace header field to use as the primary key.", cxxopts::value<std::string>(primaryKey)->default_value("Inline"), "<field>");
options.add_option("", "s", "secondary-key", "The name of the trace header field to use as the secondary key.", cxxopts::value<std::string>(secondaryKey)->default_value("Crossline"), "<field>");
options.add_option("", "", "scale", "If a scale override (floating point) is given, it is used to scale the coordinates in the header instead of determining the scale factor from the coordinate scale trace header field.", cxxopts::value<double>(scale), "<value>");
options.add_option("", "l", "little-endian", "Force (non-standard) little-endian trace headers.", cxxopts::value<bool>(littleEndian), "");
options.add_option("", "l", "little-endian", "Force little-endian trace headers.", cxxopts::value<bool>(littleEndian), "");
options.add_option("", "", "scan", "Generate a JSON file containing information about the input SEG-Y file.", cxxopts::value<bool>(scan), "");
options.add_option("", "i", "file-info", "A JSON file (generated by the --scan option) containing information about the input SEG-Y file.", cxxopts::value<std::string>(fileInfoFileName), "<file>");
options.add_option("", "b", "brick-size", "The brick size for the volume data store.", cxxopts::value<int>(brickSize)->default_value("64"), "<value>");
......@@ -1093,8 +1093,8 @@ main(int argc, char* argv[])
options.add_option("", "", "parallelism-factor", "Azure parallelism factor.", cxxopts::value<int>(azureParallelismFactor), "<value>");
options.add_option("", "", "prefix", "Top-level prefix to prepend to all object-keys.", cxxopts::value<std::string>(prefix), "<string>");
options.add_option("", "", "source-prefix", "Top-level prefix to prepend to all source object-keys.", cxxopts::value<std::string>(sourcePrefix), "<string>");
options.add_option("", "", "persistentID", "persistentID", cxxopts::value<std::string>(persistentID), "<ID>");
options.add_option("", "", "uniqueID", "uniqueID", cxxopts::value<bool>(uniqueID), "<ID>");
options.add_option("", "", "persistentID", "A globally unique ID for the VDS, usually an 8-digit hexadecimal number.", cxxopts::value<std::string>(persistentID), "<ID>");
options.add_option("", "", "uniqueID", "Generate a new globally unique ID when scanning the input SEG-Y file.", cxxopts::value<bool>(uniqueID), "");
options.add_option("", "", "input", "", cxxopts::value<std::vector<std::string>>(fileNames), "");
options.parse_positional("input");
......
## VDSInfo
A tool for extracting info from a VDS.
Usage:
```
VDSInfo [OPTION...]
```
| Option | Description |
|-------------------------------|------------|
| --bucket \<string> | AWS S3 bucket to connect to. |
| --region \<string> | AWS region of bucket to connect to. |
| --connection-string \<string> | Azure Blob Storage connection string. |
| --container \<string> | Azure Blob Storage container to connect to. |
| --parallelism-factor \<value> | Azure parallelism factor. |
| --prefix \<string> | Top-level prefix to prepend to all object-keys. |
| --persistentID \<ID> | A globally unique ID for the VDS, usually an 8-digit hexadecimal number. |
| --axis | Print axis descriptors. |
| --channels | Print channel descriptors. |
| --layout | Print layout. |
| --metadatakeys | Print metadata keys. |
| --metadata-name \<string> | Print metadata matching name. |
| --metadata-category \<string> | Print metadata matching category. |
| --metadata-firstblob | Print first blob found. |
| --metadata-autodecode | Autodetect EBCDIC and decode to ASCII for blobs. |
| --metadata-force-width \<arg> | Force output width. |
VDSInfo prints the result of the query as JSON, and tries to give the shortest JSON for the specific query by eliminating redundant JSON parent structures.
Some examples:
......@@ -44,4 +70,3 @@ To force a width for BLOB printing use the `-w` parameter.
```
$ VDSInfo.exe --bucket <some_bucket> --region <a_region> --persistentID <some_vds_id> --metadata-name TextHeader -e -w 80
```
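Another example (placeholders as above), listing the available metadata keys:
```
$ VDSInfo.exe --bucket <some_bucket> --region <a_region> --persistentID <some_vds_id> --metadatakeys
```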
......@@ -198,7 +198,7 @@ int main(int argc, char **argv)
bool volumeDataLayout = false;
bool metaKeys = false;
bool metaDataFirstBlob = false;
bool metadataAutoDecodeEPCIDIC = false;
bool metadataAutoDecodeEBCDIC = false;
int textDecodeWidth = std::numeric_limits<int>::max();
//connection options
......@@ -208,19 +208,19 @@ int main(int argc, char **argv)
options.add_option("", "", "container", "Azure Blob Storage container to connect to.", cxxopts::value<std::string>(container), "<string>");
options.add_option("", "", "parallelism-factor", "Azure parallelism factor.", cxxopts::value<int>(azureParallelismFactor), "<value>");
options.add_option("", "", "prefix", "Top-level prefix to prepend to all object-keys.", cxxopts::value<std::string>(prefix), "<string>");
options.add_option("", "", "persistentID", "persistentID", cxxopts::value<std::string>(persistentID), "<ID>");
options.add_option("", "", "persistentID", "A globally unique ID for the VDS, usually an 8-digit hexadecimal number.", cxxopts::value<std::string>(persistentID), "<ID>");
///action
options.add_option("", "", "axis", "Print axis descriptors", cxxopts::value<bool>(axisDescriptors), "");
options.add_option("", "", "channels", "Print channel descriptors", cxxopts::value<bool>(channelDescriptors), "");
options.add_option("", "", "layout", "Print layout", cxxopts::value<bool>(volumeDataLayout), "");
options.add_option("", "", "axis", "Print axis descriptors.", cxxopts::value<bool>(axisDescriptors), "");
options.add_option("", "", "channels", "Print channel descriptors.", cxxopts::value<bool>(channelDescriptors), "");
options.add_option("", "", "layout", "Print layout.", cxxopts::value<bool>(volumeDataLayout), "");
options.add_option("", "", "metadatakeys", "Print metadata keys", cxxopts::value<bool>(metaKeys), "");
options.add_option("", "", "metadata-name", "Print metadata matching name", cxxopts::value<std::string>(metadataPrintName), "<string>");
options.add_option("", "", "metadata-category", "Print metadata matching category", cxxopts::value<std::string>(metadataPrintCategory), "<string>");
options.add_option("", "b", "metadata-firstblob", "Print first blob found", cxxopts::value<bool>(metaDataFirstBlob), "");
options.add_option("", "e", "metadata-autodecode", "Autodetect EPCIDIC and decode to ASCII for blobs", cxxopts::value<bool>(metadataAutoDecodeEPCIDIC), "");
options.add_option("", "w", "metadata-force-width", "Force output width", cxxopts::value<int>(textDecodeWidth), "");
options.add_option("", "", "metadatakeys", "Print metadata keys.", cxxopts::value<bool>(metaKeys), "");
options.add_option("", "", "metadata-name", "Print metadata matching name.", cxxopts::value<std::string>(metadataPrintName), "<string>");
options.add_option("", "", "metadata-category", "Print metadata matching category.", cxxopts::value<std::string>(metadataPrintCategory), "<string>");
options.add_option("", "b", "metadata-firstblob", "Print first blob found.", cxxopts::value<bool>(metaDataFirstBlob), "");
options.add_option("", "e", "metadata-autodecode", "Autodetect EBCDIC and decode to ASCII for blobs.", cxxopts::value<bool>(metadataAutoDecodeEBCDIC), "");
options.add_option("", "w", "metadata-force-width", "Force output width.", cxxopts::value<int>(textDecodeWidth), "");
if(argc == 1)
{
......@@ -341,12 +341,12 @@ int main(int argc, char **argv)
auto &key = to_print_blobs.front();
std::vector<uint8_t> vector;
layout->GetMetadataBLOB(key.category, key.name, vector);
bool decodeEPCIDIC = false;
if (metadataAutoDecodeEPCIDIC)
bool decodeEBCDIC = false;
if (metadataAutoDecodeEBCDIC)
{
decodeEPCIDIC = autodetectDecode(vector);
decodeEBCDIC = autodetectDecode(vector);
}
if (decodeEPCIDIC)
if (decodeEBCDIC)
{
decodedEbcdic(vector);
}
......