URI as identifiers for the type and ‘internal’ location of a data container
Status
- Initiated
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
In many cases, relatively small files are ingested into OSDU. In a Work Product, a work product component (WPC) typically points to a single file.
With larger files, such as VDS data, multiple WPCs may point to the same VDS container, or a service may exist that, for convenience, extracts a subset of the data in the container. In those cases a pointer to a complete file is not sufficient (in the RESQML project this will be even more apparent).
There is therefore a desire to point to ‘locations’ within the given storage format rather than to a complete file. Something like a URI can be used for this.
This is quite common in streaming data platforms, which use a construct like: spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40
- spotify is the scheme name, with which a well-known service is associated
- track is the type of object you’re looking for
- #2:40 is the location in the track
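As a small illustration of the breakdown above, the following sketch splits the example URI with Python's standard `urllib.parse.urlsplit`. Treating the part after `#` as the location (fragment) follows generic URI syntax; the variable names `object_type` and `object_id` are chosen here for illustration and are not part of any specification.

```python
from urllib.parse import urlsplit

# The example URI from the text above.
parts = urlsplit("spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40")

# The path of this URI has the form "<object type>:<object id>".
object_type, _, object_id = parts.path.partition(":")

print(parts.scheme)    # scheme name that selects the service
print(object_type)     # kind of object being addressed
print(object_id)       # identifier of that object
print(parts.fragment)  # location inside the object
```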
For VDS something similar could be used: vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10
- vds is the scheme name to which the vds service is associated
- c9465e4a498048f6b8a8ea1e9e984255 is the identifier of the container
- trace is the data-type
- 8+9+10 is a (made up) construct to retrieve trace numbers 8, 9 and 10
file would simply be another scheme.
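A minimal sketch of how such an identifier could be taken apart, assuming the four-part `scheme:container:type:selector` layout from the example above. The names `ContainerUri`, `parse_container_uri` and `trace_numbers` are hypothetical, and the `+`-separated selector is the made-up construct from the text, not an agreed syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerUri:
    scheme: str        # e.g. "vds"; selects the service
    container_id: str  # identifier of the data container
    data_type: str     # e.g. "trace"
    selector: str      # location inside the container, e.g. "8+9+10"


def parse_container_uri(uri: str) -> ContainerUri:
    """Split 'scheme:container:type:selector' into its four parts."""
    parts = uri.split(":", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected scheme:container:type:selector, got {uri!r}")
    return ContainerUri(*parts)


def trace_numbers(selector: str) -> list[int]:
    """Interpret the made-up '8+9+10' selector as a list of trace numbers."""
    return [int(n) for n in selector.split("+")]


uri = parse_container_uri("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10")
print(uri.scheme, uri.container_id, uri.data_type, trace_numbers(uri.selector))
```

Keeping the selector as an opaque string in the parsed result leaves its interpretation to the service behind the scheme, so each scheme can define its own way of addressing locations.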
So when SEG-Y is loaded and transcoded into VDS format, the WPCs can point to the SEG-Y file and the artifacts can point to the VDS container.
For this to work, there needs to be a service that understands the scheme and its codec. Out-of-the-box services for OSDU can, for example, cover VDS or WITSML: providing the scheme is the trigger for the platform to call the right service. For example, enter a URI like spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40 on your phone, and the platform (your phone) opens a song at 2:40 (Radiohead warning).
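The scheme-triggered dispatch described here could look roughly like the sketch below. Everything in it is an assumption for illustration: the `HANDLERS` registry, the `register` decorator, the `resolve` function and the two handlers are hypothetical and do not correspond to existing OSDU services or APIs.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a scheme name to the service that understands it.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def register(scheme: str):
    """Register a handler function for the given scheme name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[scheme] = fn
        return fn
    return decorator


@register("vds")
def handle_vds(rest: str) -> str:
    # A real service would open the container and extract the data.
    return f"VDS service would resolve {rest}"


@register("file")
def handle_file(rest: str) -> str:
    return f"File service would resolve {rest}"


def resolve(uri: str) -> str:
    """Pick the handler from the scheme and pass it the rest of the URI."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in HANDLERS:
        raise LookupError(f"no service registered for scheme {scheme!r}")
    return HANDLERS[scheme](rest)


print(resolve("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10"))
```

In this sketch, supporting a new format such as WITSML would only require registering another handler; the calling code does not change.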
Decision
Use URIs as identifiers for the type and ‘internal’ location of a data container.
Rationale
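More flexible: a URI can point to a location inside a data container rather than only to a complete file, and its scheme tells the platform which service to call.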
Consequences
Existing data definitions have to be changed.