URI as identifiers for the type and ‘internal’ location of a data container

Status

  • Initiated
  • Proposed
  • Trialing
  • Under review
  • Approved
  • Retired

Context & Scope

In many cases, the files ingested into OSDU are relatively small. Within a Work Product, a Work Product Component (WPC) typically points to a single file.

In larger file contexts, such as VDS data, multiple WPCs may point to the same VDS container, or, for convenience, a service may exist that extracts a subset of the data in the container. In those cases a pointer to a complete file does not work (in the RESQML project this will be even more apparent).

The goal is to point to ‘locations’ within a given storage format rather than to a complete file. Something like a URI can be used for this.

This is quite common in streaming data platforms, which use a construct like: spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40

  • spotify is the scheme name with which a well-known service is associated
  • track is the type of object you’re looking for
  • #2:40 is the location in the track

For VDS something similar could be used: vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10

  • vds is the scheme name with which the VDS service is associated
  • c9465e4a498048f6b8a8ea1e9e984255 is the identifier of the container
  • trace is the data-type
  • 8+9+10 is a (made-up) construct to retrieve trace numbers 8, 9 and 10
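
The four parts listed above can be split apart mechanically. The following is a minimal sketch, assuming the colon-separated layout and the made-up `+` selector from the example; `ContainerLocation`, `parse_location` and `trace_numbers` are hypothetical names used for illustration, not part of any OSDU or VDS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContainerLocation:
    scheme: str        # e.g. "vds"; selects the service (assumed layout)
    container_id: str  # identifier of the data container
    data_type: str     # e.g. "trace"
    selector: str      # raw location inside the container, e.g. "8+9+10"


def parse_location(uri: str) -> ContainerLocation:
    """Split 'scheme:container:type:selector' into its four parts."""
    parts = uri.split(":", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected scheme:container:type:selector, got {uri!r}")
    return ContainerLocation(*parts)


def trace_numbers(selector: str) -> List[int]:
    """Interpret the made-up '8+9+10' construct as a list of trace numbers."""
    return [int(n) for n in selector.split("+")]


loc = parse_location("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10")
print(loc.scheme, loc.container_id, loc.data_type, trace_numbers(loc.selector))
```

Keeping the selector as an opaque string until a scheme-specific function interprets it means each scheme can define its own location syntax without changing the generic parser.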

File would just be another scheme.

So when SEG-Y is loaded and transcoded into VDS format, the WPCs can point to the SEG-Y file and the artifacts can point to the VDS container.

For this to work, there needs to be a service that understands the schema/codec. Out-of-the-box services for OSDU could, for example, cover VDS or WITSML; providing the scheme is the trigger for the platform to call the right service. For example, enter a URI like spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40 on your phone, and the platform (your phone) opens a song at 2:40 (Radiohead warning).
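
The idea that the scheme triggers the right service can be sketched as a small registry keyed by scheme name. This is only an illustration under stated assumptions: `HANDLERS`, `register`, `resolve` and the two handler functions are hypothetical, and the handlers return a description string instead of calling a real service.

```python
from typing import Callable, Dict

# Hypothetical registry: scheme name -> function that handles the rest of the URI.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def register(scheme: str):
    """Decorator that associates a handler with a scheme name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[scheme] = fn
        return fn
    return decorator


@register("vds")
def open_vds(rest: str) -> str:
    # Stand-in for a call to a VDS service.
    return f"VDS service handles {rest}"


@register("file")
def open_file(rest: str) -> str:
    # Stand-in for plain file access; file is just another scheme.
    return f"file service handles {rest}"


def resolve(uri: str) -> str:
    """Pick the handler from the scheme (the text before the first colon)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in HANDLERS:
        raise LookupError(f"no service registered for {uri!r}")
    return HANDLERS[scheme](rest)


print(resolve("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10"))
```

Supporting another format, such as WITSML, would then amount to registering one more handler, without touching the records that hold the URIs.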

Decision

Use URIs as identifiers for the type and ‘internal’ location of a data container.

Rationale

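More flexible than a pointer to a complete file: the same kind of identifier can address a whole container or a location inside it.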

Consequences

Existing data definitions will have to change.

When to revisit

Edited by Stephen Whitley (Invited Expert)