Issue created Sep 29, 2020 by Stephen Whitley (Invited Expert) @stephenwhitley. 1 of 6 checklist items completed.

URI as identifiers for the type and ‘internal’ location of a data container

Status

  • Initiated
  • Proposed
  • Trialing
  • Under review
  • Approved
  • Retired

Context & Scope

In many cases, relatively small files are ingested into OSDU. In a Work Product, a work product component (WPC) typically points to a single file.

In larger file contexts, such as VDS data, multiple WPCs may point to the same VDS container, or, for convenience, a service may exist that extracts a subset of the data in the container. In that case, a pointer to a complete file does not work (this will be even more apparent in the RESQML project).

There is a desire to point to ‘locations’ within the given storage format rather than to a complete file. Something like a URI can be used for this.

In streaming data platforms this is quite common, using a construct like: spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40

  • spotify is the scheme name to which a well-known service is associated
  • track is the type of object you’re looking for
  • 0LTZD4vTsp0EN1wXatc9IR is the identifier of the track
  • #2:40 is the location in the track
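
As a minimal sketch of how such a construct breaks down, the snippet below splits the example into its scheme, object type, identifier and time offset. The function name `parse_track_uri` and the returned dictionary keys are hypothetical, chosen only for illustration; this is not an actual Spotify or OSDU API.

```python
def parse_track_uri(uri: str) -> dict:
    """Split 'scheme:type:id#m:ss' into parts; the '#m:ss' offset is optional."""
    locator, _, fragment = uri.partition("#")
    scheme, object_type, object_id = locator.split(":", 2)

    offset_seconds = None
    if fragment:
        # Assumption: the fragment is a 'minutes:seconds' position in the track.
        minutes, seconds = fragment.split(":")
        offset_seconds = int(minutes) * 60 + int(seconds)

    return {
        "scheme": scheme,
        "object_type": object_type,
        "object_id": object_id,
        "offset_seconds": offset_seconds,
    }


parsed = parse_track_uri("spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40")
print(parsed["scheme"], parsed["object_type"], parsed["offset_seconds"])
```

The point of the sketch is that the part before `#` identifies the object, while the part after it identifies a location inside that object.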

For VDS something similar could be used: vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10

  • vds is the scheme name to which the vds service is associated
  • c9465e4a498048f6b8a8ea1e9e984255 is the identifier of the container
  • trace is the data-type
  • 8+9+10 is a (made up) construct to retrieve trace number 8, 9 and 10
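
The four parts listed above could be parsed roughly as follows. This is a sketch under stated assumptions: the `ContainerRef` class, the function names, and the `8+9+10` selector syntax are all made up for illustration and do not describe an existing VDS or OSDU interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerRef:
    scheme: str        # e.g. "vds"; selects the service to call
    container_id: str  # identifier of the data container
    data_type: str     # e.g. "trace"
    selector: str      # raw, scheme-specific location string


def parse_container_uri(uri: str) -> ContainerRef:
    """Split 'scheme:container:type:selector' into its four parts."""
    parts = uri.split(":", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected scheme:container:type:selector, got {uri!r}")
    return ContainerRef(*parts)


def trace_numbers(ref: ContainerRef) -> list[int]:
    """Interpret the (made up) '8+9+10' selector as a list of trace numbers."""
    if ref.data_type != "trace":
        raise ValueError(f"not a trace reference: {ref.data_type!r}")
    return [int(number) for number in ref.selector.split("+")]


ref = parse_container_uri("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10")
print(ref.container_id, trace_numbers(ref))
```

Keeping the selector as an opaque string in `ContainerRef` leaves its interpretation to the service that owns the scheme, so other data types could define their own selector syntax.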

File would just be another scheme.

So when SEG-Y is loaded and transcoded into VDS format, the WPCs can point to the SEG-Y file and the artifacts can point to the VDS container.

For this to work, there needs to be a service that understands the schema/codec. Out-of-the-box services for OSDU could, for example, cover VDS or WITSML. Providing the scheme is the trigger for the platform to call the right service: enter a URI like spotify:track:0LTZD4vTsp0EN1wXatc9IR#2:40 on your phone, and the platform (your phone) opens a song at 2:40 (Radiohead warning).
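
The dispatch idea can be sketched as a registry that maps a scheme name to a handler. Everything here is hypothetical: `HANDLERS`, `register`, `resolve`, the handler functions, the returned strings, and the example file name stand in for real services and are not part of OSDU.

```python
from typing import Callable, Dict

# Hypothetical registry: scheme name -> handler that receives the rest of the URI.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def register(scheme: str):
    """Decorator that registers a handler function for one scheme."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[scheme] = func
        return func
    return decorator


@register("vds")
def handle_vds(rest: str) -> str:
    container_id, data_type, selector = rest.split(":", 2)
    return f"VDS service: {data_type} {selector} from container {container_id}"


@register("file")
def handle_file(rest: str) -> str:
    # A plain file is just another scheme: the whole file is the target.
    return f"file service: whole file {rest}"


def resolve(uri: str) -> str:
    """Pick the handler from the scheme and pass it the remainder of the URI."""
    scheme, separator, rest = uri.partition(":")
    if not separator:
        raise ValueError(f"missing scheme in {uri!r}")
    if scheme not in HANDLERS:
        raise LookupError(f"no service registered for scheme {scheme!r}")
    return HANDLERS[scheme](rest)


print(resolve("vds:c9465e4a498048f6b8a8ea1e9e984255:trace:8+9+10"))
print(resolve("file:survey.segy"))
```

With this shape, supporting a new format such as WITSML would mean registering one more handler, without changing `resolve` or the records that hold the URIs.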

Decision

URI as identifiers for the type and ‘internal’ location of a data container

Rationale

More flexible: a reference can point to a location inside a data container rather than only to a complete file.

Consequences

Existing data definitions have to change.

When to revisit

Edited Sep 29, 2020 by Stephen Whitley (Invited Expert)