Compliance Service

## Introduction

This document covers how to remain compliant at the different stages of the data lifecycle inside the Data Ecosystem:

  1. When ingesting data
  2. Whilst the data is inside the Data Ecosystem
  3. When consuming data

Client interaction revolves around ingestion and consumption, so these are the stages where you need this guide. Stage 2 is mostly handled on the client's behalf; however, it is still important to understand that it is happening, because it affects when and how data can be consumed.

Data compliance is largely governed through the Records in the storage service. Although there is an independent legal service and a LegalTag entity, these do not enforce compliance by themselves.

Records have a Legal section in their schema and this is where the compliance is enforced. However, clients must still make sure they are using the Record service correctly to remain compliant.

Further details can be found in the Creating a Record section.

## API usage

Details of our APIs, including how to create and retrieve LegalTags, can be found in our Portal documentation.

You currently need the users.datalake.viewers role to access the LegalTag API. To create a LegalTag you need at least the users.datalake.editors role, and to update a LegalTag you need the users.datalake.admins role.

The Data Ecosystem stores data in separate data partitions, according to the access granted to those data partitions in the OSDU system.

A user may have access to many data partitions in OSDU. For example, an OSDU user may have access to both the OSDU data partition and a customer's data partition. When a user logs into the OSDU portal, they choose which data partition they currently want to be active.

When using the LegalTag APIs, you need to specify which data partition the user currently has active and send it in the OSDU-data-partition-id header.

OSDU-data-partition-id

The correct values can be obtained from CFS services.

We use this value to work out which data partition to use. There is also a special data partition known as common:

OSDU-data-partition-id: common

This has all public data in the Data Ecosystem. Users always have access to this as well as their current active data partition.

Currently you can only specify one data partition id at a time when using the Legal APIs. To retrieve all LegalTags from both the user's data partition and the common data partition, make two separate requests, changing the header value in each.
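The two-request pattern can be sketched in Python as follows. This is a minimal illustration rather than a reference client: the host in `BASE_URL` is a placeholder, and the use of `GET` on the `legaltags` path for listing is an assumption that should be checked against the Portal documentation. The requests are only built here, not sent.

```python
import urllib.request

# Assumption: placeholder host; substitute the base URL of your deployment.
BASE_URL = "https://legal.example.invalid/api/legal/v1"


def build_list_request(partition_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a listing request for a single data partition."""
    return urllib.request.Request(
        url=f"{BASE_URL}/legaltags",  # assumed listing path
        method="GET",
        headers={
            "accept": "application/json",
            "authorization": f"Bearer {token}",
            # Only one partition id can be sent per request.
            "OSDU-data-partition-id": partition_id,
        },
    )


def build_requests_for_user(active_partition: str, token: str) -> list:
    """One request per partition: the user's active partition, then common."""
    partitions = [active_partition]
    if active_partition != "common":
        partitions.append("common")
    return [build_list_request(p, token) for p in partitions]
```

Each returned request could then be sent with `urllib.request.urlopen` and the two result sets merged on the client side.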

You can also send a correlation id as a header so that a single request can be tracked through all the services it passes through. This can be a GUID sent under the OSDU-Correlation-Id header key:

OSDU-Correlation-Id: 1e0fef08-22fd-49b1-a5cc-dffa21bc0b70

If you are the service initiating the request, you should generate the id. Otherwise, you should just forward it on in the request.
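The generate-or-forward rule can be sketched as a small helper. The function name `with_correlation_id` is hypothetical, and the sketch assumes incoming headers are available as a plain dictionary.

```python
import uuid

CORRELATION_HEADER = "OSDU-Correlation-Id"


def with_correlation_id(incoming_headers: dict) -> dict:
    """Return a header dict carrying a correlation id.

    Forwards the caller's id when one is present; otherwise generates a GUID.
    """
    # HTTP header names are case-insensitive, so compare lowercased keys.
    for key, value in incoming_headers.items():
        if key.lower() == CORRELATION_HEADER.lower() and value:
            return {CORRELATION_HEADER: value}
    # No id was supplied, so this service is the initiator.
    return {CORRELATION_HEADER: str(uuid.uuid4())}
```

The returned dictionary can be merged into the headers of any outgoing request.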

## What is a LegalTag?

A LegalTag is the entity that represents the legal status of data in the Data Ecosystem. It is a collection of properties that governs how the data can be consumed and ingested.

A LegalTag is required for data ingestion. Therefore, creating a LegalTag is a necessary first step if no suitable LegalTag already exists for the data being ingested. The LegalTag name is used for reference.

When data is ingested, it is assigned the LegalTag name. This name is checked for a corresponding valid LegalTag in the system. A valid LegalTag means it exists and has not expired. If a LegalTag is invalid, the data is rejected.

For instance, we may not allow ingestion of data from a certain country, or we may not allow consumption of data that has an expired contract.
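The validity rule described above (the LegalTag exists and has not expired) can be illustrated with a short client-side check. This is only a sketch: the actual validation is performed by the services, and the treatment of the expiration day itself as still valid is an assumption. The dictionary shape follows the LegalTag payload used in this guide.

```python
from datetime import date
from typing import Optional


def is_valid_legaltag(legal_tag: Optional[dict], today: date) -> bool:
    """True when the tag exists and its expirationDate has not passed."""
    if legal_tag is None:  # no LegalTag with that name was found
        return False
    expiration = date.fromisoformat(legal_tag["properties"]["expirationDate"])
    # Assumption: a tag is still valid on its expiration date itself.
    return today <= expiration
```

A caller would typically pass `date.today()`; taking the date as a parameter keeps the check easy to test.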

A name needs to be assigned to the LegalTag during creation. The name is a unique identifier for the LegalTag that is used to access it.

## Ingestion workflow

[Diagram: API Security - High level]

The above diagram shows the typical sequence of events in a data ingestion. The important points to highlight are as follows:

  • It is the clients' responsibility to create a LegalTag. LegalTag validation happens at this point.
  • The Storage service validates the LegalTag for the data being ingested.
  • Only after validating a LegalTag exists can we ingest data. No data should be stored at any point in the Data Ecosystem that does not have a valid LegalTag.

## Creating a LegalTag

Any data being ingested needs a LegalTag associated with it. You can create a LegalTag by using the POST LegalTag API, for example:

POST /api/legal/v1/legaltags
{
    "name": "demo-legaltag",
    "description": "A legaltag used for demonstration purposes.",
    "properties": {
        "countryOfOrigin": ["US"],
        "contractId": "No Contract Related",
        "expirationDate": "2099-01-01",
        "dataType": "Public Domain Data",
        "originator": "OSDU",
        "securityClassification": "Public",
        "exportClassification": "EAR99",
        "personalData": "No Personal Data"
    }
}
Curl
curl --request POST \
  --url 'https://api.osdu.[osdu].org/de/legal/v1/legaltags' \
  --header 'accept: application/json' \
  --header 'authorization: Bearer <JWT>' \
  --header 'content-type: application/json' \
  --header 'OSDU-data-partition-id: common' \
  --data '{
    "name": "demo-legaltag",
    "description": "A legaltag used for demonstration purposes.",
    "properties": {
        "countryOfOrigin": ["US"],
        "contractId": "No Contract Related",
        "expirationDate": "2099-01-01",
        "dataType": "Public Domain Data",
        "originator": "OSDU",
        "securityClassification": "Public",
        "exportClassification": "EAR99",
        "personalData": "No Personal Data"
    }
}'
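For clients that do not shell out to curl, the same request can be sketched in Python with the standard library. The function name `build_create_legaltag_request` is hypothetical, and the URL is passed in as a parameter because the host depends on your deployment. The request is built but not sent.

```python
import json
import urllib.request

# The demonstration LegalTag from the example above.
DEMO_LEGAL_TAG = {
    "name": "demo-legaltag",
    "description": "A legaltag used for demonstration purposes.",
    "properties": {
        "countryOfOrigin": ["US"],
        "contractId": "No Contract Related",
        "expirationDate": "2099-01-01",
        "dataType": "Public Domain Data",
        "originator": "OSDU",
        "securityClassification": "Public",
        "exportClassification": "EAR99",
        "personalData": "No Personal Data",
    },
}


def build_create_legaltag_request(url: str, token: str, partition_id: str,
                                  legal_tag: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a LegalTag."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(legal_tag).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "authorization": f"Bearer {token}",
            "content-type": "application/json",
            "OSDU-data-partition-id": partition_id,
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; an HTTP error status is raised as `urllib.error.HTTPError`.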

It is good practice for a LegalTag name to be clear and descriptive of the properties it represents, so that the LegalTag is easy to discover and to associate with the correct data. The description field is an optional free-form field that lets you add context to the LegalTag, making it easier to understand and retrieve over time.

When creating LegalTags, the name is automatically prefixed with the data partition Id that is sent in the request. So in the example above, if the given OSDU-data-partition-id header value is common, then the actual name of the LegalTag would be common-demo-legaltag.
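The prefixing rule can be written as a one-line helper, which is handy when a client needs to predict the stored name of a LegalTag it has just created. The function name is hypothetical; the hyphen separator follows the example above.

```python
def stored_legaltag_name(partition_id: str, requested_name: str) -> str:
    """Return the name under which the LegalTag is stored after creation."""
    # The data partition id from the request header is prepended to the name.
    return f"{partition_id}-{requested_name}"


print(stored_legaltag_name("common", "demo-legaltag"))  # common-demo-legaltag
```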

To help with LegalTag creation, we advise calling the Get LegalTag Properties API before creating a LegalTag. It returns the allowed values for many of the LegalTag properties.

## LegalTag properties

Below are details of the properties you can supply when creating a LegalTag, along with the values you can use. The allowed property values can be specific to a data partition. Valid values associated with each property are shown. All values are mandatory unless otherwise stated.