Commit 75a7c599 authored by neelesh thakur, committed by ethiraj krishnamanaidu

add tutorials and openapi spec

parent 281f59f9
## Action Service
## Table of Contents <a name="TOC"></a>
- [Introduction](#introduction)
- [Action APIs](#action-apis)
* [Registering an Action](#register-action)
* [Get an Action by ID](#get-action)
* [Retrieve Actions](#retrieve-action)
* [Delete an Action by ID](#delete-action)
* [Validate action](#regex-test)
- [Current Limitations](#limitation)
## Introduction <a name="introduction"></a>
Conceptually, the design of this service resembles the 'command' design pattern, which decouples a trigger from an action. The pattern is common in UIs, where the trigger might be a user clicking a button and the action is the program function that the click invokes. An optional context often supplies the action with data to use in the function, and can also enable or disable the action for the user (for example, when the data is not relevant to the action in question).
This service will allow an application to register an action (the function to be triggered). It will expect data (context) to come from OSDU to enable the action, and the application can register a filter (enable/disable) to say what data can be used with this action.
[Back to Table of Contents](#TOC)
## Action APIs <a name="action-apis"></a>
### Registering an Action <a name="register-action"></a>
This API allows registering an action in the form of a GET HTTPS URL and a filter. The filter specifies what data the action can be applied to.
> It is recommended that Admins first use the [Validate action](#regex-test) API, to make sure the action is acceptable and the output of the action with a test payload is as expected.
```
POST /api/register/v1/action
```
<details><summary>curl</summary>
```
curl --request POST \
--url 'https://register-svc.osdu.com/api/register/v1/action' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"id": "petrel-launch-project",
"name": "Petrel Project",
"description": "This action launches the Petrel projects landing page that holds the selected data.",
"url": "https://myapp.osdu.com/action/{id}/{data.project}",
"img": "https://mycdn.com/img.png",
"contactEmail": "abc@test.com",
"filter": {
"entityType": ["regularheightfield", "project"],
"source": ["petrel"],
"version": ["*"]
}
}'
```
</details>
Each property in the filter is either a list of values that the data must match exactly, or a single wildcard '*' indicating that any data matches that property.
The URL given at registration must be a fully qualified HTTPS URL that is invoked with a GET request. The URL can contain templates, as shown above (e.g. {data.project}), as well as regular expressions. Templates can be applied anywhere in the URL (domain, path, query, etc.). A template value can be any property in a Record's payload.
The retrieve API applies the template to a given Record to create the fully qualified URL. For instance, the above URL could be applied to the following Record in Storage.
```
{
"id": "common:doc:123456789",
"kind": "common:petrel:regularheightfield:1.0.0",
...
"data": {
"project":"myPetrelProj"
}
...
}
```
Because the filter of the action matches the Record (petrel and regularheightfield match the 'kind', and the version wildcard matches 1.0.0), the resulting action URL after the template is applied would be:
```
https://myapp.osdu.com/action/common:doc:123456789/myPetrelProj
```
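The matching and substitution described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the service's implementation: the function names (`parse_kind`, `filter_matches`, `lookup`, `apply_template`) are hypothetical, and the sketch assumes a kind always has the four colon-separated parts seen in the example. It handles plain `{path}` templates only.

```python
import re

def parse_kind(kind):
    """Split a kind such as 'common:petrel:regularheightfield:1.0.0' into parts."""
    _authority, source, entity_type, version = kind.split(":")
    return {"source": source, "entityType": entity_type, "version": version}

def filter_matches(action_filter, kind):
    """Return True when every filter property lists the record's value or '*'."""
    parts = parse_kind(kind)
    for prop, allowed in action_filter.items():
        if "*" not in allowed and parts[prop] not in allowed:
            return False
    return True

def lookup(record, path):
    """Resolve a dotted path such as 'data.project' against a record dict."""
    value = record
    for key in path.split("."):
        value = value[key]
    return str(value)

def apply_template(url, record):
    """Replace each plain {path} placeholder in the URL with the record's value."""
    return re.sub(r"\{([^{}:]+)\}", lambda m: lookup(record, m.group(1)), url)

action = {
    "url": "https://myapp.osdu.com/action/{id}/{data.project}",
    "filter": {
        "entityType": ["regularheightfield", "project"],
        "source": ["petrel"],
        "version": ["*"],
    },
}
record = {
    "id": "common:doc:123456789",
    "kind": "common:petrel:regularheightfield:1.0.0",
    "data": {"project": "myPetrelProj"},
}

if filter_matches(action["filter"], record["kind"]):
    # prints https://myapp.osdu.com/action/common:doc:123456789/myPetrelProj
    print(apply_template(action["url"], record))
```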
[Back to Table of Contents](#TOC)
### Get an Action by ID <a name="get-action"></a>
This API returns the action with the given ID.
```
GET /api/register/v1/action/{id}
```
<details><summary>curl</summary>
```
curl --request GET \
--url 'https://register-svc.osdu.com/api/register/v1/action/petrel-launch-project' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common'
```
</details>
[Back to Table of Contents](#TOC)
### Retrieve Actions <a name="retrieve-action"></a>
This API allows retrieving all actions that match a given filter.
```
POST /api/register/v1/action:retrieve
```
Suppose you have retrieved the following Record from OSDU:
```
{
"id": "common:regularheightfield:123456",
"kind": "common:petrel:regularheightfield:1.0.0",
"acl": {
"viewers": ["data.default.viewers@common.osdu.com"],
"owners": ["data.default.owners@common.osdu.com"]
},
"legal": {
"legaltags": ["common-sample-legaltag"],
"otherRelevantDataCountries": ["FR","US","CA"]
},
"data": {
"msg": "Hello"
}
}
```
Then make the following call to the retrieve actions API:
<details><summary>curl</summary>
```
curl --request POST \
--url 'https://register-svc.osdu.com/api/register/v1/action:retrieve' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"id": "common:regularheightfield:123456",
"kind": "common:petrel:regularheightfield:1.0.0",
"acl": {
"viewers": ["data.default.viewers@common.osdu.com"],
"owners": ["data.default.owners@common.osdu.com"]
},
"legal": {
"legaltags": ["common-sample-legaltag"],
"otherRelevantDataCountries": ["FR","US","CA"]
},
"data": {
"msg": "Hello"
}
}'
```
</details>
This finds all actions whose filter matches your Record. For each one, it substitutes any template values and evaluates any regular expressions against the Record to build the action URL.
Given two matching actions that had a "url" field with these templates
```
"url": "https://myapp.osdu.com/action/{id}",
and
"url": "https://myapp.osdu.com/action?text={data.msg}",
```
Then the response returns all matching actions with the substituted parameters specified:
```
[
{
"id": "123-456-abc",
"name": "Petrel",
"description": "Opens the given objects project in Petrel PTS",
"img": "https://mycdn.com/myimg.png",
"contactEmail": "abc@test.com",
"url": "https://myapp.osdu.com/action/common:regularheightfield:123456"
},
{
"id": "923-456-abc",
"name": "myApp2",
"description": "Does something awesome",
"img": "https://mycdn.com/myimg2.png",
"contactEmail": "abc@test.com",
"url": "https://myapp.osdu.com/action?text=Hello"
}
]
```
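For callers that prefer a script over curl, the same request can be assembled with Python's standard library. This is a sketch only: `build_retrieve_request` is a hypothetical helper name, and the host, token, and partition values are the placeholders used in the curl example above. The request is built but not sent.

```python
import json
import urllib.request

def build_retrieve_request(base_url, token, partition, record):
    """Build (but do not send) the POST request for the retrieve actions API."""
    return urllib.request.Request(
        url=f"{base_url}/api/register/v1/action:retrieve",
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "data-partition-id": partition,
        },
        method="POST",
    )

record = {
    "id": "common:regularheightfield:123456",
    "kind": "common:petrel:regularheightfield:1.0.0",
    "data": {"msg": "Hello"},
}
request = build_retrieve_request(
    "https://register-svc.osdu.com", "<JWT>", "common", record
)
# Passing `request` to urllib.request.urlopen would send it; the response body
# is expected to be the JSON list of matching actions shown above.
```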
[Back to Table of Contents](#TOC)
### Delete an Action by ID <a name="delete-action"></a>
This API allows deleting an action with a given id.
```
DELETE /api/register/v1/action/{id}
```
<details><summary>curl</summary>
```
curl --request DELETE \
--url 'https://register-svc.osdu.com/api/register/v1/action/petrel-launch-project' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common'
```
</details>
[Back to Table of Contents](#TOC)
### Validate action <a name="regex-test"></a>
This helper API lets users validate that their action works as expected, including template and regular expression usage, before they create the action in the system.
```
POST /api/register/v1/action:test
```
Consider the following payload for the [Registering an Action](#register-action) request:
<details><summary>curl</summary>
```
curl --request POST \
--url 'https://register-svc.osdu.com/api/register/v1/action' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"id": "petrel-launch-project",
"name": "Petrel Project",
"description": "This action launches the Petrel projects landing page that holds the selected data.",
"url": "https://myapp.osdu.com/action/{data.uri:^(?:[^\\/]*(?:\\/(?:\\/[^\\/]*\\/?)?)?([^?]+)(?:\\??.+)?)$}",
"img": "https://mycdn.com/img.png",
"contactEmail": "abc@test.com",
"filter": {
"entityType": ["regularheightfield", "project"],
"source": ["petrel"],
"version": ["*"]
}
}'
```
</details>
The above action applies a regular expression of
```
^(?:[^\/]*(?:\/(?:\/[^\/]*\/?)?)?([^?]+)(?:\??.+)?)$
```
to the value of the data.uri property.
This regular expression attempts to extract the path segment from a URI. Before registering the action, we want to be sure that the regular expression is correct and returns the expected value from a representative payload for the kind. This can be achieved with the following API call:
<details><summary>curl</summary>
```
curl --request POST \
--url 'https://register-svc.osdu.com/api/register/v1/action:test' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"action": {
"id": "petrel-launch-project",
"name": "Petrel Project",
"description": "This action launches the Petrel projects landing page that holds the selected data.",
"url": "https://myapp.osdu.com/action/{data.uri:^(?:[^\\/]*(?:\\/(?:\\/[^\\/]*\\/?)?)?([^?]+)(?:\\??.+)?)$}",
"img": "https://mycdn.com/img.png",
"contactEmail": "abc@test.com",
"filter": {
"entityType": ["regularheightfield", "project"],
"source": ["petrel"],
"version": ["*"]
}
},
"testPayload": {
"id": "common:regularheightfield:123456",
"kind": "common:petrel:regularheightfield:1.0.0",
"data": {
"uri": "https://myproj.com/abc123"
}
}
}'
```
</details>
In this case, the response would be:
```
{
"url": "https://myapp.osdu.com/action/abc123",
"errors": ""
}
```
In the above example, the regular expression was valid: the response shows the path of the data.uri property appended to the action URL, and there are no errors.
However, an error is returned if:
- The filter did not match the testPayload.
- The regular expression was invalid.
- The regular expression failed to extract a value from the test payload.
It is also possible for the regular expression to extract a value that is not the one you expected. In this scenario the API does not return an error, so you need to check that the returned url is formed as you expected once the test payload has been mapped into it.
Here are some regular expression registration examples:
| Example regular expression registration | Example Record | Example output |
|:---------------------------|:---------------|:--------------|
|`https://myapp.osdu.com/action/{data.uri:^(?:[^\\/]*(?:\\/(?:\\/[^\\/]*\\/?)?)?([^?]+)(?:\\??.+)?)$}`|`"data": {"uri": "https://myproj.com/abc123"}`| `https://myapp.osdu.com/action/abc123`|
|`https://myapp.osdu.com/action?type={kind:^(?:[A-Za-z]+\\:)*([A-Za-z]+)\\:(?:.+)$}`|`"kind": "common:petrel:regularheightfield:1.0.0"`| `https://myapp.osdu.com/action?type=regularheightfield`|
|`https://myapp.osdu.com/action/{kind:^(?:[A-Za-z]+\\:)*(.+)}`|`"data": {"kind": "common:petrel:regularheightfield:1.0.0"}`| `https://myapp.osdu.com/action/1.0.0`|
|`https://myapp.osdu.com/action/{id}?type={kind:^(?:[^:]*:){2}([^:]*)}`|`"data": {"id": "test-id", "kind": "common:petrel:regularheightfield:1.0.0"}`| `https://myapp.osdu.com/action/test-id?type=regularheightfield`|
For performance reasons, regular expression matching is capped at a maximum of 2 seconds per record field. Refer to the error message in the response for details.
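To experiment with a regular expression locally before calling the validate API, the `{path:regex}` expansion can be approximated in Python. This is a sketch under assumptions, not the service's algorithm: `expand` and `lookup` are hypothetical names, the first capture group is taken as the extracted value (which is consistent with the examples in the table), and fields are resolved from the top level of the record. Python's regex dialect may differ from the service's, and the 2 second cap is not reproduced. Note that the patterns here use single backslashes; inside a JSON payload each backslash must be doubled.

```python
import re

def lookup(record, path):
    """Resolve a dotted path such as 'data.uri' against a record dict."""
    value = record
    for key in path.split("."):
        value = value[key]
    return str(value)

def expand(template, record):
    """Expand {path} and {path:regex} placeholders against a record dict."""
    out, i = [], 0
    while i < len(template):
        if template[i] != "{":
            out.append(template[i])
            i += 1
            continue
        # Find the matching close brace, allowing nested braces such as {2}.
        depth, j = 1, i + 1
        while j < len(template) and depth:
            depth += {"{": 1, "}": -1}.get(template[j], 0)
            j += 1
        if depth:
            raise ValueError("unbalanced '{' in template")
        path, _, pattern = template[i + 1 : j - 1].partition(":")
        value = lookup(record, path)
        if pattern:
            match = re.search(pattern, value)
            if match is None or not match.groups() or match.group(1) is None:
                raise ValueError(f"regex did not extract a value from {path}")
            value = match.group(1)  # assumption: first capture group is the result
        out.append(value)
        i = j
    return "".join(out)

record = {
    "id": "test-id",
    "kind": "common:petrel:regularheightfield:1.0.0",
    "data": {"uri": "https://myproj.com/abc123"},
}
# prints https://myapp.osdu.com/action/test-id?type=regularheightfield
print(expand("https://myapp.osdu.com/action/{id}?type={kind:^(?:[^:]*:){2}([^:]*)}", record))
```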
[Back to Table of Contents](#TOC)
## Current Limitations <a name="limitation"></a>
There are currently two main limitations:
- Users need to register the action separately in each data partition where they want it enabled.
- A wildcard for the partition id is not supported in the action URL. Something like myaction.com/kind?dpid={data-partition-id} may be supported in the future so that the request can be data partition aware.
[Back to Table of Contents](#TOC)
## How to become a DDMS
* [Introduction](#introduction)
* [Register as a DDMS](#register)
* [Create a Storage schema](#create-schema)
* [Expose legal tag and ACLs through your APIs](#expose-legal)
* [Optional: Expose compliance for derivative data](#expose-derivative)
* [Ingest data and create a shadow record](#shadow-record)
* [Perform compliance and ACL checks using shadow records](#validate)
* [Client retrieves the bulk data](#retrieve)
## Introduction <a name="introduction"></a>
A Domain Data Management Service (DDMS) can be seen as any source of truth for data that manages the data life cycle, satisfies given mandatory data access concerns, and makes its data globally discoverable and retrievable through the Data Ecosystem. It could be a standalone service dedicated to a specific data type or a subcomponent of an application or platform. It simply enables its data to be retrieved outside of its regular scope.
A DDMS needs to address the following common concerns:
* Legal compliance
* Data access authorization
* Discovery
* Retrieval of the data based on discovery
Data Ecosystem solves these concerns primarily using Storage records. A Storage record is metadata pertaining to the bulk data stored in the DDMS. Every record created in Storage enforces that ACLs are assigned, checks compliance and then indexes the record into search, making it discoverable.
The following is the preferred method of using Records to enable these concerns for a DDMS.
## Register as a DDMS <a name="register"></a>
The first step is to register as a DDMS. This makes your DDMS discoverable to clients and presents them with an API definition that tells them how to retrieve the bulk data when a record from your DDMS is discovered.
The only API that needs to be defined is the one that tells clients how to retrieve the bulk data based on an Id.
Note that you can register as much of your API specification as you like. You only need to mark the method clients should use to retrieve the bulk data with the custom property x-ddms-retrieve-entity: true.
<details><summary>curl</summary>
```
curl --request POST \
--url '/api/register/v1/ddms' \
--header 'accept: application/json' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"id": "{DDMS-ID}",
"name": "logDDMS",
"description": "My test ddms.",
"contactEmail": "test@test.com",
"interfaces": [
{
"entityType": "wellbore",
"schema": {
"openapi": "3.0.0",
"info": {
"description": "This is a sample Wellbore domain DM service.",
"version": "1.0.0",
"title": "OSDU Wellbore Domain DM Service",
"contact": {
"email": "osdu-sre@opengroup.org"
}
},
"servers": [
{
"url": "https://subsurface.data.osdu.com/v1"
}
],
"tags": [
{
"name": "wellbore",
"description": "Wellbore data type services"
}
],
"paths": {
"/wellbore/{wellboreId}": {
"get": {
"tags": [
"wellbore"
],
"summary": "Find wellbore by ID",
"description": "Returns a single wellbore",
"operationId": "getWellboreById",
"x-ddms-retrieve-entity": true,
"parameters": [
{
"name": "wellboreId",
"in": "path",
"description": "ID of wellbore to return",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "successful operation",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/wellbore"
}
}
}
},
"400": {
"description": "Invalid ID supplied"
},
"401": {
"description": "Not authorized"
},
"404": {
"description": "Wellbore not found"
}
}
}
}
}
}
}
]
}'
```
</details>
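A client that has fetched a registration can locate the retrieval method by scanning the OpenAPI paths for the custom property. The following is a minimal sketch, assuming the schema has already been parsed into a Python dict; `find_retrieve_operation` is a hypothetical helper, not part of any OSDU client library.

```python
def find_retrieve_operation(openapi_schema):
    """Return (HTTP method, path) of the operation flagged x-ddms-retrieve-entity."""
    for path, operations in openapi_schema.get("paths", {}).items():
        for method, operation in operations.items():
            # Path items may hold non-operation keys, so only inspect dicts.
            if isinstance(operation, dict) and operation.get("x-ddms-retrieve-entity") is True:
                return method.upper(), path
    return None

# Trimmed-down version of the schema registered above.
schema = {
    "servers": [{"url": "https://subsurface.data.osdu.com/v1"}],
    "paths": {
        "/wellbore/{wellboreId}": {
            "get": {
                "operationId": "getWellboreById",
                "x-ddms-retrieve-entity": True,
            }
        }
    },
}
print(find_retrieve_operation(schema))  # ('GET', '/wellbore/{wellboreId}')
```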
## Create a Storage schema <a name="create-schema"></a>
It is up to the DDMS to determine which properties of the bulk data to push into a Storage record and make discoverable within the Data Ecosystem (DE).
You define a Storage schema to represent this. The schema lists the properties that will be on the Record and the type of data each one represents.
When deploying your service, you should perform the one-time operation of publishing the schema via the Storage APIs.
<details><summary>curl</summary>
```
curl --request POST \
--url '/api/storage/v2/schemas' \
--header 'accept: application/json' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '{
"kind": "common:welldb:wellbore:1.0.0",
"schema": [
{
"path": "name",
"kind": "string"
},
{
"path": "ddmsId",
"kind": "string"
},
{
"path": "localId",
"kind": "string"
},
{
"path": "entityType",
"kind": "string"
}]
}'
```
</details>
This will then allow any Record that references this schema to be indexed in the DE search. Without this, the Record will be published but without any of the data and it will be hidden by default in search.
Notice, we are also declaring 3 properties:
* **ddmsId** is the id used when you register as a DDMS.
* **entityType** is the domain object type the data represents, e.g. ‘seismic’, ‘well’.
* **localId** is the id of the bulk data as it is referenced in your DDMS. The end user should be able to use this id to retrieve the bulk data from your APIs.
These act as well-known properties that should be added to the record by your DDMS. Clients can then use this information to retrieve the bulk data after discovery using the DDMS registration APIs. Every schema created should declare these properties to use this pattern of ingestion.
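Because every schema should declare these properties, a small pre-publish check can catch omissions. This is an illustrative sketch only; `missing_well_known` is a hypothetical helper that operates on the schema payload shape shown above.

```python
WELL_KNOWN = {"ddmsId", "entityType", "localId"}

def missing_well_known(schema_payload):
    """Return the well-known properties the schema payload fails to declare."""
    declared = {item["path"] for item in schema_payload["schema"]}
    return sorted(WELL_KNOWN - declared)

payload = {
    "kind": "common:welldb:wellbore:1.0.0",
    "schema": [
        {"path": "name", "kind": "string"},
        {"path": "ddmsId", "kind": "string"},
        {"path": "localId", "kind": "string"},
        {"path": "entityType", "kind": "string"},
    ],
}
print(missing_well_known(payload))  # []
```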
## Expose legal tag and ACLs through your APIs <a name="expose-legal"></a>
Unless you have a scenario where you know what legal tag and ACL should be applied to the data you are ingesting, you will need to expose the legal tag and ACL in your ingestion APIs. This allows your clients to supply the legal tag and ACL themselves.
You can expose the same interface as the Storage records API, allowing you to assign them to the record you create.
<details>
```
"acl": {
"viewers": ["data.default.viewers@{datapartition}.{domain}.com"],
"owners": ["data.default.owners@{datapartition}.{domain}.com"]
},
"legal": {
"legaltags": ["common-sample-legaltag"]
}
```
</details>
## Optionally expose derivative compliance through your APIs <a name="expose-derivative"></a>
If you expect derivative data to be stored in your DDMS, you need to expose 2 more properties through your APIs that can be appended to your Storage record.
Again, you can expose the same interface as the Storage records API, allowing you to assign them to the record directly. The 2 properties are:
* otherRelevantDataCountries: The ISO 3166-1 alpha-2 country code of the country where the derivative was created or calculated
* parents: The record ids and versions of the Records this derivative was created from
If a derivative is being created then a legal tag does not need to be assigned as it inherits this from its parents.
<details><summary>curl</summary>
```
"legal" :{
"otherRelevantDataCountries": ["US"]
},
"ancestry" :{
"parents": ["common:id:1:version", "common:id:2:version"]
}
```
</details>
## Ingest data and create a shadow record <a name="shadow-record"></a>
Whenever bulk data is ingested, you need to create a shadow record within Storage. This shadow record represents the specific bulk data instance in a 1:1 relationship and makes each instance globally discoverable.
When you create the shadow record using the Storage API, forward the original caller's JWT.
First store the bulk data, and then create the shadow Record. This way, a Record is never globally discoverable before its bulk data is available. If creating the Record fails, e.g. because an invalid legal tag was provided, return the failure response to the client and attempt to clean up the bulk data.
Remember, you should append your DDMS Id, entityType and the bulk data’s local id to the Storage record.
<details><summary>curl</summary>
```
curl --request PUT \
--url '/api/storage/v2/records' \
--header 'accept: application/json' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--data '[
{
"kind": "common:welldb:wellbore:1.0.0",
"acl": {
"viewers": ["data.default.viewers@{datapartition}.{domain}.com"],
"owners": ["data.default.owners@{datapartition}.{domain}.com"]
},
"legal": {
"legaltags": ["common-sample-legaltag"],
"otherRelevantDataCountries": ["FR"]
},
"data": {
"name": "well1",
"entityType": "wellbore",
"ddmsId": "abcdef",
"localId": "123456"
}
}]'
```
```
</details>
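The ordering described above (store the bulk data, create the shadow record, clean up on failure) can be sketched as follows. Everything here is hypothetical scaffolding: `StorageError`, `InMemoryBulkStore`, and the `put_record` callable stand in for your own bulk store and for a wrapper around the Storage records API.

```python
class StorageError(Exception):
    """Hypothetical error raised when Storage rejects a record."""

class InMemoryBulkStore:
    """Stand-in for the DDMS bulk store; a real one would persist the data."""
    def __init__(self):
        self.items = {}

    def save(self, bulk_data):
        local_id = str(len(self.items) + 1)
        self.items[local_id] = bulk_data
        return local_id

    def delete(self, local_id):
        self.items.pop(local_id, None)

def ingest(bulk_store, put_record, ddms_id, entity_type, bulk_data, record, caller_jwt):
    """Store bulk data, then create its shadow record; undo the store on failure."""
    # 1. Store the bulk data first so the record never points at missing data.
    local_id = bulk_store.save(bulk_data)
    # 2. Append the well-known properties to the record's data block.
    shadow = dict(record)
    shadow["data"] = {
        **record.get("data", {}),
        "ddmsId": ddms_id,
        "entityType": entity_type,
        "localId": local_id,
    }
    try:
        # 3. Create the shadow record using the original caller's token.
        put_record(shadow, caller_jwt)
    except StorageError:
        # 4. Storage rejected the record: clean up, then surface the failure.
        bulk_store.delete(local_id)
        raise
    return shadow
```

Passing the Storage call in as a function keeps the sketch independent of any particular HTTP client.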
## Perform compliance and ACL checks using shadow records <a name="validate"></a>
As mentioned, a DDMS should create a shadow record for every instance of bulk data ingested into its data store. This has advantages beyond global discoverability. Whenever a Storage record is requested, both compliance and entitlements are checked before the data is returned. A DDMS can use this to its advantage.
When a client asks for bulk data, you can forward the request to the Storage service to retrieve the corresponding Record, delegating these checks to Storage. If Data Ecosystem returns the Record, the client is allowed to access both the Record and the bulk data, so you can return both to the client, or only the Record.
<details><summary>curl</summary>
```
curl --request POST \
--url '/api/storage/v2/query/records:batch' \
--header 'authorization: Bearer <JWT>' \
--header 'content-type: application/json' \
--header 'data-partition-id: common' \
--header 'frame-of-reference: NONE' \
--data '{
"records": [
"common:test:fetchtest-1",
"common:test:fetchtest-2",
"common:test:fetchtest-4",
"common:test:fetchtest-5",
"common:test:fetchtest-6"
]
}'
```
</details>
In this scenario, you also don’t need to store the ACL or legal tag information in your DDMS because those are being retrieved directly from the Data Ecosystem in this request. However, you need to either store or be able to generate the Storage record ID needed to retrieve the record for the bulk data requested.
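A possible shape for this delegation is sketched below. The names are illustrative assumptions: `fetch_record` stands for a wrapper around the Storage query shown above that returns the record or `None` when Storage does not return it, and `record_id_for` is one hypothetical way to generate the Storage record ID from the local id.

```python
def record_id_for(partition, entity_type, local_id):
    """Hypothetical deterministic mapping from a local id to a Storage record id."""
    return f"{partition}:{entity_type}:{local_id}"

def get_bulk_data(fetch_record, bulk_store, partition, entity_type, local_id, caller_jwt):
    """Return (record, bulk_data), or None when Storage withholds the record."""
    record_id = record_id_for(partition, entity_type, local_id)
    # Storage performs the compliance and entitlement checks for this caller.
    record = fetch_record(record_id, caller_jwt)
    if record is None:
        return None
    return record, bulk_store[local_id]

bulk_store = {"123456": b"bulk bytes"}
allowed = lambda record_id, jwt: {"id": record_id}
denied = lambda record_id, jwt: None
print(get_bulk_data(allowed, bulk_store, "common", "wellbore", "123456", "<JWT>"))
```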
## Client retrieves the bulk data <a name="retrieve"></a>
Imagine the client discovered a record with the following data:
<details><summary>curl</summary>
```
"data": {
"name": "well1",
"entityType": "wellbore",
"ddmsId": "abcdef",
"localId": "123456"
}
```
</details>
They can use the ddmsId property of the data object to retrieve the API definition of the DDMS you registered at the start.
<details><summary>curl</summary>
```
curl --request GET \
--url '/api/register/v1/ddms/abcdef' \
--header 'authorization: Bearer <JWT>' \