Authored by Rostislav Dublin (EPAM)
- Register Service
- Getting Started
- Features of implementation
- Limitations of the current version
- Extensibility
- Settings and Configuration
- Requirements
- Mandatory
- for Google Cloud only
- General Tips
- Mapper tuning mechanisms
- for universal technologies:
- Their algorithms are as follows:
- for native Google Cloud technologies:
- Their algorithms are similar,
- Configuration
- Service Configuration
- Common properties for all environments
- For Mappers, to activate drivers
- For Google Cloud only
- Configuring mappers' Datasources
- for OSM - Postgres:
- for OQM - RabbitMQ:
- Interaction with message brokers
- Specifics of work through PULL subscription
- Run and test the service
- Running Locally
- Testing
- Test the application
- Running E2E Tests
- Deployment
- Cloud KMS Setup
- License
Register Service
os-register-gcp is a Spring Boot service that hosts CRUD APIs that allow consumers to register a push endpoint, which is triggered when data change events happen within the OSDU R2 ecosystem.
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
Features of implementation
This is a universal solution created using EPAM OSM and OQM mappers technology. It allows you to work with various implementations of data stores and message brokers.
Limitations of the current version
In the current version, the mappers are equipped with several drivers to the stores and the message broker:
- OSM (mapper for KV-data): Google Datastore; Postgres
- OQM (mapper to message brokers): Google PubSub; RabbitMQ
Extensibility
To use any other store or message broker, implement a driver for it. Because the set of drivers is extensible, the solution stays portable across stores and brokers without modification to the main code.
Mappers support "multitenancy" with flexibility in how it is implemented. They switch between the datasources of different tenants through a set of classes that implement the following interfaces:
- Destination - takes a description of the current context, e.g., "data-partition-id = opendes"
- DestinationResolver - accepts a Destination, finds the resource, connects, and returns a Resolution
- DestinationResolution - contains a ready-made connection that the mapper uses to get to the data
Settings and Configuration
Requirements
Mandatory
- Java 8
- Maven 3.6.0+
for Google Cloud only
- GCloud command line tool
- GCloud access to opendes project
General Tips
Environment Variable Management: the following tools make environment variable configuration simpler
- direnv - for a shell/terminal environment
- EnvFile - for IntelliJ IDEA
Lombok: this project uses Lombok for code generation. You may need to configure your IDE to take advantage of this tool.
Mapper tuning mechanisms
This service uses specific implementations of DestinationResolvers based on the tenant information provided by the OSDU Partition service. A total of 4 resolvers are implemented, which are divided into two groups:
for universal technologies:
- for Postgres: mappers/osm/PgTenantOsmDestinationResolver.java
- for RabbitMQ: mappers/oqm/MqTenantOqmDestinationResolver.java
Their algorithms are as follows:
- the incoming Destination carries the data-partition-id
- the resolver accesses the Partition service and gets PartitionInfo
- from PartitionInfo the resolver retrieves the connection properties: URL, username, password, etc.
- the resolver creates a datasource, connects to the resource, and remembers (caches) the datasource
- the resolver gives the datasource to the mapper in the Resolution object
for native Google Cloud technologies:
- for Datastore: mappers/osm/DsTenantOsmDestinationResolver.java
- for PubSub: mappers/oqm/PsTenantOqmDestinationResolver.java
Their algorithms are similar,
except that they receive no connection properties from the Partition service: the location of the resources is known unambiguously, because they live in the GCP project. Credentials are not needed either, since data is accessed on behalf of the Google Identity service account (SA) under which the service itself runs. The resolver therefore takes only the value of the projectId property from PartitionInfo and uses it to connect to the resource in the corresponding GCP project.
Configuration
Service Configuration
Define the following environment variables. Most of them are common to all hosting environments, but there are properties that are only necessary when running in Google Cloud.
Common properties for all environments
name | value | description | sensitive? | source |
---|---|---|---|---|
LOG_PREFIX | service | Logging prefix | no | - |
SERVER_SERVLET_CONTEXPATH | /api/register/v1 | Register context path | no | - |
ENTITLEMENTS_API | ex https://entitlements.com/entitlements/v1 | Entitlements API endpoint | no | output of infrastructure deployment |
STORAGE_API | ex https://os-storage-dot-opendes.appspot.com/api/storage/v2 | Storage API endpoint | no | output of infrastructure deployment |
RECORDS_CHANGE_PUBSUB_ENDPOINT | ex https://os-notification-dot-opendes.appspot.com/api/notification/v1/push-handlers/records-changed | Notification API endpoint 'records-changed' | no | output of infrastructure deployment |
SUBSCRIBER_SECRET | ex 7a786376626e | HMAC_SECRET from notification int tests in HEX, pattern (^[a-zA-Z0-9]{8,30}+$) | yes | output of infrastructure deployment |
SUBSCRIBER_PRIVATE_KEY_ID | ******** | Private key id of DE_OPS_TESTER from notification int tests | yes | output of infrastructure deployment |
ENVIRONMENT | ex dev | Service environment config | no | - |
PARTITION_API | ex http://localhost:8081/api/partition/v1 | Partition service endpoint | no | - |
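The pattern given for SUBSCRIBER_SECRET can be checked before the service is started. The sketch below is only an illustration: the function name is hypothetical, and it assumes the documented pattern is Java regex syntax, where the trailing "+" is a possessive quantifier. POSIX extended regular expressions read "{8,30}+" differently, so the "+" is dropped here.

```shell
# Hypothetical helper: returns 0 when the value is 8 to 30 alphanumeric characters.
# The trailing "+" of the documented pattern is omitted on purpose (see the note above).
is_valid_subscriber_secret() {
  printf '%s\n' "$1" | grep -Eq '^[a-zA-Z0-9]{8,30}$'
}

if is_valid_subscriber_secret "${SUBSCRIBER_SECRET:-}"; then
  echo "SUBSCRIBER_SECRET looks valid"
else
  echo "SUBSCRIBER_SECRET is missing or does not match the pattern" >&2
fi
```

The example value from the table, 7a786376626e, passes this check.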
For Mappers, to activate drivers
name | value | description |
---|---|---|
OSMDRIVER | datastore | to activate OSM driver for Google Datastore |
OSMDRIVER | postgres | to activate OSM driver for PostgreSQL |
OQMDRIVER | pubsub | to activate OQM driver for Google PubSub |
OQMDRIVER | rabbitmq | to activate OQM driver for RabbitMQ |
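As an illustration of the table above, the following sketch selects the Postgres and RabbitMQ drivers and checks the values against the four documented combinations. The helper is_supported_driver is hypothetical and not part of the service.

```shell
# Hypothetical check: accepts only the variable/value pairs listed in the table above.
is_supported_driver() {
  case "$1=$2" in
    OSMDRIVER=datastore|OSMDRIVER=postgres) return 0 ;;
    OQMDRIVER=pubsub|OQMDRIVER=rabbitmq) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: run on Postgres and RabbitMQ instead of the Google Cloud services.
export OSMDRIVER=postgres
export OQMDRIVER=rabbitmq

is_supported_driver OSMDRIVER "$OSMDRIVER" || echo "unsupported OSMDRIVER: $OSMDRIVER" >&2
is_supported_driver OQMDRIVER "$OQMDRIVER" || echo "unsupported OQMDRIVER: $OQMDRIVER" >&2
```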
For Google Cloud only
name | value | description | sensitive? | source |
---|---|---|---|---|
GOOGLE_CLOUD_PROJECT | ex opendes | Google Cloud Project Id | no | output of infrastructure deployment |
GCLOUD_REGION | ex us-central1 | cloud region | no | - |
SERVICE_IDENTITY | ex osdu-gcp-sa | Service account identity "osdu-gcp-sa@iam.gserviceaccount.com" | yes | https://console.cloud.google.com/apis/credentials |
INTEGRATION_TEST_AUDIENCES | ex *****.apps.googleusercontent.com | Client ID for getting access to cloud resources | yes | https://console.cloud.google.com/apis/credentials |
GOOGLE_AUDIENCES | ex *****.apps.googleusercontent.com | Client ID for getting access to cloud resources | yes | https://console.cloud.google.com/apis/credentials |
System Environment required to run service
name | value | description | sensitive? | source |
---|---|---|---|---|
SPRING_PROFILES_ACTIVE | local | spring active profile | no | - |
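Before starting the service it can help to confirm that the variables from the tables above are actually exported. This is a minimal sketch: missing_vars is a hypothetical helper, and the list passed to it assumes that every variable named in this section is required in your setup, so trim it as needed.

```shell
# Hypothetical helper: prints the name of every listed variable
# that is unset or empty in the environment.
missing_vars() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
    fi
  done
  return 0
}

# Variable names are taken from the tables in this section.
missing_vars LOG_PREFIX SERVER_SERVLET_CONTEXPATH ENTITLEMENTS_API STORAGE_API \
  RECORDS_CHANGE_PUBSUB_ENDPOINT SUBSCRIBER_SECRET SUBSCRIBER_PRIVATE_KEY_ID \
  ENVIRONMENT PARTITION_API OSMDRIVER OQMDRIVER SPRING_PROFILES_ACTIVE
```

An empty output means all listed variables are set.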
Configuring mappers' Datasources
When using non-Google-Cloud-native technologies, property sets must be defined on the Partition service as part of PartitionInfo for each Tenant.
They are specific to each storage technology:
for OSM - Postgres:
Database structure: OSM works with data logically organized as "partition" -> "namespace" -> "kind" -> "record" -> "columns". This sequence uses the naming of Google Datastore, where "partition" maps to "GCP project".
For example, this is how the Datastore OSM driver stores records for the "SUBSCRIPTION" data register:
hierarchy level | value |
---|---|
partition (opendes) | osdu-cicd-epam |
namespace | DE |
kind | SUBSCRIPTION |
record | <multiple kind records> |
columns | acl; bucket; kind; legal; etc... |
And this is how the Postgres OSM driver does it. Note that the above hierarchy is kept, but Postgres uses alternative entities for it.
Datastore hierarchy level | Postgres alternative used | |
---|---|---|
partition (GCP project) | == | Postgres server URL |
namespace | == | Schema |
kind | == | Table |
record | == | '' |
columns | == | id, data (jsonb) |
As the table above shows, Postgres takes a different approach to storing business data in records. Unlike Datastore, which segments data into multiple physical columns, Postgres puts it into a single JSONB "data" column. This makes it easy to provision new data registers without regard to the structure of any particular register.
In the current OSM version (as of December 2021), the Postgres OSM driver cannot create new tables at runtime. It is therefore the responsibility of DevOps / CI/CD to provision all required SQL tables (for all required data kinds) whenever a new environment or tenant is provisioned with Postgres. Detailed instructions (with examples) for creating new tables are in the OSM module Postgres driver README.md: org/opengroup/osdu/core/gcp/osm/translate/postgresql/README.md
As a quick shortcut, a DevOps engineer or DBA can use this example snippet:
--CREATE SCHEMA "exampleschema";
CREATE TABLE exampleschema."ExampleKind"(
id text COLLATE pg_catalog."default" NOT NULL,
pk bigint NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
data jsonb NOT NULL,
CONSTRAINT ExampleKind_id UNIQUE (id)
);
CREATE INDEX ExampleKind_datagin ON exampleschema."ExampleKind" USING GIN (data);
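Since one table is needed per data kind, the snippet above can be generated instead of copied by hand. The following is a sketch only: osm_table_ddl is a hypothetical helper, the schema and kind names are placeholders, and the DDL it prints simply repeats the example above with the names substituted.

```shell
# Hypothetical helper: prints the DDL for one kind, following the snippet above.
# Arguments: $1 = schema name, $2 = kind (table) name.
osm_table_ddl() {
  schema="$1"
  kind="$2"
  cat <<EOF
CREATE TABLE ${schema}."${kind}"(
    id text COLLATE pg_catalog."default" NOT NULL,
    pk bigint NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    data jsonb NOT NULL,
    CONSTRAINT ${kind}_id UNIQUE (id)
);
CREATE INDEX ${kind}_datagin ON ${schema}."${kind}" USING GIN (data);
EOF
}

# Example: emit DDL for a list of kinds (the names are placeholders).
for kind in ExampleKind AnotherKind; do
  osm_table_ddl exampleschema "$kind"
done
```

The output is plain SQL, so it can be reviewed and then applied with whatever client the team already uses.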
Prefix: osm.postgres
It can be overridden through:
- the Spring Boot property osm.postgres.partitionPropertiesPrefix
- the environment variable OSM_POSTGRES_PARTITIONPROPERTIESPREFIX
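The property name and the environment variable name above are related by Spring Boot's relaxed binding convention for environment variables: dots become underscores, dashes are removed, and the result is upper-cased. The helper below is a hypothetical illustration of that convention, not something shipped with the service.

```shell
# Hypothetical helper: converts a Spring Boot property name into the
# environment variable name that Spring Boot binds to it.
to_env_name() {
  printf '%s' "$1" | tr '.' '_' | tr -d '-' | tr '[:lower:]' '[:upper:]'
}

to_env_name osm.postgres.partitionPropertiesPrefix
echo
```

The same conversion applies to the other property and environment variable pairs mentioned in this README.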
Propertyset:
Property | Description |
---|---|
osm.postgres.datasource.url | server URL |
osm.postgres.datasource.username | username |
osm.postgres.datasource.password | password |
Example of a definition for a single tenant
curl -L -X PATCH 'https://dev.osdu.club/api/partition/v1/partitions/opendes' -H 'data-partition-id: opendes' -H 'Authorization: Bearer ...' -H 'Content-Type: application/json' --data-raw '{
"properties": {
"osm.postgres.datasource.url": {
"sensitive": false,
"value": "jdbc:postgresql://35.239.205.90:5432/postgres"
},
"osm.postgres.datasource.username": {
"sensitive": false,
"value": "osm_poc"
},
"osm.postgres.datasource.password": {
"sensitive": true,
"value": "osm_poc"
}
}
}'
for OQM - RabbitMQ:
Prefix: oqm.rabbitmq
It can be overridden through:
- the Spring Boot property oqm.rabbitmq.partitionPropertiesPrefix
- the environment variable OQM_RABBITMQ_PARTITIONPROPERTIESPREFIX
Propertyset (for two types of connection: messaging and admin operations):
Property | Description |
---|---|
oqm.rabbitmq.amqp.host | messaging host name or IP |
oqm.rabbitmq.amqp.port | - port |
oqm.rabbitmq.amqp.path | - path |
oqm.rabbitmq.amqp.username | - username |
oqm.rabbitmq.amqp.password | - password |
oqm.rabbitmq.admin.schema | admin host schema |
oqm.rabbitmq.admin.host | - host name |
oqm.rabbitmq.admin.port | - port |
oqm.rabbitmq.admin.path | - path |
oqm.rabbitmq.admin.username | - username |
oqm.rabbitmq.admin.password | - password |
Example of a single tenant definition
curl -L -X PATCH 'https://dev.osdu.club/api/partition/v1/partitions/opendes' -H 'data-partition-id: opendes' -H 'Authorization: Bearer ...' -H 'Content-Type: application/json' --data-raw '{
"properties": {
"oqm.rabbitmq.amqp.host": {
"sensitive": false,
"value": "localhost"
},
"oqm.rabbitmq.amqp.port": {
"sensitive": false,
"value": "5672"
},
"oqm.rabbitmq.amqp.path": {
"sensitive": false,
"value": ""
},
"oqm.rabbitmq.amqp.username": {
"sensitive": false,
"value": "guest"
},
"oqm.rabbitmq.amqp.password": {
"sensitive": true,
"value": "guest"
},
"oqm.rabbitmq.admin.schema": {
"sensitive": false,
"value": "http"
},
"oqm.rabbitmq.admin.host": {
"sensitive": false,
"value": "localhost"
},
"oqm.rabbitmq.admin.port": {
"sensitive": false,
"value": "9002"
},
"oqm.rabbitmq.admin.path": {
"sensitive": false,
"value": "/api"
},
"oqm.rabbitmq.admin.username": {
"sensitive": false,
"value": "guest"
},
"oqm.rabbitmq.admin.password": {
"sensitive": true,
"value": "guest"
}
}
}'
Interaction with message brokers
Specifics of work through PULL subscription
To receive messages from brokers, this solution uses the PULL-subscriber mechanism in the Notification service. This is its cardinal difference from other implementations, which use PUSH-subscribers (webhooks), and it widens the choice of brokers that can be used.
With PULL-subscribers, the Notification service subscribers must be restored for each Subscription when the Notification service starts, and also created at runtime whenever the Register service registers a new Subscription.
To do this, a special "command" topic is involved:
- the default topic name is register-subscriber-control.
If necessary, the name of the topic can be overridden through:
- the Spring Boot property oqm.registerSubscriberControlTopicName
- the environment variable OQM_REGISTERSUBSCRIBERCONTROLTOPICNAME
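For scripts that need to know which "command" topic is in effect (for example, when provisioning the broker), the override rule can be mirrored in shell. This is a sketch under the assumption that only the environment variable is used for the override; control_topic_name is a hypothetical helper. Because the Register and Notification services both work with this topic, an override presumably has to be the same for both.

```shell
# Hypothetical helper: the environment variable wins, otherwise the documented default.
control_topic_name() {
  printf '%s' "${OQM_REGISTERSUBSCRIBERCONTROLTOPICNAME:-register-subscriber-control}"
}

echo "control topic: $(control_topic_name)"
```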
If the topic does not exist, it is created when either the Register or the Notification service starts.
See the provider/notification-gcp/README.md file in the Notification service repository for information on how the Notification service restores its Subscribers for all already registered Subscriptions at startup, and how it listens to the "command" topic and adds Subscribers when the Register service registers new Subscriptions.
Run and test the service
Running Locally
Check that maven is installed:
$ mvn --version
Apache Maven 3.6.0
Maven home: /usr/share/maven
Java version: 1.8.0_212, vendor: AdoptOpenJDK, runtime: /usr/lib/jvm/jdk8u212-b04/jre
...
You will need to configure access to the remote maven repository that holds the OSDU dependencies. This file should live at ~/.m2/settings.xml:
$ cat ~/.m2/settings.xml
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
<servers>
<server>
<id>os-core</id>
<username>slb-des-ext-collaboration</username>
<!-- Treat this auth token like a password. Do not share it with anyone, including Microsoft support. -->
<password>${VSTS_FEED_TOKEN}</password>
</server>
</servers>
</settings>
- Update the Google cloud SDK to the latest version:
gcloud components update
- Set Google Project Id:
gcloud config set project <YOUR-PROJECT-ID>
- Perform a basic authentication in the selected project:
gcloud auth application-default login
- Navigate to register service's root folder and run:
mvn jetty:run
- To build the project and run its tests, navigate to register service's root folder and run:
mvn clean install
- If you wish to see the coverage report, go to testing/target/site/jacoco-aggregate and open index.html
- If you wish to build the project without running tests:
mvn clean install -DskipTests
After configuring your environment as specified above, you can follow these steps to build and run the application. These steps should be invoked from the repository root.
cd provider/register-gcp/ && mvn spring-boot:run -Dspring-boot.run.profiles=local
Testing
Navigate to register service's root folder and run all the tests:
# build + test + install core service code
$ (cd register-core/ && mvn clean install)
Test the application
After the service has started it should be accessible via a web browser by visiting http://localhost:8080/api/register/v1/swagger-ui.html. If the request does not fail, you can then run the integration tests.
Running E2E Tests
This section describes how to run cloud OSDU E2E tests (testing/register-test-gcp).
You will need to have the following environment variables defined.
name | value | description | sensitive? | source |
---|---|---|---|---|
DE_OPS_TESTER | ******** | A base64 encoded google service account json credentials with ops level authorization for OSDU services | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
DE_ADMIN_TESTER | ******** | A base64 encoded google service account json credentials with admin level authorization for OSDU services | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
DE_EDITOR_TESTER | ******** | A base64 encoded google service account json credentials with editor level authorization for OSDU services | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
DE_NO_ACCESS_TESTER | ******** | A base64 encoded google service account json credentials with no authorization for OSDU services | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
REGISTER_BASE_URL | ex https://os-register-dot-opendes.appspot.com/ | Register API endpoint | no | output of infrastructure deployment |
ENVIRONMENT | ex local OR dev | 'local' for local testing or 'dev' for dev testing | no | - |
SUBSCRIBER_SECRET | ex ******** | String in hex; must match pattern ^[a-zA-Z0-9]{8,30}+$ and be the same as the Register variable SUBSCRIBER_SECRET | yes | - |
INTEGRATION_TEST_AUDIENCE | ex *****.apps.googleusercontent.com | Client ID for getting access to cloud resources | yes | https://console.cloud.google.com/apis/credentials |
CLIENT_TENANT | ex nonexistenttenant | Client tenant; it is supposed to be a tenant that we do not have access to, and it may be a non-existent tenant | no | - |
OSDU_TENANT | ex osdu | Osdu tenant | no | - |
SUBSCRIPTION_ID | ******** | A base64 encoded string of subscribed topic + subscriber url, ex records-changedhttp://localhost:8081/api/register/v1/test/challenge/1 | no | - |
REGISTER_CUSTOM_PUSH_URL | ex https://os-register-dot-opendes.appspot.com/api/register/v1/test/challenge/1 | Register push url, that will act as subscriber | no | - |
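The SUBSCRIPTION_ID value can be derived from the topic and the subscriber URL as described in the table. The sketch below assumes the two strings are concatenated with no separator, as in the example in the table; make_subscription_id is a hypothetical helper.

```shell
# Hypothetical helper: base64 of "<topic><subscriber url>", with line wraps removed.
# Arguments: $1 = topic name, $2 = subscriber URL.
make_subscription_id() {
  printf '%s%s' "$1" "$2" | base64 | tr -d '\n'
}

SUBSCRIPTION_ID="$(make_subscription_id records-changed http://localhost:8081/api/register/v1/test/challenge/1)"
export SUBSCRIPTION_ID
echo "$SUBSCRIPTION_ID"
```

The tr step matters because some base64 implementations wrap long output across several lines.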
Entitlements configuration for integration accounts
account | entitlement groups |
---|---|
DE_OPS_TESTER | service.entitlements.user; users.datalake.ops; data.test1; data.integration.test; users@{tenant1}@{domain}.com |
DE_ADMIN_TESTER | service.entitlements.user; users.datalake.admins; data.test1; data.integration.test; users@{tenant1}@{domain}.com |
DE_EDITOR_TESTER | service.entitlements.user; users.datalake.editors; data.test1; data.integration.test; users@{tenant1}@{domain}.com |
DE_NO_ACCESS_TESTER | service.entitlements.user; data.test1; data.integration.test; users@{tenant1}@{domain}.com |
Execute the following commands to build the code and run all the integration tests:
# Note: this assumes that the environment variables for integration tests as outlined
# above are already exported in your environment.
$ (cd testing/register-test-core/ && mvn clean test)
$ (cd testing/register-test-gcp/ && mvn clean test)
Deployment
- Data-Lake Register Google Cloud Endpoints on App Engine Flex environment
  - Edit the app.yaml
    - Open the app.yaml file in an editor, and replace the YOUR-PROJECT-ID PROJECT line with your Google Cloud Platform project Id. Also update INTEGRATION_TEST_AUDIENCE, SUBSCRIBER_PRIVATE_KEY_ID, SPRING_PROFILES_ACTIVE, ENTITLEMENTS_API, STORAGE_API, RECORDS_CHANGE_PUBSUB_ENDPOINT and GOOGLE_CLOUD_PROJECT based on your deployment
  - Google Documentation: https://cloud.google.com/cloud-build/docs/deploying-builds/deploy-appengine
Cloud KMS Setup
Enable Cloud KMS on the master project.
Create a key ring and a key in the master project:
gcloud services enable cloudkms.googleapis.com
export KEYRING_NAME="csqp"
export CRYPTOKEY_NAME="registerService"
gcloud kms keyrings create $KEYRING_NAME --location global
gcloud kms keys create $CRYPTOKEY_NAME --location global \
--keyring $KEYRING_NAME \
--purpose encryption
Add the Cloud KMS CryptoKey Encrypter/Decrypter role to the App Engine default service account of the master project through the IAM - Role tab.
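The role can also be granted from the command line instead of the console. The sketch below only prints the gcloud command so that it can be reviewed first. It assumes a standard (not domain-scoped) project, where the App Engine default service account is named PROJECT_ID@appspot.gserviceaccount.com, and it reuses the key ring and key names from the commands above. The helper name and the fallback values are placeholders.

```shell
# Hypothetical helper: IAM member string for the App Engine default
# service account of a standard project.
appengine_default_sa_member() {
  printf 'serviceAccount:%s@appspot.gserviceaccount.com' "$1"
}

PROJECT_ID="${GOOGLE_CLOUD_PROJECT:-YOUR-PROJECT-ID}"
KEYRING_NAME="${KEYRING_NAME:-csqp}"
CRYPTOKEY_NAME="${CRYPTOKEY_NAME:-registerService}"

# Printed rather than executed; remove the leading "echo" to run it.
echo gcloud kms keys add-iam-policy-binding "$CRYPTOKEY_NAME" \
  --location global \
  --keyring "$KEYRING_NAME" \
  --member "$(appengine_default_sa_member "$PROJECT_ID")" \
  --role roles/cloudkms.cryptoKeyEncrypterDecrypter
```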
License
Copyright © Google LLC
Copyright © EPAM Systems
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.