Migrate WKS to Anthos (new approach) (GONRG-3851)
Type of change
- [ ] Bug Fix
- [x] Feature
This MR adds the ability to use the WKS service with Anthos. The feature was implemented using the OSM, OBM, and OQM libraries.
Does this introduce a change in the core logic?
- [YES]
Does this introduce a change in the cloud provider implementation, if so which cloud?
- [ ] AWS
- [ ] Azure
- [x] GCP
- [ ] IBM
Does this introduce a breaking change?
- [YES]
What is the current behavior?
The WKS service works with KV stores, blob storage, and messaging directly.
What is the new/expected behavior?
The WKS service will use the EPAM OSM, OBM, and OQM mappers for data-management flexibility.
Any other useful information
Features of implementation
This is a universal solution built with the EPAM OSM, OBM, and OQM mapper technology. It allows the service to work with various implementations of data stores and message brokers.
Limitations of the current version
In the current version, the mappers ship with several drivers for the stores and message brokers:
- OSM (mapper for KV-data): Google Datastore; Postgres
- OBM (mapper to Blob stores): Google Cloud Storage (GCS); MinIO
- OQM (mapper to message brokers): Google PubSub; RabbitMQ
Extensibility
To use any other store or message broker, implement a driver for it. With an extensible set of drivers, the solution stays universal and can be ported without modifying the main code; a hypothetical driver sketch follows.
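To illustrate the driver pattern, here is a minimal sketch of what a KV-store driver contract and an additional implementation might look like. The interface and method names below are hypothetical stand-ins, not the actual OSM SPI:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical driver contract: these names are illustrative only and do not
// reproduce the real OSM SPI.
interface KvDriver {
  // Store a record under the given key in the given kind/table.
  void put(String kind, String key, Map<String, Object> record);

  // Fetch a record by key, if present.
  Optional<Map<String, Object>> get(String kind, String key);
}

// Adding support for a new store means providing one more implementation,
// without touching the main service code. Here: a trivial in-memory store.
class InMemoryKvDriver implements KvDriver {
  private final Map<String, Map<String, Object>> store = new ConcurrentHashMap<>();

  @Override
  public void put(String kind, String key, Map<String, Object> record) {
    store.put(kind + "/" + key, record);
  }

  @Override
  public Optional<Map<String, Object>> get(String kind, String key) {
    return Optional.ofNullable(store.get(kind + "/" + key));
  }
}
```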
Mapper tuning mechanisms
This service uses specific implementations of DestinationResolvers based on the tenant information provided by the OSDU Partition service. A total of 6 resolvers are implemented, which are divided into two groups:
for universal technologies:
- for Postgres: destination/resolver/PostgresTenantDestinationResolver.java
- for MinIO: destination/resolver/MinioDestinationResolver.java
- for RabbitMQ: destination/resolver/RabbitMqOqmDestinationResolver.java
Their algorithms are as follows:
- the incoming Destination carries the data-partition-id
- the resolver calls the Partition service and gets PartitionInfo
- from PartitionInfo, the resolver retrieves the connection properties: URL, username, password, etc.
- the resolver creates a datasource, connects to the resource, and caches the datasource
- the resolver hands the datasource to the mapper in a Resolution object (see the sketch below)
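A minimal sketch of this flow for the Postgres case. The PartitionClient interface and the "osm.postgres.*" property keys are assumptions, and Apache Commons DBCP2 is used here only as an example pooling library; the real resolver's types and property names may differ:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.sql.DataSource;
import org.apache.commons.dbcp2.BasicDataSource;

// Sketch of the tenant-resolution flow described above; not the real OSM API.
class PostgresTenantResolverSketch {

  // Hypothetical client for the OSDU Partition service.
  interface PartitionClient {
    Map<String, String> getProperties(String partitionId);
  }

  private final PartitionClient partitionClient;

  // Datasources are cached so each tenant's connection pool is created once.
  private final Map<String, DataSource> cache = new ConcurrentHashMap<>();

  PostgresTenantResolverSketch(PartitionClient partitionClient) {
    this.partitionClient = partitionClient;
  }

  // The incoming destination carries the data-partition-id; the real resolver
  // would wrap the returned datasource in a Resolution object for the mapper.
  DataSource resolve(String dataPartitionId) {
    return cache.computeIfAbsent(dataPartitionId, id -> {
      // 1. Ask the Partition service for the tenant's connection properties.
      Map<String, String> props = partitionClient.getProperties(id);
      // 2. Build a pooled datasource from URL, username, and password.
      BasicDataSource ds = new BasicDataSource();
      ds.setUrl(props.get("osm.postgres.url"));           // assumed property keys
      ds.setUsername(props.get("osm.postgres.username"));
      ds.setPassword(props.get("osm.postgres.password"));
      return ds;
    });
  }
}
```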
for native Google Cloud technologies:
- for Datastore: destination/resolver/DatastoreTenantDestinationResolver.java
- for GCS: destination/resolver/GcsDestinationResolver.java
- for PubSub: destination/resolver/PubsubOqmDestinationResolver.java
Their algorithms are similar, except that they do not receive special connection properties from the Partition service, because the location of the resources is unambiguously known: they reside in the GCP project. Credentials are not needed either, since data access is performed on behalf of the Google Identity SA under which the service itself runs. Therefore, the resolver takes only the projectId property from PartitionInfo and uses it to connect to the resource in the corresponding GCP project.
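A minimal sketch of this variant for the Datastore case, using the standard google-cloud-datastore client. No URL, username, or password is configured: Application Default Credentials supply the SA identity, and only the target project is taken from PartitionInfo:

```java
import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the native-GCP resolution flow; the real resolver's types differ.
class DatastoreResolverSketch {

  // Clients are cached per project, mirroring the datasource caching above.
  private final Map<String, Datastore> cache = new ConcurrentHashMap<>();

  // projectId is the single property the resolver reads from PartitionInfo.
  Datastore resolve(String projectId) {
    return cache.computeIfAbsent(projectId, id ->
        DatastoreOptions.newBuilder()
            .setProjectId(id)   // target GCP project from PartitionInfo
            .build()            // no explicit credentials: ADC supplies the SA
            .getService());
  }
}
```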