EDS M17 Features and Fixes details
The significant features and fixes of EDS M17 are listed below:
Features:
- The PasswordCredentials OAuth flow type has been introduced, which allows EDS M17 to generate an access token for data providers that use this flow type for authorization. Generating the access token requires the username, password, client ID, client secret, and scopes. The entry "FlowTypeID": "{{data_partition_id}}:reference-data--OAuth2FlowType:PasswordCredentials:" is added to the ConnectedSourceRegistryEntry record. https://gitlab.opengroup.org/osdu/subcommittees/ea/projects/extern-data/home/-/issues/267
- EDS M17 now validates the expiry of the refresh token and automatically generates a new refresh token, updating the secret vault with it. When the refresh token value in the secret vault has expired, eds_ingest cannot generate an access token with it, and the run fails. To handle this situation, eds_ingest checks whether the refresh token has expired and, if so, generates a new refresh token value using the PasswordCredentials authentication grant type. The secret service then replaces the expired value in the vault with the newly generated refresh token value. The entry "FlowTypeID": "{{data_partition_id}}:reference-data--OAuth2FlowType:RefreshTokenKeyName:" is added to the ConnectedSourceRegistryEntry (CSRE). The data provider for this feature is Katalyst. #19 (closed)
- Parent data mapping is now handled in EDS M17. During ingestion into the operator environment, the source identifier (the "id" of the parent data) is kept in the NameAlias of the parent record, which lets the operator trace each record back to its source and group records accordingly. When child data (e.g., well log data) is ingested into the target environment, it is linked to the correct master data (e.g., Wellbore) in that environment, so no name mismatch occurs. This feature helps to identify a unique well between the external source and the target environment using external rules. https://gitlab.opengroup.org/osdu/subcommittees/ea/projects/extern-data/home/-/issues/268
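As an illustration of the PasswordCredentials flow listed above, the following minimal sketch builds the form-encoded body of a standard OAuth 2.0 password-grant token request (RFC 6749, section 4.3) from the five parameters the feature requires. The function name and the sample values are hypothetical, and the sketch does not claim to reproduce how eds_ingest itself assembles or sends the request.

```python
from urllib.parse import urlencode


def build_password_grant_body(username, password, client_id, client_secret, scopes):
    """Return the form-encoded body of an OAuth 2.0 password-grant token request.

    Hypothetical helper for illustration; not taken from the EDS code base.
    """
    fields = {
        "grant_type": "password",  # grant type defined in RFC 6749, section 4.3
        "username": username,
        "password": password,
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": " ".join(scopes),  # OAuth 2.0 scopes are space-delimited
    }
    return urlencode(fields)


# Sample values are placeholders, not real credentials.
body = build_password_grant_body(
    "svc-user", "s3cret", "my-client", "my-client-secret", ["openid", "offline_access"]
)
print(body)
```

The resulting string would be sent as the body of a POST request with content type application/x-www-form-urlencoded to the provider's token endpoint. Some providers expect the client ID and secret in an HTTP Basic header instead of the body.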
Fixes:
Logging has been added for the Osdu_ingest run ID and a sample of the fetched data. The eds_ingest Airflow logs now include the Osdu_ingest run ID and one sample record fetched from the data provider, along with the text "Displaying only one Sample Record." #23 (closed)
The conversion of ConnectedSourceDataPartitionID to OnIngestionDataPartitionID for the array datatype has been fixed. During ingestion, the ConnectedSourceDataPartitionID (the provider’s data partition ID) is replaced with the OnIngestionDataPartitionID (the operator’s data partition ID) in all parameters of the record, whatever their datatype (strings, arrays, dicts). Each conversion is handled according to its datatype. For example, a string parameter is converted from 'ResourceHomeRegionID': 'osdu:reference-data--OSDURegion:AWSEastUSA:' to 'ResourceHomeRegionID': 'opendes:reference-data--OSDURegion:AWSEastUSA:', and the elements of an array are now converted in the same way. https://gitlab.opengroup.org/osdu/subcommittees/ea/projects/extern-data/home/-/issues/261
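The datatype-dependent conversion described above could be sketched as a small recursive function that rewrites the partition prefix in strings and descends into arrays and dicts. The function name and the sample record are assumptions made for illustration; the actual eds_ingest implementation may differ.

```python
def replace_partition(value, source_partition, target_partition):
    """Recursively replace a leading '<source_partition>:' with '<target_partition>:'.

    Hypothetical helper for illustration; not taken from the EDS code base.
    """
    prefix = source_partition + ":"
    if isinstance(value, str):
        # Strings: swap the partition prefix only when the value starts with it.
        if value.startswith(prefix):
            return target_partition + ":" + value[len(prefix):]
        return value
    if isinstance(value, list):
        # Arrays: convert every element.
        return [replace_partition(v, source_partition, target_partition) for v in value]
    if isinstance(value, dict):
        # Dicts: convert every value and keep the keys unchanged.
        return {
            k: replace_partition(v, source_partition, target_partition)
            for k, v in value.items()
        }
    # Numbers, booleans, and None pass through untouched.
    return value


# Illustrative record; only ResourceHomeRegionID comes from the release note.
record = {
    "ResourceHomeRegionID": "osdu:reference-data--OSDURegion:AWSEastUSA:",
    "ResourceHostRegionIDs": ["osdu:reference-data--OSDURegion:AWSEastUSA:"],
    "Depth": 1250.5,
}
converted = replace_partition(record, "osdu", "opendes")
print(converted)
```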
The schema authority in the Kind of CSRE, CSDJ, and ExternalReferenceValueMapping is now set dynamically from an Airflow Variable. The constants file holds the Kind of a few EDS-dependent schemas, such as ConnectedSourceRegistryEntry, ConnectedSourceDataJob, and ExternalReferenceValueMapping. The Schema_Authority value in each Kind used to be static; it is now replaced with the Schema_authority value fetched from the Airflow Variable. #22 (closed)
EDS now raises an exception when an Airflow Variable is not found or is None. eds_ingest fails with a KeyError if any of the required Airflow Variable values are missing. #21 (closed)
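The last two fixes can be pictured together with a short sketch: the Kind strings are built from a schema authority read at run time, and a missing or None value raises a KeyError. To keep the example runnable without Airflow, a plain dict stands in for the Airflow Variable store. The variable key "schema_authority", the helper names, the group types, and the schema versions in the templates are all assumptions, not values taken from the EDS constants file.

```python
# Kind templates with a placeholder for the schema authority.
# Group types and versions are assumed for illustration only.
KIND_TEMPLATES = {
    "ConnectedSourceRegistryEntry": "{authority}:wks:master-data--ConnectedSourceRegistryEntry:1.0.0",
    "ConnectedSourceDataJob": "{authority}:wks:master-data--ConnectedSourceDataJob:1.0.0",
    "ExternalReferenceValueMapping": "{authority}:wks:reference-data--ExternalReferenceValueMapping:1.0.0",
}


def require_variable(variables, key):
    """Return the variable value, raising KeyError if it is absent or None."""
    value = variables.get(key)
    if value is None:
        raise KeyError(f"Required Airflow Variable '{key}' is missing or None")
    return value


def build_kinds(variables):
    """Fill the schema authority into every Kind template."""
    authority = require_variable(variables, "schema_authority")
    return {
        name: template.format(authority=authority)
        for name, template in KIND_TEMPLATES.items()
    }


# A dict stands in for the Airflow Variable store in this sketch.
kinds = build_kinds({"schema_authority": "osdu"})
print(kinds["ConnectedSourceRegistryEntry"])

try:
    build_kinds({"schema_authority": None})
except KeyError as err:
    print("Run would fail:", err)
```

In a real DAG the lookup would go through Airflow's Variable API rather than a dict. Failing fast on a missing value keeps a misconfigured environment from producing records with a wrong Kind.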