
GONRG-2913: Added support for whitelist reference patterns

Type of change

  • Bug Fix
  • Feature

Does this introduce a change in the core logic?

  • [Yes]

Does this introduce a change in the cloud provider implementation, if so which cloud?

  • AWS
  • Azure
  • GCP
  • IBM

Updates description?

Closes https://community.opengroup.org/osdu/platform/data-flow/ingestion/external-data-sources/external-data-framework/-/issues/180

Introduces the ability to whitelist references using custom regexp patterns, so that those references are excluded from referential integrity validation.

For example, suppose we found the following list of references, all of which would be validated in the default scenario:

[
    "osdu:reference-data--ResourceSecurityClassification:RESTRICTED",
    "osdu:master-data--Wellbore:1013",
    "osdu:reference-data--UnitOfMeasure:M",
    "osdu:reference-data--UnitOfMeasure:GAPI",
    "osdu:reference-data--UnitOfMeasure:US/F",
    "osdu:reference-data--UnitOfMeasure:G/C3",
    "osdu:reference-data--UnitOfMeasure:V/V"
]

Later, we may realize that some of these references should not be validated. With the new whitelist feature, we can write custom regexp patterns that exclude the references we choose. Let's say we are not interested in validating these:

[
    "osdu:reference-data--UnitOfMeasure:GAPI",
    "osdu:reference-data--UnitOfMeasure:V/V"
]

First, we need to define our custom patterns somewhere. Suppose they look like this (the named groups are required, because the inner logic relies on them in the current implementation):

\"(?P<key>CurveUnit)\":\s?\"(?P<value>[\w\d:-]*:GAPI:)\"
\"(?P<key>CurveUnit)\":\s?\"(?P<value>[\w\d:-]*:V\/V:)\"

The patterns can be parsed either from a file or from an Airflow variable, for example (see osdu/platform/data-flow/ingestion/ingestion-dags!64 (merged)).
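To illustrate what the named groups capture, here is a minimal sketch that applies the first pattern to a manifest fragment. The fragment itself is hypothetical, and it assumes the manifest stores the reference with a trailing colon, as the pattern's `value` group expects. It does not reproduce the ManifestIntegrity internals.

```python
import re

# Pattern copied from the description above; the named groups
# "key" and "value" are the ones the feature relies on.
PATTERN = r'\"(?P<key>CurveUnit)\":\s?\"(?P<value>[\w\d:-]*:GAPI:)\"'

# Hypothetical manifest fragment serialized as a JSON string (an assumption
# made for illustration only).
manifest_fragment = (
    '{"CurveUnit": "osdu:reference-data--UnitOfMeasure:GAPI:", "Mnemonic": "GR"}'
)

match = re.search(PATTERN, manifest_fragment)
key = match.group("key")      # the field name that holds the reference
value = match.group("value")  # the reference itself, trailing colon included

print(key)
print(value)
```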

After that, we can pass the custom patterns as a string to ManifestIntegrity on initialization. If they are valid, we receive a new list of references for validation:

[
    "osdu:reference-data--ResourceSecurityClassification:RESTRICTED",
    "osdu:master-data--Wellbore:1013",
    "osdu:reference-data--UnitOfMeasure:M",
    "osdu:reference-data--UnitOfMeasure:US/F",
    "osdu:reference-data--UnitOfMeasure:G/C3"
]

As we can see, the two references are whitelisted, so they will be skipped at the referential integrity validation stage.
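The whole flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ManifestIntegrity code: `compile_whitelist` and `filter_references` are hypothetical helpers, the newline-separated pattern string and the manifest text are made up for the example, and the trailing colon is stripped on the assumption that the reference list holds IDs without it.

```python
import re


def compile_whitelist(patterns_text):
    """Hypothetical helper: compile newline-separated patterns and keep
    only the valid ones that define both 'key' and 'value' groups."""
    compiled = []
    for line in patterns_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            pattern = re.compile(line)
        except re.error:
            continue  # skip patterns that are not valid regexps
        if {"key", "value"} <= set(pattern.groupindex):
            compiled.append(pattern)
    return compiled


def filter_references(manifest_text, references, whitelist):
    """Hypothetical helper: drop references matched by a whitelist pattern."""
    whitelisted = set()
    for pattern in whitelist:
        for match in pattern.finditer(manifest_text):
            # Assumption: IDs in the reference list carry no trailing colon.
            whitelisted.add(match.group("value").rstrip(":"))
    return [ref for ref in references if ref not in whitelisted]


PATTERNS_TEXT = r'''
\"(?P<key>CurveUnit)\":\s?\"(?P<value>[\w\d:-]*:GAPI:)\"
\"(?P<key>CurveUnit)\":\s?\"(?P<value>[\w\d:-]*:V\/V:)\"
'''

# Hypothetical manifest text, for illustration only.
MANIFEST_TEXT = (
    '{"Curves": ['
    '{"CurveUnit": "osdu:reference-data--UnitOfMeasure:GAPI:"}, '
    '{"CurveUnit": "osdu:reference-data--UnitOfMeasure:V/V:"}, '
    '{"CurveUnit": "osdu:reference-data--UnitOfMeasure:M:"}]}'
)

REFERENCES = [
    "osdu:reference-data--ResourceSecurityClassification:RESTRICTED",
    "osdu:master-data--Wellbore:1013",
    "osdu:reference-data--UnitOfMeasure:M",
    "osdu:reference-data--UnitOfMeasure:GAPI",
    "osdu:reference-data--UnitOfMeasure:US/F",
    "osdu:reference-data--UnitOfMeasure:G/C3",
    "osdu:reference-data--UnitOfMeasure:V/V",
]

whitelist = compile_whitelist(PATTERNS_TEXT)
remaining = filter_references(MANIFEST_TEXT, REFERENCES, whitelist)
print(remaining)
```

Skipping invalid patterns silently is a choice made for this sketch; the real implementation may log or reject them instead.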


Edited by Siarhei Khaletski (EPAM)
