Validation of Master data from Work-product-component or from another Master data
Examples of intended checks:
- Check that the Master data referred to by a WPC entry actually exists. Ex: the Wellbore in a WellboreTrajectory WPC.
- Check that the Master data referred to by another Master data entry actually exists. Ex: the Well name/ID for a Wellbore.
- Check that the WPC referred to by another WPC entry actually exists. Ex: a Seismic Horizon referring to SeismicTraceData.
- Check that a reference to Master data is not skipped. Ex: do not allow a WellboreTrajectory to be entered without a reference to an existing Wellbore, i.e. do not create an “orphan”. (I have heard arguments against this approach saying that we must allow ‘wildcat’ data, but that opens the door to problems and bad data. Perhaps there could be an option for two types of implementation?)
- Check that at least one of the possible Master data links is referenced. Ex: do not allow SeismicTraceData to be entered without a reference to one or more of the possible Master data links, such as Acquisition, Processing, or Interpretation. This kind of check may well call for ‘hand crafting’.
Including the response from @alan.henson to the points above.
Response to the first three points:
We check that all cited data exists based on the given kind’s schema definition. Two implementations of this DAG operator are coming: one that rejects data citing data that does not exist, and one that permits invalid references. This approach provides optionality and avoids prescribing data management practices. OSDU might serve as an initial landing zone for all data, as a first step before a process promotes that data to a clean and pristine second OSDU environment. We must support use cases in which OSDU serves in a capacity other than the golden persistence store.
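To make the difference between the two operator behaviours concrete, here is a minimal Python sketch. It is not the actual DAG operator: the function names, the `strict` flag, the in-memory `existing` set standing in for a lookup against the platform, and the record IDs are all hypothetical and used only for illustration.

```python
from typing import Iterable, List, Set


class MissingReferenceError(Exception):
    """Raised in strict mode when a record cites data that does not exist."""


def find_missing_references(cited_ids: Iterable[str], existing_ids: Set[str]) -> List[str]:
    """Return the cited IDs that are not present in the store, in input order."""
    return [ref for ref in cited_ids if ref not in existing_ids]


def check_references(cited_ids: Iterable[str], existing_ids: Set[str], strict: bool = True) -> List[str]:
    """Strict mode rejects dangling references; permissive mode only reports them."""
    missing = find_missing_references(cited_ids, existing_ids)
    if missing and strict:
        raise MissingReferenceError(f"cited data not found: {missing}")
    return missing


# Hypothetical IDs, for illustration only.
existing = {"master-data--Wellbore:wb-001"}
cited = ["master-data--Wellbore:wb-001", "master-data--Wellbore:wb-999"]

# Permissive behaviour: ingestion continues, dangling references are reported.
print(check_references(cited, existing, strict=False))

# Strict behaviour: the same input is rejected.
try:
    check_references(cited, existing, strict=True)
except MissingReferenceError as err:
    print(f"rejected: {err}")
```

Both modes share the same lookup, so a landing-zone environment and a clean environment could in principle differ only in whether the missing list is treated as an error or as a warning.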
Response to the last two points:
This is only possible if the schema definition has indicated an attribute is mandatory. As a generic manifest ingestion process, we do not know what a Wellbore or WellboreTrajectory is. However, we do know what a mandatory attribute is according to the schema definition. If the schema definition has an attribute marked as mandatory, then the validation will require a value. The validation also requires the value to adhere to the format specified by the schema definition. Going beyond this check gets into the capabilities of DDMSs.
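As a rough sketch of what such a schema-driven check could look like, the Python below validates a record using nothing but the schema. It assumes a JSON-Schema-style definition with `required` and `pattern` keywords; the `validate_record` function, the `trajectory_schema` fragment, and the `WellboreID` pattern are hypothetical and not taken from the real OSDU schemas or ingestion code.

```python
import re
from typing import Any, Dict, List


def validate_record(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    properties = schema.get("properties", {})

    # Mandatory check: every attribute listed as required must carry a value.
    for name in schema.get("required", []):
        if record.get(name) in (None, ""):
            errors.append(f"missing mandatory attribute: {name}")

    # Format check: non-empty string values must match the declared pattern.
    for name, rules in properties.items():
        value = record.get(name)
        pattern = rules.get("pattern")
        if isinstance(value, str) and value and pattern and not re.search(pattern, value):
            errors.append(f"bad format for attribute: {name}")

    return errors


# Hypothetical schema fragment: the validator never needs to know what a
# Wellbore is, only that WellboreID is required and has a declared format.
trajectory_schema = {
    "required": ["WellboreID"],
    "properties": {
        "WellboreID": {"type": "string", "pattern": r"^master-data--Wellbore:.+$"},
    },
}

print(validate_record({}, trajectory_schema))
print(validate_record({"WellboreID": "wb-001"}, trajectory_schema))
print(validate_record({"WellboreID": "master-data--Wellbore:wb-001"}, trajectory_schema))
```

Under this approach, the “no orphan WellboreTrajectory” rule is enforced only if the schema lists the Wellbore reference as required. An “at least one of several links” rule would need either a schema construct that expresses it or logic outside the generic validator.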