Considerations for "labeling" orphan data
This issue is a request for input and collaboration with the Data Definitions team.
The manifest ingestion process is designed by default to prohibit the creation of orphaned records, for example a Well Log that references a Wellbore that does not exist. However, this is a business decision, and OSDU adopters need the flexibility to implement OSDU in the way that best fits their use cases. One construct for consideration is a way to identify orphans within the data structure. For example, if the manifest ingestion process determines that a data element is an orphan, extra metadata could be added to the record when it is saved.
The GCP/EPAM team raised this idea while working on manifest ingestion. We would like to engage the Data Definitions team for their input on whether the schema definitions should have a structure for labeling orphans. This request for input is not an R3 requirement.
The following detail was provided by Kateryna from EPAM.
Our suggestion is to add an optional parameter "Orphan" to each Master or WPC schema (or any entity that could be orphaned). This parameter could hold an array of values corresponding to all the orphaned attributes. E.g., for a Well Log, it may look like this:
"Orphan": [
  { "ServiceCompanyId": "osdu:master-data:Organisation:BigOil" },
  { "ServiceCompanyId": "osdu:reference-data:CurveUnit:km" }
]
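
To make the idea concrete, here is a minimal sketch of how an ingestion step might attach such a label before saving a record. It is illustrative only and not the actual manifest ingestion code. The function name label_orphans, the exists lookup, the CurveUnitId field name, the simplified record layout, and the placement of "Orphan" at the top level of the record are all assumptions. It also assumes each entry in the array is a single-key object mapping the attribute name to the unresolved reference.

```python
from typing import Any, Callable


def label_orphans(
    record: dict[str, Any],
    reference_fields: list[str],
    exists: Callable[[str], bool],
) -> dict[str, Any]:
    """Return a copy of the record with an "Orphan" list of unresolved references.

    `exists` is a hypothetical lookup that reports whether a referenced
    record id is already present in the platform.
    """
    data = record.get("data", {})
    orphans = [
        {field: data[field]}
        for field in reference_fields
        if field in data and not exists(data[field])
    ]
    labeled = dict(record)  # shallow copy, the input record is left untouched
    if orphans:
        labeled["Orphan"] = orphans
    return labeled


# Hypothetical set of record ids that already exist.
known_ids = {"osdu:reference-data:CurveUnit:km"}

# Simplified, hypothetical Well Log record; "CurveUnitId" is an assumed field name.
well_log = {
    "kind": "WellLog",
    "data": {
        "ServiceCompanyId": "osdu:master-data:Organisation:BigOil",
        "CurveUnitId": "osdu:reference-data:CurveUnit:km",
    },
}

labeled = label_orphans(
    well_log, ["ServiceCompanyId", "CurveUnitId"], known_ids.__contains__
)
print(labeled["Orphan"])
# prints [{'ServiceCompanyId': 'osdu:master-data:Organisation:BigOil'}]
```

In this sketch a record with no unresolved references is saved without the "Orphan" parameter at all, which keeps the parameter optional as suggested above.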