Normalizer: meta[].unitOfMeasureID should be preferred unit declaration
Reported by Marcus Ridgway:
The UoM meta[] schema supports association of a Unit of Measure with one or more attributes in a JSON record. The core of the UoM schema is the unitOfMeasureID attribute, which associates the attributes listed in propertyNames with the ID of the UoM in the Unit of Measure Reference List, e.g. for a Wellbore record:
{
"kind": "Unit",
"name": "ft",
"persistableReference": "",
"propertyNames": [
"FacilitySpecifications[0].FacilitySpecificationQuantity",
"VerticalMeasurements[0].VerticalMeasurement"
],
"unitOfMeasureID": "osdu:reference-data--UnitOfMeasure:ft:"
}
The persistableReference attribute in meta[] is there to support storage of the full UoM Definition when unitOfMeasureID is not populated. E.g. for metres:
"persistableReference": "{\"abcd\":{\"a\":0.0,\"b\":1.0,\"c\":1.0,\"d\":0.0},\"symbol\":\"m\",\"baseMeasurement\":{\"ancestry\":\"L\",\"type\":\"UM\"},\"type\":\"UAD\"}",
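To illustrate what a consumer has to do with such a definition, here is a minimal Python sketch that parses a persistableReference string and converts a value to the base unit. It assumes the abcd coefficients follow the Energistics convention y = (a + b*x) / (c + d*x); the helper name to_base_unit and the foot definition are hypothetical and only for illustration, they are not taken from the Normalizer code.

```python
import json


def to_base_unit(value, persistable_reference):
    """Convert `value` to the base unit described by a persistableReference string.

    Assumption: the "abcd" coefficients follow the Energistics convention
    y = (a + b*x) / (c + d*x). Hypothetical helper, not the Normalizer's API.
    """
    definition = json.loads(persistable_reference)
    k = definition["abcd"]
    return (k["a"] + k["b"] * value) / (k["c"] + k["d"] * value)


# The metre definition quoted in this issue.
METRE = (
    '{"abcd":{"a":0.0,"b":1.0,"c":1.0,"d":0.0},"symbol":"m",'
    '"baseMeasurement":{"ancestry":"L","type":"UM"},"type":"UAD"}'
)

# Hypothetical foot definition, built by analogy (1 ft = 0.3048 m).
FOOT = (
    '{"abcd":{"a":0.0,"b":0.3048,"c":1.0,"d":0.0},"symbol":"ft",'
    '"baseMeasurement":{"ancestry":"L","type":"UM"},"type":"UAD"}'
)

print(to_base_unit(5.0, METRE))    # metres are already the base unit
print(to_base_unit(100.0, FOOT))   # roughly 30.48 m
```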
Populating persistableReference is no longer required provided the UnitOfMeasure Reference List is fully populated, i.e. IDs exist for all UoMs in use. Regardless, populating persistableReference is extremely onerous for a number of reasons:
- it does not adhere to one version of the truth - a UoM need only be defined in the UoM Reference List; storing the UoM definition in persistableReference in every record is the extreme opposite
- all ETLs would be required to populate the meta[] UoM definitions for all record types - the UoM definition would have to be maintained in every ETL
- all OSDU records are unnecessarily bloated by carrying redundant, duplicate persistableReference metadata within meta[] in each and every record when it is already stored centrally in the Reference List. This increases storage requirements for OSDU records.
Problem: The Normalizer for the Search API for numeric values does not support API > SI Search when JSON records do not have persistableReference populated. The only attribute that should need populating is unitOfMeasureID, but the Normalizer ignores it and instead requires persistableReference to be populated.
Require: Normalizer to be extended to support the unitOfMeasureID populated in meta[]. When unitOfMeasureID is populated, any persistableReference content in the record, including blank content, is ignored; the Normalizer instead retrieves the persistableReference content from the UnitOfMeasure Reference List (the source of truth for UoM definitions).
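The requested precedence could be sketched roughly as follows in Python. This is only an illustration of the rule, not the Normalizer's actual code: the function name resolve_unit_definition, the dict-based reference list keyed by ID, and the error handling for an unknown ID are all assumptions.

```python
def resolve_unit_definition(meta_item, uom_reference_list):
    """Return the UoM definition to use for one meta[] item.

    Sketch of the requested precedence (hypothetical helper):
    1. If unitOfMeasureID is populated, look it up in the reference list
       and ignore whatever persistableReference holds, even if blank.
    2. Otherwise fall back to the record's own persistableReference.

    Assumption: `uom_reference_list` is a plain dict mapping a
    unitOfMeasureID to its stored definition string.
    """
    uom_id = meta_item.get("unitOfMeasureID")
    if uom_id:
        if uom_id not in uom_reference_list:
            # Assumption: an unknown ID is an error rather than a silent fallback.
            raise LookupError(f"Unknown unitOfMeasureID: {uom_id}")
        return uom_reference_list[uom_id]

    legacy = meta_item.get("persistableReference")
    if legacy:
        return legacy
    raise ValueError("meta item has neither unitOfMeasureID nor persistableReference")


# Usage with the Wellbore meta item from this issue; the stored
# definition is a placeholder, not real reference data.
reference_list = {"osdu:reference-data--UnitOfMeasure:ft:": "<ft definition>"}
meta_item = {
    "kind": "Unit",
    "name": "ft",
    "persistableReference": "",
    "propertyNames": [
        "FacilitySpecifications[0].FacilitySpecificationQuantity",
        "VerticalMeasurements[0].VerticalMeasurement",
    ],
    "unitOfMeasureID": "osdu:reference-data--UnitOfMeasure:ft:",
}
print(resolve_unit_definition(meta_item, reference_list))
```

Failing loudly on an unknown ID is one possible design choice; falling back to persistableReference in that case would reintroduce the dependency this issue is trying to remove.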
Comment from @gehrmann - means the normalizer needs to be enhanced. From the schema side of things we have said that if unitOfMeasureID is populated it should supersede the persistableReference, which is the future goal. The AbstractMetaItem schema is historical and requires the persistableReference to be set. It should however be sufficient to set "persistableReference": "" when populating unitOfMeasureID.
Originally reported as schema issue 624