
Index extended properties defined in property configurations

Zhibin Mai requested to merge index_extended_properties into master

This is one of the MRs for ADR: Configurable Index Extensions and De-Normalizations

It depends on the MR "Make sure that the indexer-queue for azure forwards the ancestry_kinds...", which supports the solution that prevents an infinite loop of re-indexing.

We introduce chasing mechanisms to solve the index-order issue, and provide a solution that prevents an infinite loop of re-indexing.

How index chasing works

  1. How to update the indexed documents of children records when a parent record is updated:

As we know, a parent kind can have many children kinds. For example, a Wellbore kind can have the children kinds WellLog, WellboreTrajectory, WellboreMarkerSet, etc. In most cases, not all children kinds need to extend the parent kind, and there is no standard way for children kinds to define the reference to the parent record (e.g. data.WellboreId in a WellLog record).

So we need to have an efficient way to find all the children that need to extend the parent record's properties.

  • When a child record is indexed and it needs to extend one or more properties from its parent records, a new string-array property "data.AssociatedIdentities" is created in the child's document. The parent ids are inserted into this property.
  • When a parent record is indexed, the indexer uses the parent's id at the end of indexing to search for children records. Then one or more messages are constructed and sent out to simulate record change events for all the children records that need to be re-indexed. The query looks similar to this:
{ 
   kind: "*:*:*:*",
   query: "data.AssociatedIdentities:\"<parent record id>\""
}
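
The two bullet points above can be sketched roughly as follows. This is a minimal illustration in Python, not the indexer's actual code: the function names `add_associated_identities` and `build_children_query` are hypothetical, and only the "data.AssociatedIdentities" property and the query shape are taken from the description above.

```python
from __future__ import annotations


def add_associated_identities(child_doc: dict, parent_ids: list[str]) -> dict:
    """Record the ids of the parents a child document extends properties from.

    Hypothetical helper: creates data.AssociatedIdentities if it is missing
    and appends each parent id once.
    """
    data = child_doc.setdefault("data", {})
    identities = data.setdefault("AssociatedIdentities", [])
    for parent_id in parent_ids:
        if parent_id not in identities:
            identities.append(parent_id)
    return child_doc


def build_children_query(parent_id: str) -> dict:
    """Build the search request that finds every child referencing parent_id."""
    return {
        "kind": "*:*:*:*",
        "query": f'data.AssociatedIdentities:"{parent_id}"',
    }
```

Because the search runs across all kinds, it does not matter which property (e.g. data.WellboreId) a given child kind uses to reference its parent.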
  2. How to update the indexed documents of parent records when a child record is updated:

As a child record holds all the references (ids) to its parent records, the query below can be used to find out whether any IndexPropertyPathConfiguration refers to the child kind, and therefore whether parent record(s) should be re-indexed:

{ 
   kind: "osdu:wks:reference-data--IndexPropertyPathConfiguration:*",
   query: "nested(data.Configurations, nested(data.Configurations.Paths, (RelatedObjectsSpec.RelationshipDirection: ParentToChildren AND RelatedObjectsSpec.RelatedObjectKind:\"<child kind>\")))"
}

The above search result is cached. If any IndexPropertyPathConfiguration defines a "ParentToChildren" relationship for the child kind, the indexer finds out which parent record(s) should be re-indexed. Then one or more messages are constructed and sent out to simulate record change events for all the parent records that need to be re-indexed.
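
A rough sketch of the cached configuration lookup described above, assuming a simple in-memory cache keyed by child kind. The class `ParentConfigLookup`, the function `build_parent_config_query` and the injected `search` callable are hypothetical names for illustration; the MR does not state how the cache is implemented or when it expires.

```python
CONFIG_KIND = "osdu:wks:reference-data--IndexPropertyPathConfiguration:*"


def build_parent_config_query(child_kind: str) -> dict:
    """Build the search request for configurations that relate to child_kind."""
    query = (
        "nested(data.Configurations, nested(data.Configurations.Paths, "
        "(RelatedObjectsSpec.RelationshipDirection: ParentToChildren AND "
        f'RelatedObjectsSpec.RelatedObjectKind:"{child_kind}")))'
    )
    return {"kind": CONFIG_KIND, "query": query}


class ParentConfigLookup:
    """Caches, per child kind, whether any ParentToChildren configuration exists."""

    def __init__(self, search):
        # `search` is an assumed callable: it takes a request dict and
        # returns a list of matching configuration records.
        self._search = search
        self._cache = {}

    def has_parent_to_children_config(self, child_kind: str) -> bool:
        if child_kind not in self._cache:
            results = self._search(build_parent_config_query(child_kind))
            self._cache[child_kind] = bool(results)
        return self._cache[child_kind]
```

With such a cache, updating many records of the same child kind costs a single configuration search; a real implementation would also need some invalidation when configurations change.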

How to prevent infinite loop of re-index/index chasing

As mentioned above, one or more messages are sent out to simulate record change events in order to re-index the parent or children records. In the ADR, Use Case 3 and Use Case 4 demonstrate that a circular reference can be defined in the IndexPropertyPathConfiguration.

Here is the normal record change event message when multiple WellLog records are updated:

{
  "data": "[{\"id\":\"opendes:work-product-component--WellLog:23b7dfde2c1349d58a0f97ae78bff9df\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}, {\"id\":\"opendes:work-product-component--WellLog:9b579416f23c4f36af4a00c10657babe\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}]",
  "attributes": {
    "data-partition-id": "{{data-partition-id}}"
  }
}
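
Note that "data" in this message is itself a JSON-encoded string, so a consumer has to decode it separately from the envelope. A minimal sketch, with `parse_record_changes` as a hypothetical helper name:

```python
from __future__ import annotations

import json


def parse_record_changes(message: dict) -> list[dict]:
    """Decode the JSON string in "data" into a list of record change entries.

    Each entry is expected to carry "id", "kind" and "op", as in the
    message shown above.
    """
    return json.loads(message["data"])
```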

To prevent an infinite loop of re-indexing, extra information is piggybacked on the record change event message that is sent out for the given records: the source kind that triggers the re-index of other records becomes part of the message. In the Use Case 3 example, the record change event message is constructed as below:

{
  "data": "[{\"id\":\"opendes:work-product-component--WellLog:23b7dfde2c1349d58a0f97ae78bff9df\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}, {\"id\":\"opendes:work-product-component--WellLog:9b579416f23c4f36af4a00c10657babe\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}]",
  "attributes": {
    "data-partition-id": "{{data-partition-id}}",
    "ancestry_kinds": "osdu:wks:master-data--Wellbore:1.1.1"
  }
}

When the above message is received and the WellLog children records are re-indexed, the information in "ancestry_kinds" can be used to prevent triggering the re-index of the parent Wellbore record.
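
The check could look roughly like the sketch below. The function names are hypothetical, and the sketch assumes that kinds are compared as exact strings including the version; the MR does not say whether the real comparison ignores versions.

```python
from __future__ import annotations


def parse_ancestry_kinds(attributes: dict) -> list[str]:
    """Split the comma-separated "ancestry_kinds" attribute into a list."""
    raw = attributes.get("ancestry_kinds", "")
    return [kind.strip() for kind in raw.split(",") if kind.strip()]


def should_trigger_reindex(target_kind: str, attributes: dict) -> bool:
    """Return False when target_kind already appears in the ancestry.

    Assumption: exact string match on the full kind, version included.
    """
    return target_kind not in parse_ancestry_kinds(attributes)
```

A message without the "ancestry_kinds" attribute, such as a normal record change event, yields an empty ancestry, so chasing proceeds as usual.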

The ancestry_kinds attribute can include multiple kinds separated by commas. For example, if a Well record triggers the index of a Wellbore record and the Wellbore record triggers the index of WellLog records, the message can be:

{
  "data": "[{\"id\":\"opendes:work-product-component--WellLog:23b7dfde2c1349d58a0f97ae78bff9df\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}, {\"id\":\"opendes:work-product-component--WellLog:9b579416f23c4f36af4a00c10657babe\",\"kind\":\"osdu:wks:work-product-component--WellLog:1.2.0\",\"op\":\"update\"}]",
  "attributes": {
    "data-partition-id": "{{data-partition-id}}",
    "ancestry_kinds": "osdu:wks:master-data--Well:1.1.0,osdu:wks:master-data--Wellbore:1.1.1"
  }
}
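
As an illustration of how such a message might be assembled, the sketch below appends the source kind to whatever ancestry arrived with the incoming message. `extend_ancestry` and `build_chasing_message` are hypothetical names, and appending in trigger order is an assumption inferred from the example above.

```python
import json


def extend_ancestry(incoming_attributes: dict, source_kind: str) -> str:
    """Append source_kind to the incoming comma-separated ancestry."""
    raw = incoming_attributes.get("ancestry_kinds", "")
    kinds = [kind.strip() for kind in raw.split(",") if kind.strip()]
    if source_kind not in kinds:
        kinds.append(source_kind)
    return ",".join(kinds)


def build_chasing_message(records, data_partition_id, incoming_attributes, source_kind):
    """Build a simulated record change event that carries the ancestry."""
    return {
        # "data" is a JSON-encoded string, matching the messages above.
        "data": json.dumps(records),
        "attributes": {
            "data-partition-id": data_partition_id,
            "ancestry_kinds": extend_ancestry(incoming_attributes, source_kind),
        },
    }
```

When a Wellbore record that was itself re-indexed because of a Well record triggers its WellLog children, the incoming ancestry holds the Well kind and the Wellbore kind is appended after it.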
Edited by Zhibin Mai
