Array of Objects support by Indexer (GONRG-2028)

Rustam Lotsmanenko (EPAM) requested to merge array-of-objects into master

Description:

Fix for #16

Added an optional "x-osdu-indexing" property to schema items, for example:

                        "properties": {
                            "Markers": {
                                "x-osdu-indexing": {
                                    "type": "nested"
                                },
                                "type": "array",
                                "items": {
                                    "type": "object",
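
A minimal sketch of how the hint could be resolved to a kind, assuming the three kinds described in this MR and the default of []object when the hint is absent. The function and constant names are hypothetical and are not taken from the Indexer code; the fallback for unknown hint values is also an assumption.

```python
# Hypothetical helper: names are illustrative, not the Indexer's actual API.
DEFAULT_KIND = "[]object"
SUPPORTED_HINTS = {"nested", "flattened"}


def resolve_array_kind(prop: dict) -> str:
    """Return the kind for an array-of-objects schema property."""
    hint = prop.get("x-osdu-indexing", {}).get("type")
    # Assumption: a missing or unknown hint falls back to the default kind.
    return hint if hint in SUPPORTED_HINTS else DEFAULT_KIND


markers = {
    "x-osdu-indexing": {"type": "nested"},
    "type": "array",
    "items": {"type": "object"},
}
plain = {"type": "array", "items": {"type": "object"}}

print(resolve_array_kind(markers))  # nested
print(resolve_array_kind(plain))    # []object
```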

Types behavior:

[]object

(used by default when the "x-osdu-indexing" property is not specified):
Inner schema items won't be processed during schema conversion:

                    {
                        path = ArrayItem,
                        kind = []object
                    }

Index type will be defined as object, without specifying inner properties:

                        "ArrayItem": {
                            "type": "object"
                        },

The array of objects will be pushed to Elasticsearch "as is", without any parsing:

        "ArrayItem": [{
                "InnerProperty": "anyvalue"
            }, {
                "InnerProperty": "anyvalue"
            }]

Querying by "ArrayItem.InnerProperty" won't be possible, but the data will still be returned by other queries.

flattened :

Inner schema items won't be processed during schema conversion:

                    {
                        path = ArrayItem,
                        kind = flattened
                    }

Index type will be defined as flattened, without specifying inner properties:

                        "ArrayItem": {
                            "type": "flattened"
                        },

The array of objects will be pushed to Elasticsearch "as is". The flattened type treats all values as keywords and does not provide full search functionality:

        "ArrayItem": [{
                "InnerProperty": "anyvalue"
            }, {
                "InnerProperty": "anyvalue"
            }]

Querying by "ArrayItem.InnerProperty" will be possible, but with limitations (https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html):

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "data.ArrayItem.InnerProperty": "anyvalue"
                    }
                },
                {
                    "match": {
                        "data.ArrayItem.OtherProperty": "anyvalue"
                    }
                }
            ]
        }
    }
}
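
One of the limitations can be illustrated with a simplified model. This is a sketch, not Elasticsearch's implementation: it assumes that a flattened field, like a plain object array, does not keep track of which array element a value came from, so two "must" clauses can be satisfied by two different objects. The function names are hypothetical.

```python
from collections import defaultdict


def flatten_array(field: str, objects: list) -> dict:
    """Collapse an array of objects into a 'path -> set of values' map."""
    flat = defaultdict(set)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].add(value)
    return dict(flat)


def matches_all(flat: dict, criteria: dict) -> bool:
    """Emulate a bool/must of exact matches against the flattened map."""
    return all(value in flat.get(path, set()) for path, value in criteria.items())


array_item = [
    {"InnerProperty": "a", "OtherProperty": "x"},
    {"InnerProperty": "b", "OtherProperty": "y"},
]
flat = flatten_array("data.ArrayItem", array_item)

# "a" and "y" never occur in the same object, yet the document matches.
cross_object = matches_all(flat, {
    "data.ArrayItem.InnerProperty": "a",
    "data.ArrayItem.OtherProperty": "y",
})
print(cross_object)  # True
```

If per-object matching is required, the nested type described next is the one to use.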

nested :

Inner schema items will be processed:

{
        path = ArrayItem,
        kind = nested,
        properties = [{
                path = InnerProperty,
                kind = string
            },{
                path = OtherProperty,
                kind = double
            }]
}

Index type will be defined as nested, with inner properties:

                        "ArrayItem": {
                            "type": "nested",
                            "properties": {
                                "InnerProperty": {
                                    "type": "text"
                                },
                                "OtherProperty": {
                                    "type": "double"
                                },

The inner objects of the array will be mapped by StorageIndexerPayloadMapper.class according to their types:

        "ArrayItem": [{
                "InnerProperty": null,
                "OtherProperty": 0.0
            },

Querying by "ArrayItem.InnerProperty" will be possible; every nested object will be treated as a separate object:

{
    "query": {
        "nested": {
            "path": "data.ArrayItem",
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "data.ArrayItem.InnerProperty": "any"
                            }
                        },
                        {
                            "match": {
                                "data.ArrayItem.OtherProperty": 0.0
                            }
                        }
                    ]
                }
            }
        }
    }
}
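
The conversion from a schema item to an index mapping for the three kinds above could be sketched as follows. This is an illustration only: the dictionary layout, the function name, and the scalar type table are assumptions based on the examples in this description (string to text, double to double), not the Indexer's actual classes.

```python
# Assumption: scalar kind-to-type pairs are taken from the examples above.
SCALAR_TYPES = {"string": "text", "double": "double"}


def to_mapping(item: dict) -> dict:
    """Convert one schema item into an Elasticsearch-style mapping entry."""
    kind = item["kind"]
    if kind == "[]object":
        # Default: plain object, inner properties are not specified.
        return {"type": "object"}
    if kind == "flattened":
        return {"type": "flattened"}
    if kind == "nested":
        # Only nested arrays have their inner properties processed.
        return {
            "type": "nested",
            "properties": {
                child["path"]: to_mapping(child)
                for child in item.get("properties", [])
            },
        }
    return {"type": SCALAR_TYPES[kind]}


schema_item = {
    "path": "ArrayItem",
    "kind": "nested",
    "properties": [
        {"path": "InnerProperty", "kind": "string"},
        {"path": "OtherProperty", "kind": "double"},
    ],
}
mapping = {schema_item["path"]: to_mapping(schema_item)}
print(mapping["ArrayItem"]["type"])  # nested
```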

How to test:

It can be tested with several requests:

  1. Update the schema with an array-of-objects type:
curl --location --request PUT 'https://os-schema/api/schema-service/v1/schema/' \
--header 'Data-Partition-Id: <data-partition>' \
--header 'Authorization: <token>' \
--header 'Content-Type: application/json' \
--data-raw '{
.....
    "schema": {
.....
                        "ArrayItem": {
                            "x-osdu-indexing": {
                                "type": "nested"
                            },
                            "type": "array",
                            "items": {
                                "type": "object",
  2. Re-index the kind for that schema:
curl --location --request POST 'indexer/api/indexer/v2/reindex?force_clean=true' \
--header 'Data-Partition-Id: <data-partition>' \
--header 'Authorization: <token>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "kind":"<updated kind>"
}'
  3. Querying through the Search service is not implemented yet; the search request can be sent directly to Elasticsearch:
{
    "query": {
        "nested": {
            "path": "data.Markers",
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "data.Markers.MarkerName": "North Sea Supergroup"
                            }
                        },
                        {
                            "match": {
                                "data.Markers.MarkerMeasuredDepth": 0.0
                            }
                        }
                    ]
                }
            }
        }
    }
}
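
The re-index and query steps above can also be prepared from a script. The sketch below only builds the request and the query body and sends nothing. The base URL "https://indexer.example" is a placeholder and the helper names are hypothetical; the path, headers, and query shape are copied from the steps above.

```python
import json
import urllib.request


def build_reindex_request(indexer_base: str, partition: str, token: str, kind: str):
    """Build (but do not send) the re-index request from step 2."""
    return urllib.request.Request(
        url=f"{indexer_base}/api/indexer/v2/reindex?force_clean=true",
        data=json.dumps({"kind": kind}).encode("utf-8"),
        method="POST",
        headers={
            "Data-Partition-Id": partition,
            "Authorization": token,
            "Content-Type": "application/json",
        },
    )


def build_nested_query(path: str, criteria: dict) -> dict:
    """Build the nested query body from step 3 for the given field values."""
    must = [{"match": {f"{path}.{field}": value}} for field, value in criteria.items()]
    return {"query": {"nested": {"path": path, "query": {"bool": {"must": must}}}}}


# Placeholder values; replace them with real ones before sending anything.
request = build_reindex_request(
    "https://indexer.example", "<data-partition>", "<token>", "<updated kind>"
)
query = build_nested_query(
    "data.Markers",
    {"MarkerName": "North Sea Supergroup", "MarkerMeasuredDepth": 0.0},
)
print(request.get_method())  # POST
print(json.dumps(query, indent=4))
```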

Changes include:

  • Breaking change (a change that is not backward-compatible and/or changes current functionality).

Changes in:

  • Common code

Dev Checklist:

  • Added Unit Tests, wherever applicable.
  • Updated the Readme, if applicable.
  • Existing Tests pass
  • Verified functionality locally
  • Self Reviewed my code for formatting and complex business logic.

Other comments:

Consequences: the common API models must be changed in os-core-common:
osdu/platform/system/lib/core/os-core-common!67 (merged)
To support nested queries, the Search service needs to implement the nested query.
The functionality was not tested with v2 schemas.
v2 schema models must be changed; the previous "flat" schemas don't fit these changes.
Example of the schema model changes:

{
        path = LineageAssertions,
        kind = []object
    }, {
        path = Tags,
        kind = []string
    }, {
        path = Name,
        kind = string
    }, {
        path = Markers,
        kind = nested,
        properties = [{
                path = NegativeVerticalDelta,
                kind = double
            }, {
                path = SurfaceDipAngle,
                kind = double
            }, {
                path = FeatureTypeID,
                kind = string
            }, {
                path = MarkerInterpreter,
                kind = string
            }, {
Edited by Rustam Lotsmanenko (EPAM)
