#88-Implement-indexing-respecting-x-collaboration-property

Description:

This change adds support for the x-collaboration header to the two endpoints below:

/index-worker
/reindex/records

The work for both endpoints is described in the following story and ADR:
F1 (Java) Story 27 [Indexer]: Implement indexing respecting x-collaboration property
ADR - Project & Workflow Services - Core Services Integration - Search Service Support

Use case

The main purpose of the first story is to allow Records to be stored in Elasticsearch in a new namespace.
For example, you have a Record that you want to place in a custom-named namespace rather than in the default one.
To do that, call the Storage endpoint and identify the custom-named namespace in the x-collaboration HTTP header.

Example

Place a record in a custom-named namespace with this curl command:

curl --location --request PUT 'https://api.projectworkflow.adopt.paas.47lining.com/api/storage/v2/records' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: osdu' \
--header 'x-collaboration: id=a99cef48-2ed6-4beb-8a43-002373431001,application=pws' \
--header 'Authorization: Bearer <bearer>' \
--data-raw '[
    {
        "kind": "osdu:wks:master-data--CollaborationProjectDebug:1.0.0",
        "id": "osdu:master-data--CollaborationProjectDebug:0b0d639158654d54ab95e2da48f6ef88",
        "acl": {...},
        "legal": {...},
        "data": {.....}
    }
]'
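The x-collaboration value in the request above has the form id=<collaboration id>,application=<application name>. As a minimal sketch, the headers for such a request could be assembled in Python as shown here. The helper names build_collaboration_header and storage_headers are hypothetical and are not part of any OSDU client library.

```python
from typing import Dict


def build_collaboration_header(collaboration_id: str, application: str) -> str:
    """Format the x-collaboration value the way the curl example does."""
    return f"id={collaboration_id},application={application}"


def storage_headers(partition: str, token: str,
                    collaboration_id: str, application: str) -> Dict[str, str]:
    """Assemble the same four headers that the curl example sends (hypothetical helper)."""
    return {
        "Content-Type": "application/json",
        "data-partition-id": partition,
        "x-collaboration": build_collaboration_header(collaboration_id, application),
        "Authorization": f"Bearer {token}",
    }
```

The resulting dictionary can be passed to any HTTP client of your choice together with the JSON body from the example.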

Processes under the hood

Once this curl command is executed, it triggers the following steps:
1 - The Storage service pushes a message to a queue in topic v2.
2 - The Indexer-queue service consumes this message.
3 - The Indexer-queue service forwards the message to the Index service in an HTTP request to the /index-worker endpoint, with the x-collaboration header set.
4 - The Index service adjusts the mapping if needed and indexes the Record into Elasticsearch.
5 - In Elasticsearch, the Record is indexed with an _id that is the record id concatenated with the id from the x-collaboration header. The _id field belongs to the Elasticsearch document's metadata. For example:

"_id": "osdu:index-property--Wellbore:testIngest1:a99cef48-2ed6-4beb-8a43-002373431f92"
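The _id shown above follows the pattern <record id>:<collaboration id>. A minimal sketch of that composition in Python is given here. The function name elastic_doc_id is hypothetical, and the fallback to the plain record id when no collaboration id is given is an assumption, not something stated in this description.

```python
from typing import Optional


def elastic_doc_id(record_id: str, collaboration_id: Optional[str] = None) -> str:
    """Compose the Elasticsearch _id for a record (illustrative only)."""
    if collaboration_id:
        # Pattern taken from the example: record id, a colon, then the collaboration id.
        return f"{record_id}:{collaboration_id}"
    # Assumption: records outside any collaboration keep the plain record id.
    return record_id
```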

The same Elasticsearch document also carries these fields:

 {
   "_index":"osdu-wks-master-data--collaborationproject-1.0.0",
   "_type":"_doc",
   "_id":"osdu:master-data--CollaborationProject:275f1abfc7c44f35b5173e7c987638e8:a99cef48-2ed6-4beb-8a43-002373431f51",
   "_score":1.0,
   "_source":{
      "data":{
      .....
      "collaborationId":"a99cef48-2ed6-4beb-8a43-002373431f51",
      "id":"osdu:master-data--CollaborationProject:275f1abfc7c44f35b5173e7c987638e8",
      "kind": "osdu:wks:master-data--CollaborationProject:1.0.0",
      .....
 }

Note the three distinct fields: kind (the OSDU kind), id (the OSDU record id) and _id (the Elasticsearch id of the stored document).

To find such a document directly in Elasticsearch, use this curl command:

curl --location --request GET 'https://localhost:9200/test-indexer-index-property--wellbore-1.0.0/_search' \
--header 'Housekeeping: yes' \
--header 'Content-Type: application/json' \
--header 'Authorization: ***' \
--data '{
    "query": {
        "bool": {
            "must": [
                { "match": { "collaborationId": "a99cef48-2ed6-4beb-8a43-002373431f92" } },
                { "match": { "id": "osdu:index-property--Wellbore:testIngest1" } }
            ]
        }
    }
}'
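The query body above can also be built programmatically. The sketch below assumes the same two field names used in the query (collaborationId and id); the function name collaboration_record_query is hypothetical.

```python
import json
from typing import Any, Dict


def collaboration_record_query(record_id: str, collaboration_id: str) -> Dict[str, Any]:
    """Build a bool query that requires both the collaboration id and the record id to match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"collaborationId": collaboration_id}},
                    {"match": {"id": record_id}},
                ]
            }
        }
    }


# Serialize to the JSON string that would be sent as the request body.
body = json.dumps(
    collaboration_record_query(
        "osdu:index-property--Wellbore:testIngest1",
        "a99cef48-2ed6-4beb-8a43-002373431f92",
    )
)
```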


To find all documents that contain an x-collaboration value, use this request body with curl:

{
  "query": {
    "exists": {
      "field": "collaborationId"
    }
  }
}


To find all documents that do NOT contain an x-collaboration value, use this request body with curl:

{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "collaborationId"
        }
      }
    }
  }
}
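The two request bodies above differ only in whether the exists clause is negated. A small sketch that produces either one is given here; the function name collaboration_presence_query is hypothetical.

```python
from typing import Any, Dict


def collaboration_presence_query(present: bool) -> Dict[str, Any]:
    """Return a query for documents with (present=True) or without (present=False) a collaborationId."""
    exists_clause = {"exists": {"field": "collaborationId"}}
    if present:
        return {"query": exists_clause}
    # Negate the exists clause with bool/must_not, as in the second body above.
    return {"query": {"bool": {"must_not": exists_clause}}}
```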

The second endpoint: /reindex/records

The second story, ADR - Project & Workflow Services - Core Services Integration - Search Service Support, is about triggering the reindex process for Records at the Elasticsearch level while respecting the x-collaboration header. The endpoint expects to receive this header with the HTTP request.

Example

# The first id belongs to an existing record; the second id does not exist.
curl --location 'https://api.projectworkflow.adopt.paas.47lining.com/api/indexer/v2/reindex/records' \
--header 'data-partition-id: osdu' \
--header 'user: 686-@testing.com' \
--header 'x-collaboration: id=a99cef48-2ed6-4beb-8a43-002373431f51,application=pws' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer xxx' \
--data '{
  "recordIds": [
    "osdu:master-data--CollaborationProject:275f1abfc7c44f35b5173e7c987638e8",
    "osdu:master-data--CollaborationProject:da4394eebac44f49bc91866f2169ff82"
  ]
}'

Example response body:

{
    "reIndexedRecords": [
        "osdu:master-data--CollaborationProject:275f1abfc7c44f35b5173e7c987638e8"
    ],
    "notFoundRecords": [
        "osdu:master-data--CollaborationProject:da4394eebac44f49bc91866f2169ff82"
    ]
}
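A caller will typically want to separate the re-indexed ids from the ones that were not found. The sketch below reads the two keys shown in the response above; the function name split_reindex_response is hypothetical, and treating a missing key as an empty list is an assumption.

```python
import json
from typing import List, Tuple


def split_reindex_response(body: str) -> Tuple[List[str], List[str]]:
    """Return (re-indexed ids, not-found ids) from a /reindex/records response body."""
    payload = json.loads(body)
    # Assumption: a key may be absent when its list would be empty.
    reindexed = payload.get("reIndexedRecords", [])
    not_found = payload.get("notFoundRecords", [])
    return reindexed, not_found
```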

Processes under the hood

1 - Once the HTTP request with the x-collaboration header hits the endpoint, the Index service processes the request and, as a result, pushes a message to the queue.
2 - The Indexer-queue service consumes this message.
3 - The Indexer-queue service forwards the message to the Index service in an HTTP request to the /index-worker endpoint, with the x-collaboration header set.
4 - The Index service processes the message in the same way as any other /index-worker request.
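Each hop in the steps above carries the x-collaboration header, whose value in the examples has the form id=<collaboration id>,application=<application name>. The sketch below only illustrates how such a value can be split into its parts. It is not the Index service's actual implementation, and the function name parse_collaboration_header is hypothetical.

```python
from typing import Dict


def parse_collaboration_header(value: str) -> Dict[str, str]:
    """Split 'id=<uuid>,application=<name>' into a dict of its key/value pairs."""
    pairs: Dict[str, str] = {}
    for part in value.split(","):
        key, sep, val = part.partition("=")
        if sep:  # ignore fragments that contain no '='
            pairs[key.strip()] = val.strip()
    return pairs
```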

Edited by Vladimir Korolevskii (EPAM)
