ADR - Project & Workflow Services - Core Services Integration - Search Service Support

This ADR focuses on Search service support for enabling Project & Workflow Services.

Status

Context & Scope

This ADR follows on from the Storage service ADR, which introduced the idea that the same record instance can live in multiple namespaces, or collaborations, at the same time. Search service needs to adopt the same feature.

Like the previous ADR, this needs to be implemented in a non-breaking way, with its release controlled by a feature flag.

Index Solution

Record Changed V2 is already published by Storage service when the collaboration feature is enabled.

Using the same feature flag, the Indexer service should consume the Record Changed V2 event instead of Record Changed V1 when the flag is enabled. The ID of the generated document in the Search backend (Elasticsearch) should then be a combination of the Record ID and the Collaboration ID on the message.

We will also add a new collaboration property to indexed documents, holding the ID of the collaboration the document belongs to (if any). This property can be used in search queries; however, it should not be returned to the user by default, as it is not part of the Storage Record in the SoR.

Example Record Changed V2 message

"message": {
  "data": [
    {
      "id": "opendes:inttest:1674654754283",
      "kind": "opendes:wks:inttest:1.0.1674654754283",
      "op": "create",
      "version": 1673284431169293,
      "modifiedBy": "projnwrkflwssvs@osdu.org"
    }
  ],
  "account-id": "opendes",
  "data-partition-id": "opendes",
  "correlation-id": "2715a1b8-2ffb-406f-839c-6e6bfed27e5c",
  "x-collaboration": "id=abcd-12345-efghij-67890-klmn"
}

Elasticsearch document

ID: <id + collaboration-id>
Collaboration: <collaboration-id>

This allows multiple instances of the same Record ID to exist in Search, one per collaboration.
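The ID composition described above can be sketched as follows. This is only an illustration under stated assumptions, not the Indexer implementation: the function names, the separator between Record ID and Collaboration ID, and the handling of additional comma-separated directives in the x-collaboration value are all hypothetical, since the ADR only states that the two IDs are combined.

```python
from typing import Optional


def parse_collaboration_id(header_value: Optional[str]) -> Optional[str]:
    """Extract the `id` directive from an x-collaboration value such as 'id=abcd-...'."""
    if not header_value:
        return None
    # Assumption: the value may carry several comma-separated key=value directives.
    for directive in header_value.split(","):
        key, _, value = directive.strip().partition("=")
        if key.strip().lower() == "id" and value.strip():
            return value.strip()
    return None


def build_index_document(message: dict, record: dict, separator: str = ":") -> dict:
    """Return the Elasticsearch document ID and collaboration property for one record."""
    collaboration_id = parse_collaboration_id(message.get("x-collaboration"))
    if collaboration_id is None:
        # No collaboration context: keep the current behavior (ID is the Record ID).
        return {"_id": record["id"], "collaboration": None}
    return {
        # Hypothetical separator; the ADR does not fix the exact combination format.
        "_id": f"{record['id']}{separator}{collaboration_id}",
        "collaboration": collaboration_id,
    }
```

With the example message above, the same record indexed with and without a collaboration context yields two distinct document IDs, so both instances can coexist in the index.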

Search solution

Search service should support the new x-collaboration header defined in the original Storage Service ADR when the feature flag is enabled.

If a collaboration ID is provided on a search request, Search service will automatically add it as a filter to the query, so that only documents in that collaboration can be returned to the user.

Example searching for all records not assigned to any collaboration (same as current behavior)

curl -X 'POST' \
  '<osdu>/api/search/v2/query' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --data '{"kind": "*:*:*:*", "query": "", "limit": 10}'

Example searching for all records assigned to collaboration abcd-12345-efghij-67890-klmn

curl -X 'POST' \
  '<osdu>/api/search/v2/query' \
  --header 'data-partition-id: opendes' \
  --header 'authorization: Bearer <JWT>' \
  --header 'Content-Type: application/json' \
  --header 'x-collaboration: id=abcd-12345-efghij-67890-klmn' \
  --data '{"kind": "*:*:*:*", "query": "", "limit": 10}'
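The scoping behavior in this section can be sketched as a wrapper around the user's query. This is a minimal sketch under assumptions, not the Search service implementation: the function name `scope_query` and the indexed field name `collaboration` are hypothetical, and excluding documents that carry a collaboration property when no header is sent is one possible way to preserve the current behavior. The `bool`, `term`, `exists` and `_source` excludes constructs are standard Elasticsearch query DSL.

```python
from typing import Optional


def scope_query(user_query: dict, collaboration_id: Optional[str],
                field: str = "collaboration") -> dict:
    """Wrap a user query so results are limited to one collaboration context."""
    if collaboration_id:
        # Header present: only documents indexed for this collaboration match.
        scope = {"filter": [{"term": {field: collaboration_id}}]}
    else:
        # Assumption: no header means only documents outside any collaboration.
        scope = {"must_not": [{"exists": {"field": field}}]}
    return {
        "query": {"bool": {"must": [user_query], **scope}},
        # The property is not part of the Storage Record, so hide it by default.
        "_source": {"excludes": [field]},
    }
```

Calling `scope_query({"match_all": {}}, "abcd-12345-efghij-67890-klmn")` produces a request body whose filter clause restricts hits to that collaboration, while passing `None` keeps collaboration documents out of ordinary searches.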

Decision

Rationale

Consequences

This is a non-breaking change.
All features are enabled via a feature flag.
Search service will optionally use the already-defined x-collaboration header to scope requests to specific collaborations.
Indexer can store the same Record ID multiple times, once per collaboration.
Indexer service's Re-index API needs to conform to the Record Changed V2 format when the feature is enabled.
Indexer service's Index clean-up API should remove records when collaboration context is provided.
Indexer-queue service's record change event processor should conform to Record Changed V2 format.

Open Questions:

Indexer service should forward the x-collaboration header on all Storage service requests, as it needs to index records in a specific collaboration. However, it also has a dependency on Schema service. Do we expect Schema service to honor x-collaboration? Do we expect different schemas in different collaborations?

When to revisit

Tradeoff Analysis - Input to decision

Alternatives and implications

Decision criteria and tradeoffs

Decision timeline
