ADR: Exclude system/meta data indices from search results unless their kinds are explicitly specified in the search query

Applications or systems will likely need their system/meta data to be searchable via OSDU search, but that data is not expected to appear in the results of a normal keyword search. For example, suppose an application stores its system data in storage under kind "xyz" (please ignore the kind syntax in this example).

  • When users search with the keyword "wellbore", data from kind "xyz" should not be included in the search result if the query looks like this:
Case 1:
{
  "kind": "*:*:*:*",
  "query": "wellbore"
} 
  • When an application (workflow) searches its system data with the keyword "wellbore", data from kind "xyz" should be included in the search result if kind "xyz" is explicitly specified in the search query, e.g.
Case 2:
{
  "kind": "xyz",
  "query": "wellbore"
} 

To achieve this objective with a general solution, we propose using a reserved name in the "authority" or "source" field of kinds that hold system/meta data.

  • If those kinds are not explicitly specified in the search query, as in Case 1 above, the data from those kinds won't be included in the search result
  • If those kinds are explicitly specified in the search query, as in Case 2 above, the data from those kinds will be included in the search result
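
The two rules above could be sketched roughly as follows. This is only an illustration, not the search service's implementation: the reserved name "system-meta-data" is just one of the proposals, the function names and sample kinds are hypothetical, and we assume kinds follow the authority:source:entity-type:version layout with the reserved name placed in the authority field.

```python
from fnmatch import fnmatchcase

# Hypothetical reserved name; the actual name is still an open question.
RESERVED_AUTHORITY = "system-meta-data"


def authority_of(kind: str) -> str:
    """Return the first field of a kind (assumed to be the authority)."""
    return kind.split(":", 1)[0]


def resolve_kinds(query_kind, indexed_kinds, reserved=RESERVED_AUTHORITY):
    """Pick the indexed kinds a query should search.

    System kinds are returned only when the query names the reserved
    authority literally (Case 2); a wildcard authority never selects
    them (Case 1).
    """
    matched = [k for k in indexed_kinds if fnmatchcase(k, query_kind)]
    if authority_of(query_kind) == reserved:
        return matched
    return [k for k in matched if authority_of(k) != reserved]


# Hypothetical sample kinds, for illustration only.
indexed = [
    "osdu:wks:master-data--Wellbore:1.0.0",
    "system-meta-data:workflow:run-state:1.0.0",
]

case_1 = resolve_kinds("*:*:*:*", indexed)                 # system kind excluded
case_2 = resolve_kinds("system-meta-data:*:*:*", indexed)  # system kind included
```

In this sketch, "explicitly specified" is read as "the authority field of the query kind is the reserved name, not a wildcard". Other readings (for example, requiring the full kind to be spelled out) are equally possible.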

The reserved name should be meaningful yet unusual enough to avoid naming conflicts with existing schemas. What it should be is an open question. Here are a few proposals for the reserved name:

  • "system" -- it may be too common
  • "system-meta"
  • "system-meta-data" -- should not be common if it is used as "authority"

Whether the reserved name should go in "authority" or "source" is another open question. Here is what we think:

| Field | Pro | Con |
|---|---|---|
| authority | Those indices can be filtered precisely | It could cause name conflicts among tenants in a multi-tenant environment when they share the same services |
| source | It should not cause name conflicts among tenants in a multi-tenant environment if each tenant has its own authority for its kinds | It may be impossible to filter those indices precisely. If the entity type field has the same keyword, those indices will be filtered out too |
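
One way to read the precision trade-off is in terms of wildcard patterns over index names. The sketch below assumes, purely for illustration, that an index name is the kind with ":" replaced by "-" and that system indices are excluded by a wildcard pattern. The helper, the patterns, and the sample kinds are all hypothetical.

```python
from fnmatch import fnmatchcase

RESERVED = "system-meta-data"  # hypothetical reserved name


def index_name(kind: str) -> str:
    # Assumption: the index name is the kind with ':' replaced by '-', lowercased.
    return kind.replace(":", "-").lower()


# Reserved name in "authority": it is a prefix, so the pattern is anchored.
authority_pattern = f"{RESERVED}-*"
# Reserved name in "source": it sits mid-name, so the pattern floats.
source_pattern = f"*-{RESERVED}-*"

system_by_authority = index_name("system-meta-data:workflow:run-state:1.0.0")
system_by_source = index_name("tenant1:system-meta-data:run-state:1.0.0")
# An ordinary kind whose entity type happens to use the same keyword.
ordinary = index_name("tenant1:wks:system-meta-data:1.0.0")

authority_hits_system = fnmatchcase(system_by_authority, authority_pattern)
authority_hits_ordinary = fnmatchcase(ordinary, authority_pattern)
source_hits_system = fnmatchcase(system_by_source, source_pattern)
source_hits_ordinary = fnmatchcase(ordinary, source_pattern)  # false positive
```

Under these assumptions the anchored authority pattern leaves the ordinary index alone, while the floating source pattern also catches it, which is the situation described in the "source" row.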

Any input is welcome before we finalize the solution.

Once we have a conclusion, Thomas will include this reserved keyword in the schema guide.

Edited by Zhibin Mai