Smitha Manjunath authored
Indexer service
Table of contents
- Indexer service
- Introduction
- Indexer API access
- Version info endpoint
- Reindex
- Data Partition provision
- Schema Service adoption
- Troubleshoot Indexing Issues
Introduction
The Indexer API provides a mechanism for indexing documents that contain structured or unstructured data. Documents and indices are saved in a separate persistent store optimized for search operations. The indexer API can index any number of documents.
The indexer indexes the attributes defined in the schema. A schema can be created at the time of record ingestion in the OSDU Data Platform via the Schema Service. The Indexer service also adds a number of OSDU Data Platform meta attributes, such as id, kind, parent, acl, namespace, type, version, legaltags, and index, to each record at the time of indexing.
Indexer API access
- Required roles
The Indexer service requires that users (and service accounts) have dedicated roles in order to use it. Users must be a member of users.datalake.viewers, users.datalake.editors, users.datalake.admins, or users.datalake.ops. Roles can be assigned using the Entitlements Service. Please look at the API documentation for specific requirements.
In addition to service roles, users must be a member of data groups to access the data.
- Required headers
The OSDU Data Platform stores data in different partitions, depending on the different accounts in the OSDU system.
A user may belong to more than one account. As a user, after logging into the OSDU portal, you need to select the account you wish to be active. Likewise, when using the Search APIs, you need to specify the active account in the header called data-partition-id. The correct data-partition-id can be obtained from the CFS services. The data-partition-id enables the search within the mapped partition, e.g. data-partition-id: opendes
- Optional headers
The correlation-id is a traceable ID used to track the journey of a single request. It can be a GUID passed as a header under the key correlation-id. It is best practice to provide the correlation-id so the request can be tracked through all the services, e.g. correlation-id: 1e0fef08-22fd-49b1-a5cc-dffa21bc0b70
If the service is initiating the request, an ID should be generated. If the correlation-id is not provided, a new ID will be generated by the service so that the request remains traceable.
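The header rules above can be sketched in a few lines of Python. This is only an illustration, not part of the Indexer service: the function name build_indexer_headers is hypothetical, and authentication headers are deliberately left out because this section does not describe them. The sketch requires a data-partition-id and generates a GUID for the correlation-id when the caller does not supply one.

```python
import uuid


def build_indexer_headers(data_partition_id, correlation_id=None):
    """Build the headers described above (hypothetical helper).

    data_partition_id: required, e.g. "opendes".
    correlation_id: optional; a new GUID is generated when omitted,
    so the request can still be traced across services.
    """
    if not data_partition_id:
        raise ValueError("data-partition-id is required")
    return {
        "data-partition-id": data_partition_id,
        # Reuse the caller's ID if given, otherwise create a fresh GUID.
        "correlation-id": correlation_id or str(uuid.uuid4()),
    }


if __name__ == "__main__":
    # Caller-initiated request: no correlation-id yet, so one is generated.
    print(build_indexer_headers("opendes"))
    # Downstream call: pass the existing correlation-id through unchanged.
    print(build_indexer_headers("opendes", "1e0fef08-22fd-49b1-a5cc-dffa21bc0b70"))
```

Generating the ID on the client side, rather than relying on the service to create one, means the same value is available in the caller's own logs from the first request onward.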