Mismatch on id length between Storage & Indexer service

Status
Context & Scope
Proposed solution
Consequences

Status

Context & Scope

Data discovery is one of the core data principles of the OSDU Data Platform. The platform should ensure that all data and metadata are discoverable, to the point where metadata is generally accessible to ensure awareness even when the data itself is not.

A record ingested by the Storage service must be discoverable through the Search service. Once a record is ingested, it is indexed by the Indexer service.

The Indexer service indexes each record under the same record-id created by the Storage service, which ensures consistency across services and workflows. The Storage service does not enforce any limit on record-id length, and this causes problems during indexing.

Records with ids longer than 512 bytes cannot be discovered or queried through the Search service.


Proposed solution

The Storage service record ingestion API accepts a user-provided record-id, or autogenerates one if none is provided. It should restrict the length of user-provided record-ids, because overly long ids break end-to-end search workflows.

The Storage service should reject any record-id longer than 512 bytes by introducing a record-id length validation rule.
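
As an illustration only, the length rule could be sketched as below. This is not the Storage service implementation: the function name, the error type, and the choice of UTF-8 for measuring bytes are assumptions made for the example. The sketch measures the encoded size rather than the character count, since an id with non-ASCII characters can exceed 512 bytes while containing fewer than 512 characters.

```python
MAX_RECORD_ID_BYTES = 512  # limit stated in this ADR


class RecordIdTooLongError(ValueError):
    """Hypothetical error type; how the service reports the rejection is not specified here."""


def validate_record_id(record_id: str) -> None:
    """Raise RecordIdTooLongError if the id exceeds the byte limit (hypothetical helper)."""
    # Assumption: the limit applies to the UTF-8 encoded form of the id.
    size = len(record_id.encode("utf-8"))
    if size > MAX_RECORD_ID_BYTES:
        raise RecordIdTooLongError(
            f"record-id is {size} bytes; the maximum is {MAX_RECORD_ID_BYTES} bytes"
        )
```

Under these assumptions, an id of exactly 512 ASCII characters is accepted, while an id of 513 ASCII characters, or of 300 two-byte characters, is rejected.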


Consequences

A Create Record request with a user-provided record-id longer than 512 bytes will be rejected. This may look like a breaking change for consumer applications that create records with such record-ids, but in practice it only makes them aware of an end-to-end discovery/search workflow that is already broken by the indexing issue. Consumers are required to handle this new validation rule.
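
One way a consumer might handle the rule is to check ids before sending a request. The sketch below is a hypothetical client-side helper, not part of any OSDU client library. It assumes each record is a dictionary with an optional "id" key and that the limit is measured on the UTF-8 encoded id.

```python
MAX_RECORD_ID_BYTES = 512  # limit stated in this ADR


def split_by_id_length(records):
    """Return (sendable, too_long) lists of record dicts (hypothetical helper)."""
    sendable, too_long = [], []
    for record in records:
        record_id = record.get("id")
        # Records without an id pass through: the Storage service autogenerates one.
        if record_id is None or len(record_id.encode("utf-8")) <= MAX_RECORD_ID_BYTES:
            sendable.append(record)
        else:
            too_long.append(record)
    return sendable, too_long
```

The consumer can then submit the first list and log or re-key the second, instead of discovering the problem from a rejected request.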


Edited Sep 29, 2024 by Neelesh Thakur