Mismatch on id length between Storage & Indexer service
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
Data discovery is one of the core data principles of the OSDU Data Platform. The platform should ensure that all data and metadata are discoverable, to the point where all metadata is generally accessible to ensure awareness even when the data itself is generally not.
A record ingested by the Storage service must be discoverable through the Search service. Once a record is ingested, the Indexer service indexes it.
The Indexer service indexes each record under the same record-id that the Storage service created, which ensures consistency across services and workflows. The Storage service does not enforce any limit on record-id length, and this causes indexing to fail for long ids.
Records with ids longer than 512 bytes cannot be discovered or queried through the Search service.
Proposed solution
The Storage service record ingestion API accepts a user-provided record-id, or autogenerates one if none is provided. It should restrict the length of user-provided record-ids, because over-long ids break end-to-end search workflows.
The Storage service should reject any record-id longer than 512 bytes by introducing a record-id length validation rule.
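The validation rule could look roughly like the following minimal sketch. It is illustrative only and is not the Storage service's actual implementation: the names `MAX_RECORD_ID_BYTES`, `RecordIdTooLongError`, and `validate_record_id` are hypothetical, and measuring the id as UTF-8 encoded bytes is an assumption, since this proposal states the limit in bytes without naming an encoding.

```python
# Limit taken from this proposal: ids longer than 512 bytes are rejected.
MAX_RECORD_ID_BYTES = 512


class RecordIdTooLongError(ValueError):
    """Hypothetical error raised when a record-id exceeds the byte limit."""


def validate_record_id(record_id: str) -> None:
    """Raise RecordIdTooLongError if record_id is longer than the limit.

    Assumption: length is measured on the UTF-8 encoding of the id, so a
    multi-byte character counts for more than one byte.
    """
    size = len(record_id.encode("utf-8"))
    if size > MAX_RECORD_ID_BYTES:
        raise RecordIdTooLongError(
            f"record-id is {size} bytes; maximum allowed is "
            f"{MAX_RECORD_ID_BYTES} bytes"
        )
```

Because the limit is expressed in bytes rather than characters, an id made of non-ASCII characters can exceed it with fewer than 512 characters, which is why the sketch checks the encoded length instead of `len(record_id)`.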
Consequences
A Create Record request with a user-provided record-id longer than 512 bytes will be rejected. This may look like a breaking change for consumer applications that create records with such record-ids, but in reality it makes those applications aware of an end-to-end discovery/search workflow that is already broken by the indexing issue. Consumers are required to handle this new validation rule.