We found that, in the Azure re-index implementation, the first batch of the pub/sub message can contain up to 1000 record ids. The reason is that the Azure configuration sets STORAGE_RECORDS_BY_KIND_BATCH_SIZE=1000, and the first batch of re-index record ids returned by the storage service is not sliced into smaller batches before it is sent to Azure Service Bus.
The indexer uses the storage batch API to retrieve storage records with standardized units and CRS. According to the Storage Service documentation, this API fetches multiple records (maximum 20) from the storage service at once. A timeout or an unexpected error could occur if the indexer tries to fetch a large number of records (e.g. 1000) at once.
Another issue is that when the Azure indexer-queue receives a re-index message, it makes a synchronous call to the indexer service. If the call does not complete within 5 minutes, the service bus times out and the indexer-queue receives the same message again, up to 5 times. When that happens, it puts unnecessary load on both the indexer-queue and the indexer service.
In this MR, we make sure that every (re)index message sent to Azure Service Bus contains no more than PubSubBatchSize records, where PubSubBatchSize is the value set in the PublisherConfig.
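
For illustration only, the slicing described above could look roughly like the sketch below. It is not the code in this MR: the class name `RecordIdBatcher`, the method `slice`, and the batch size of 50 used in the note are hypothetical, and the real value comes from PubSubBatchSize in the PublisherConfig.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the indexer code base.
final class RecordIdBatcher {

    private RecordIdBatcher() {
    }

    // Splits the given items into consecutive batches of at most batchSize
    // elements. The order of the items is preserved, and the last batch may
    // be smaller than batchSize.
    static <T> List<List<T>> slice(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With an assumed batch size of 50, a first batch of 1000 record ids would be published as 20 messages of 50 ids each instead of one message of 1000 ids.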