
Refactor queryRecordsInBatch to broadly support varying batch sizes

Type of change

  • Bug Fix
  • Feature

Please provide link to gitlab issue or ADR(Architecture Decision Record)
#577

Does this introduce a change in the core logic?

  • YES

Does this introduce a breaking change?

  • NO

What is the current behavior?

  • Dependency on batchSize being less than 1000 or an exact multiple of 1000.
  • Assumption that the cursor will not expire mid-batch.

What is the new/expected behavior?

  • batchSize may be any value greater than 0, and the batches will adjust accordingly.
  • In the event of cursor expiration, the query will retry, this time specifying an offset and hitting the regular non-cursor query URL.

Any other useful information

Batch Size - What is it?

  • batchSize dictates the number of records read from OSDU before they are turned over for processing by the Transformer
    • Lifecycle of a batch:
      1. Ingestion from OSDU
        • Ingestion happens through OSDU Search, which caps each query at a limit of 1000. So if batchSize is greater than 1000, we must sub-batch the ingestion queries until the total number of ingested records meets the batchSize (see the arithmetic sketch after this list)
        • The sub-batching is handled by Search#queryRecordsInBatch, while the larger batch lifecycle is captured by FeatureCacheSynchronizerHelper#synchronizeInBatch
      2. Process all records (conversion to GeoJSON, etc.)
      3. Load records into Ignite Cache
  • batchSize must be set at the Transformer level, but can optionally also be set at the per-kind level
    • If batchSize is set on a kind, it overrides the batchSize set at the Transformer level
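
For illustration, here is a minimal sketch of the sub-batching arithmetic. This is not the actual Transformer code; SubBatchSketch, subBatchLimits, and OSDU_SEARCH_LIMIT are hypothetical names used only to show how an arbitrary batchSize splits into query limits:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the sub-batching arithmetic, not the real code.
final class SubBatchSketch {
    static final int OSDU_SEARCH_LIMIT = 1000; // hard cap per OSDU Search query

    // Split an arbitrary batchSize (> 0) into query limits of at most 1000.
    static List<Integer> subBatchLimits(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be greater than 0");
        }
        List<Integer> limits = new ArrayList<>();
        for (int remaining = batchSize; remaining > 0; ) {
            int limit = Math.min(remaining, OSDU_SEARCH_LIMIT);
            limits.add(limit);
            remaining -= limit;
        }
        return limits;
    }

    public static void main(String[] args) {
        System.out.println(subBatchLimits(1005)); // [1000, 5]
        System.out.println(subBatchLimits(42));   // [42]
        System.out.println(subBatchLimits(3000)); // [1000, 1000, 1000]
    }
}
```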

Example:

  1. Configuration: batchSize is 1005
  2. Batch Lifecycle: FeatureCacheSynchronizerHelper#synchronizeInBatch will call getData() on a kind with the specified batchSize of 1005
    • Per-batch sub-batching: Search#queryRecordsInBatch will attempt to ingest 1005 records from OSDU, but must do so with a maximum limit of 1000 per query.
      1. First, make a query to retrieve 1000 records
      2. Use the resulting cursor to retrieve the next 5 records
      3. If the cursor has expired, query with an offset of 1000 and a limit of 5 to retrieve the remaining 5 records
  3. The batch has been collected, and is now processed in bulk
  4. The next batch lifecycle of 1005 records is started
  5. These batches continue until there are no more records to ingest (a high-level outline of this loop follows)
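
Here is a hedged outline of that loop. Aside from the queryRecordsInBatch and synchronizeInBatch names already mentioned in this MR, everything (BatchCollaborators, convertToGeoJson, loadIntoIgniteCache, the String record type) is a hypothetical stand-in:

```java
import java.util.List;

// Hypothetical collaborator interface; the real Transformer API differs.
interface BatchCollaborators {
    List<String> queryRecordsInBatch(String kind, long offset, int batchSize); // 1. ingest
    List<String> convertToGeoJson(List<String> records);                       // 2. process
    void loadIntoIgniteCache(List<String> features);                           // 3. load
}

final class LifecycleSketch {
    // Mirrors FeatureCacheSynchronizerHelper#synchronizeInBatch at a high level:
    // ingest up to batchSize records, process them in bulk, cache them, repeat.
    static void synchronizeInBatch(BatchCollaborators c, String kind, int batchSize) {
        long offset = 0;
        while (true) {
            List<String> batch = c.queryRecordsInBatch(kind, offset, batchSize);
            if (batch.isEmpty()) {
                break; // no more records to ingest
            }
            c.loadIntoIgniteCache(c.convertToGeoJson(batch));
            offset += batch.size();
        }
    }
}
```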

Cursor Expiration

  • During my testing, GLAB OSDU Search was very slow, taking over 40 seconds to query 1000 Wellbore records.
  • With a batchSize over 1000, I found the cursor would frequently expire, and our code had no fallback. Hence the update to fall back to a non-cursor query with an offset when the cursor has expired.
  • This is highly unusual, as we have not encountered cursor expiration within the normal 1000 limit before. Something may be off with the environment. However, it is fair to assume this will happen again, and in other environments, so it is best for the code to have a fallback (sketched below).
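
A minimal sketch of that fallback path, assuming hypothetical SearchClient, CursorExpiredException, queryWithCursor, and queryWithOffset names; the real client wraps OSDU Search HTTP calls, and only the control flow shown here is the point:

```java
import java.util.List;

// Hypothetical stand-ins for the real OSDU Search wrapper.
interface SearchClient {
    List<String> queryWithCursor(String cursor, int limit) throws CursorExpiredException;
    List<String> queryWithOffset(int offset, int limit); // regular non-cursor query URL
}

class CursorExpiredException extends Exception {}

final class CursorFallbackSketch {
    static List<String> fetchSubBatch(SearchClient client, String cursor,
                                      int recordsAlreadyIngested, int limit) {
        try {
            // Preferred path: continue paging with the cursor.
            return client.queryWithCursor(cursor, limit);
        } catch (CursorExpiredException e) {
            // Fallback: the cursor expired mid-batch, so retry against the
            // regular non-cursor query URL, skipping records already ingested.
            return client.queryWithOffset(recordsAlreadyIngested, limit);
        }
    }
}
```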
