Draft: Address `esriOid` inaccuracies
Type of change
- Bug Fix
- Feature
Please provide link to gitlab issue or ADR(Architecture Decision Record)
#609
Does this introduce a change in the core logic?
- [YES/NO]
Does this introduce a change in the cloud provider implementation, if so which cloud?
- AWS
- Azure
- GCP
- IBM
Does this introduce a breaking change?
- [YES/NO]
What is the current behavior?
- Currently, the `esriOid` field begins tracking records at 2 and increments from there.
- During batch loads, the `esriOid` incrementation logic assumes that 100% of records in a batch were loaded, so if `batchSize` is 100 and only 50 records succeeded, the second batch will start `esriOid` at 101 instead of 51.
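As a minimal sketch of the buggy assumption (the function and variable names here are hypothetical, not the actual implementation):

```javascript
// Hypothetical sketch of the previous behavior: the starting esriOid
// for a batch was derived purely from batchIndex * batchSize, assuming
// every record in all prior batches was loaded successfully.
function oldStartingOid(batchIndex, batchSize) {
  // Batch 0 starts at 1, batch 1 at batchSize + 1, and so on,
  // regardless of how many records actually succeeded.
  return batchIndex * batchSize + 1;
}

// With batchSize = 100 and only 50 successes in batch 0,
// batch 1 still starts at 101 instead of 51.
```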
What is the new/expected behavior?
- During a batch, `esriOid` is only incremented after it has first been stored on the record.
- This resolves the issue of `esriOid` appearing to start at 2 instead of 1.
- At the start of every batch, after a feature set has been prepared, the starting `esriOid` for the batch is dynamically calculated from the cache's current size.
- This resolves the issue where the previous logic assumed the last batch was 100% successful, causing the starting `esriOid` for each batch to be a multiple of `batchSize` regardless of how many records failed ingestion.
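The new starting point can be sketched as follows (a simplified illustration under the assumption that every cached record holds exactly one `esriOid`; the function name is hypothetical):

```javascript
// Hypothetical sketch of the new behavior: the starting esriOid for a
// batch is derived from the cache's current size, so records that
// failed ingestion never create gaps across batches.
function newStartingOid(currentCacheSize) {
  return currentCacheSize + 1;
}

// If only 50 of 100 records in batch 0 succeeded, the cache holds
// 50 records and batch 1 correctly starts at 51.
```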
Have you added/updated Unit Tests and Integration Tests?
Any other useful information
Intentionally did not use `getMaxObjectId`, since it issues a `MAX(esriOid)` call on a field that is not indexed, making it slower than getting the total cache size and incrementing by 1. Since every record will have an `esriOid`, getting the cache size and adding 1 should be equivalent to calling `MAX(esriOid)`. We should consider reworking or removing the `getMaxObjectId()` function during incremental refresh, especially if the `esriOid` will be automatically resolved in the `putFeatures()` function. Or perhaps we should consider indexing `esriOid`.
Edited by Levi Remington