ADR: CosmosDb saturation/throttling when records reach too many versions
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
ISSUE: Storage service stability issues due to too many versions of records.
User behavior that causes this issue: creating many versions for the same record ID. When multiple applications/teams do this long enough, many records end up with too many versions. There are no checks in place to prevent this scenario. We eventually hit infrastructure limits (e.g. the 2MB CosmosDb document size limit), but we observe service instability well before that.
Why is this a problem: Record versions are stored as part of the record metadata, in the gcsVersionPaths array. Each version is a string representing the full path to that version's blob location. Record metadata is stored in CosmosDb. While CosmosDb has a hard size limit of 2MB per document, documents become expensive well before that limit once RU usage is considered. If hundreds or thousands of such records are being updated, the total RU consumed is very high, incurring huge costs. This negatively impacts service latency and availability. While not ideal, it is entirely possible for applications to create many versions of the same record as part of their workflows.
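For illustration, a record metadata document has roughly the following shape (a sketch only; apart from gcsVersionPaths, the field names and path layout below are hypothetical placeholders, not the actual schema). Every new version appends another full blob path to the array, so the document grows without bound:

```python
# Hypothetical shape of a record metadata document (illustrative only).
# Apart from "gcsVersionPaths", field names and path layout are placeholders.
record_metadata = {
    "id": "example-partition:example-type:example-record",
    "version": 1700000000000003,
    # One full blob path per version; this array grows with every new version
    # and is what eventually pushes the document toward the CosmosDb size limit.
    "gcsVersionPaths": [
        "example-container/example-partition/example-record/1700000000000001",
        "example-container/example-partition/example-record/1700000000000002",
        "example-container/example-partition/example-record/1700000000000003",
    ],
}

# Each additional version adds roughly len(path) bytes to the document, so the
# document size (and the RU cost of every subsequent UPSERT) grows linearly
# with the number of versions kept.
approx_array_bytes = sum(len(p) for p in record_metadata["gcsVersionPaths"])
```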
For reference, here are some preliminary observations of the number of versions, the document size, and the RU consumed to perform an UPSERT on a single document. Note that the number of versions is not an absolute indicator of how many RU an UPSERT will consume: what matters is the size of the document, and each version string can have a different length, so a document can hold many more versions when each path is short. As we stand today, however, gcsVersionPaths is the only metadata property causing documents to grow this large.
- ~1,500 versions, ~300 RU consumed, ~243 KB document size
- ~1,500 versions, ~370 RU consumed, ~300 KB document size
- ~3,800 versions, ~1,250 RU consumed, ~750 KB document size
- ~5,300 versions, ~1,253 RU consumed, ~880 KB document size
- ~9,850 versions, ~2,502 RU consumed, ~1.3 MB document size
It is quite easy for a few hundred or thousand records to cripple the system once they reach a certain number of versions.
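Figures like the ones above can be gathered with a small script along the following lines (a sketch only, assuming the azure-cosmos Python SDK; the endpoint, key, database, container, partition key, and document fields are placeholders, and the request charge is read from the last response headers after each call):

```python
# Sketch: measure the RU charge of an UPSERT as the gcsVersionPaths array grows.
# Assumes the azure-cosmos Python SDK; endpoint, key, database, container and
# document fields below are placeholders, not the service's real configuration.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

doc = {"id": "ru-test-record", "partitionKey": "ru-test-record", "gcsVersionPaths": []}

for n_versions in (1500, 3800, 5300, 9850):
    # Pad the version array up to the target count with plausible-length paths.
    while len(doc["gcsVersionPaths"]) < n_versions:
        i = len(doc["gcsVersionPaths"])
        doc["gcsVersionPaths"].append(
            f"example-container/example-partition/ru-test-record/{1700000000000000 + i}"
        )

    container.upsert_item(doc)
    # The request charge of the last operation is reported by Cosmos DB in the
    # x-ms-request-charge response header.
    charge = container.client_connection.last_response_headers["x-ms-request-charge"]
    print(f"{n_versions} versions -> {charge} RU per UPSERT")
```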
CLARIFICATION: The issue we observed applies most directly to the Azure use case. Infrastructure limitations (e.g. the cost of accessing a large document, the hard limit on document size) vary per CSP (e.g. 2MB for CosmosDb, 1MB for GCP Datastore). Other CSPs may see this issue once the number of versions reaches a certain threshold.
Tradeoff Analysis
It is clear we want to limit the number of record versions. We see two ways to achieve this (both are sketched in code after the list below).
- Set a hard limit on the number of versions on each record, say 1000 (preferred approach).
  - Pros: Easy to implement, no behind-the-scenes magic.
  - Cons: Breaking change for existing workflows whose records already have more than 1000 versions. Needs advance notice of the breaking change and time for teams to update their workflows.
  We can roll this out by first introducing a deleteVersion API in Storage, which would give users time to delete older versions themselves before the breaking change is introduced, so their workflows do not break immediately.
- Only keep the 1000 most recent versions. For new records, this means actively deleting the oldest version once a record reaches 1000 versions. For existing records with more than 1000 versions, this means cleaning up all the older versions.
  - Pros: Older versions are cleaned up for users automatically.
  - Cons: Still a breaking change, as older versions get deleted automatically. Involves behind-the-scenes cleanup of older versions; for records that currently have more than 1000 versions, this includes all versions beyond the most recent 1000. There can be failure scenarios and performance implications with the cleanup.
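A minimal sketch of what the two options could look like at the application layer (names such as MAX_VERSIONS, VersionLimitExceededError, and the record dict shape are hypothetical; actual enforcement would live in the Storage service's record update path):

```python
# Sketch of the two candidate behaviours, applied to the in-memory record
# metadata before it is persisted. Names and the record shape are hypothetical.

MAX_VERSIONS = 1000  # assumed limit discussed in this ADR


class VersionLimitExceededError(Exception):
    """Raised when a record already holds the maximum allowed number of versions."""


def enforce_hard_limit(record_metadata: dict, new_version_path: str) -> None:
    """Option 1 (preferred): reject the write once the limit is reached.

    Callers would use a deleteVersion-style API to remove old versions
    before they can add new ones."""
    if len(record_metadata["gcsVersionPaths"]) >= MAX_VERSIONS:
        raise VersionLimitExceededError(
            f"record {record_metadata['id']} already has {MAX_VERSIONS} versions"
        )
    record_metadata["gcsVersionPaths"].append(new_version_path)


def keep_most_recent(record_metadata: dict, new_version_path: str) -> list:
    """Option 2: append the new version, then drop the oldest entries so that
    at most MAX_VERSIONS remain. Returns the trimmed paths so the caller can
    also delete the corresponding blobs."""
    paths = record_metadata["gcsVersionPaths"]
    paths.append(new_version_path)
    trimmed, record_metadata["gcsVersionPaths"] = paths[:-MAX_VERSIONS], paths[-MAX_VERSIONS:]
    return trimmed
```

In the first option the burden of cleanup stays with the caller (via the deleteVersion API); in the second, the service must also delete the trimmed blobs, which is where the cleanup failure scenarios and performance implications mentioned above come from.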
Consequences
Storage will introduce a limit on the number of versions a record can have. Depending on the solution we choose, the API will either fail after n versions (hard limit) or older versions will be deleted automatically.