feat: add entitlements group caching and retries

This MR introduces an entitlements groups cache, with a default implementation and an SPI that each provider can implement, similar to the current object storage interface.

Under heavy search load, we've noticed that requests will occasionally fail because:

  • Every request to the Policy service triggers a request to Entitlements, which puts Entitlements under heavy load and causes delayed responses or failures
  • A non-OK result from entitlements causes the Search request to fail

This MR addresses both root causes. It adds Entitlements retries, using the Python tenacity library, to improve the fault tolerance of Search/Policy requests, and it adds a cache so that requests to Entitlements are made only when needed.

Notes

  • The default cache is a VM (in-process) cache built on the Python cachetools library; it is available to all CSPs
  • An SPI is provided, similar to object storage, for each provider to implement their own caching solution
  • AWS has implemented a shared caching mechanism with Redis, similar to the Java services
  • The cache key is generated with an algorithm similar to the one Search uses, based on the Authorization and partition ID headers, so that more lookups are cache hits (Search is hit before Policy)
  • Most of this code should be moved to the OSDU Python SDK repo so other Python services can reuse it, but that can be done later, when another service is ready for the same capability
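The SPI and cache key notes above could look roughly like the following sketch. It is an assumption-laden illustration rather than the interface in this MR: `GroupsCache`, `InMemoryGroupsCache` and `cache_key` are hypothetical names, the `data-partition-id` header name is assumed, and the SHA-256 hash is a stand-in because the actual Search key algorithm is not described here.

```python
import hashlib
import time
from abc import ABC, abstractmethod


class GroupsCache(ABC):
    """Hypothetical SPI that each provider would implement."""

    @abstractmethod
    def get(self, key):
        """Return the cached value, or None if absent or expired."""

    @abstractmethod
    def set(self, key, value):
        """Store a value under the given key."""


class InMemoryGroupsCache(GroupsCache):
    """Default-style in-process cache with a per-entry time to live."""

    def __init__(self, ttl_seconds=60.0):
        self._ttl = ttl_seconds
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Drop stale entries so the next call goes back to entitlements.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (time.monotonic() + self._ttl, value)


def cache_key(headers):
    """Derive a key from the Authorization and partition headers (assumed scheme)."""
    raw = headers["Authorization"] + "|" + headers["data-partition-id"]
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

A provider-specific implementation, such as one backed by Redis, would subclass `GroupsCache` and leave callers unchanged. Hashing the headers also keeps the raw token out of the stored key.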
Edited by Marc Burnie [AWS]
