Partition issues
https://community.opengroup.org/osdu/platform/system/partition/-/issues
2023-11-21T19:08:57Z

https://community.opengroup.org/osdu/platform/system/partition/-/issues/35
Add /liveness_check
2023-11-21T19:08:57Z
Riabokon Stanislav(EPAM)[GCP]
Add the endpoint '/liveness_check' so that the operational status of the Partition Service can be verified.
M22 - Release 0.25
Riabokon Stanislav(EPAM)[GCP]

https://community.opengroup.org/osdu/platform/system/partition/-/issues/31
upgrade azure-storage SDK
2023-01-18T21:31:34Z
Nur Sheikh
The Partition service uses azure-storage SDK 8.6.5 from the com.microsoft.azure package, which is old and has little support. It is advisable to move to the latest SDK from the com.azure package.

https://community.opengroup.org/osdu/platform/system/partition/-/issues/18
partition-core shouldn't contain SPI implementations
2021-11-08T10:08:02Z
Dmitrii Gerashchenko
CachedPartitionServiceImpl is implemented in the core module: https://community.opengroup.org/osdu/platform/system/partition/-/blob/master/partition-core/src/main/java/org/opengroup/osdu/partition/service/CachedPartitionServiceImpl.java
partition-core shouldn't contain SPI implementations; the SPI should be implemented by each CSP.
It also causes problems for a CSP that needs special logic or doesn't need a cache at all: https://community.opengroup.org/osdu/platform/system/partition/-/blob/master/provider/partition-aws/src/main/java/org/opengroup/osdu/partition/provider/aws/service/PartitionServiceDummyListCacheImpl.java#L24
CachedPartitionServiceImpl doesn't contain any complicated logic, so it could be merged into PartitionServiceImpl within the provider modules, or removed for providers that don't need it.
Dmitrii Gerashchenko

https://community.opengroup.org/osdu/platform/system/partition/-/issues/16
Partition service's (azure-provider) latency is more than 300 seconds
2021-10-01T11:44:27Z
Dmitrii Gerashchenko
There are latencies (more than 300 seconds) on the Partition API (azure-provider).
An inspection showed a 2-minute timeout for Azure TableStorage, which may be the cause of the latencies.
A 10-minute latency was reproduced locally under the following conditions:
1. Endpoints GET /api/partition/v1/partitions or /api/partition/v1/partitions/{partitionId}
2. No data in the cache.
3. Azure Table storage is unavailable or responding too slowly.
4. Many requests to the API (more than 500).
Presumably, if the cache becomes outdated under high load, many simultaneous requests are sent to TableStorage.
Every request that arrives before the TableStorage response has been cached creates a new request to TableStorage and waits up to 2 minutes for a response. As a result, the API latency grows.
The solution is to use a cluster lock during the request to TableStorage. It copies the approach used in the Entitlements repository:
https://community.opengroup.org/osdu/platform/security-and-compliance/entitlements/-/blob/master/provider/entitlements-v2-azure/src/main/java/org/opengroup/osdu/entitlements/v2/azure/service/GroupCacheServiceAzure.java#L81
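The locking idea described above can be sketched in plain Java. This is only a minimal illustration, not the code from the linked Entitlements class: `SingleFlightCache` and its `loader` are hypothetical names, and a local `ReentrantLock` stands in for the cluster-wide lock, so it serializes callers within a single JVM only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical helper for illustration; not a class from the Partition repository.
final class SingleFlightCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Assumption: a local lock replaces the cluster lock mentioned in the issue.
    private final ReentrantLock lock = new ReentrantLock();
    // Stands in for the slow backend read (for example, a TableStorage query).
    private final Function<K, V> loader;

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // fast path: cache hit, no lock and no backend call
        }
        lock.lock();
        try {
            // Re-check under the lock: another caller may have filled the cache
            // while this one was waiting.
            cached = cache.get(key);
            if (cached == null) {
                cached = loader.apply(key); // only one caller at a time reaches the backend
                if (cached != null) {
                    cache.put(key, cached);
                }
            }
            return cached;
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape, concurrent callers that miss the cache queue on the lock and then read the value cached by the first one instead of each issuing a backend request. The sketch uses one lock for all keys, which is simple but also serializes loads for unrelated keys.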
@Qualifier("cachedPartitionServiceImpl") was removed to make the bean "CachedPartitionServiceImpl" overridable.
CachedPartitionServiceImpl (defined in partition-core) was overridden by ProviderCachedPartitionServiceImpl (defined in partition-azure).
The CachedPartitionService interface was introduced to resolve the ambiguity between the beans CachedPartitionService and PartitionServiceImpl, both of which inherit IPartitionService. CachedPartitionService now resolves the ambiguity instead of @Qualifier("cachedPartitionServiceImpl").
The new code was tested under the same conditions and the latency didn't grow.
Dmitrii Gerashchenko

https://community.opengroup.org/osdu/platform/system/partition/-/issues/19
MS CloudTableClient has not timeouts
2021-10-01T11:44:22Z
Dmitrii Gerashchenko
MS TableStorage's client, CloudTableClient, uses default timeout settings.
The client can try to connect to the MS server for up to 2 minutes: 3 retry attempts with a 30-second delay between attempts.
The MaximumExecutionTime is null.
I created a dummy server and tested the case where MS TableStorage responds slowly. The client has no response timeout, so it could block indefinitely.
The client doesn't throw errors when TableStorage latency is long, which could be the cause of 504 errors for API consumers.
It also means that we don't see any exceptions even when MS TableStorage responds slowly.
Dmitrii Gerashchenko
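
One generic way to keep a caller from blocking indefinitely on a slow backend is to put a deadline around the call. The sketch below uses only JDK classes and does not touch CloudTableClient or its request options; `BoundedCall` and `callWithDeadline` are hypothetical names, and the supplier stands in for the TableStorage read.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the Partition service.
final class BoundedCall {
    private BoundedCall() {}

    // Runs the call on another thread and waits at most `deadline` for its result.
    static <T> T callWithDeadline(Supplier<T> call, Duration deadline) throws TimeoutException {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(call);
        try {
            return future.get(deadline.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException timeout) {
            // Marks the future as cancelled. The underlying call is not interrupted
            // and may keep running on its worker thread.
            future.cancel(true);
            throw timeout;
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", interrupted);
        } catch (ExecutionException failed) {
            throw new IllegalStateException("backend call failed", failed.getCause());
        }
    }
}
```

A caller would see a `TimeoutException` once the deadline passes, which makes slow responses visible as errors instead of silent waits. Because the abandoned call still occupies a thread, a client-level timeout, where the client offers one, remains the more complete fix.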