Use ElasticSearch search_after to replace the implementation of query with cursor
Type of change
- Bug Fix
- Feature
Please provide link to gitlab issue or ADR(Architecture Decision Record)
ADR: Provide a new implementation for pagination with cursor to replace the existing implementation of the query with cursor
Does this introduce a change in the core logic?
- [YES]
Does this introduce a change in the cloud provider implementation, if so which cloud?
- AWS
- Azure
- Google Cloud
- IBM
Does this introduce a breaking change?
- [NO]
What is the current behavior?
The existing implementation of `query with cursor` is limited by the maximum number of open cursors (500 by default). ElasticSearch throws an exception once the limit is reached, and applications using the current `query with cursor` API have no control over this kind of error.
What is the new/expected behavior?
The new feature (using ElasticSearch `search_after` to support pagination) shares the input and output signature of the current `query with cursor`. It implements the same functionality with a different ElasticSearch feature, `search_after`, in order to avoid the limit on the maximum number of open cursors and to require fewer resources.
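To illustrate why `search_after` needs no server-side cursor, here is a minimal in-memory sketch of the idea: each page request carries the sort values of the last hit already seen, and the next page is simply everything that sorts strictly after it. This is not the code in this MR and it does not call ElasticSearch; the `Hit` type, its `timestamp` and `id` fields, and the method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory illustration of search_after-style paging; no ElasticSearch client is involved.
final class SearchAfterSketch {

    // A hit with one sort field plus a unique id used as tiebreaker (both hypothetical).
    static final class Hit {
        final long timestamp;
        final String id;

        Hit(long timestamp, String id) {
            this.timestamp = timestamp;
            this.id = id;
        }
    }

    // Total order over hits: timestamp first, then id to break ties.
    static final Comparator<Hit> ORDER =
            Comparator.comparingLong((Hit h) -> h.timestamp).thenComparing(h -> h.id);

    // Returns up to pageSize hits that sort strictly after searchAfter (null means first page).
    static List<Hit> search(List<Hit> index, Hit searchAfter, int pageSize) {
        List<Hit> sorted = new ArrayList<>(index);
        sorted.sort(ORDER);
        List<Hit> page = new ArrayList<>();
        for (Hit hit : sorted) {
            if (searchAfter != null && ORDER.compare(hit, searchAfter) <= 0) {
                continue; // already returned on an earlier page
            }
            if (page.size() == pageSize) {
                break;
            }
            page.add(hit);
        }
        return page;
    }

    // Drains the whole index page by page; the only state kept is the last hit seen.
    static List<String> collectAllIds(List<Hit> index, int pageSize) {
        List<String> ids = new ArrayList<>();
        Hit cursor = null;
        while (true) {
            List<Hit> page = search(index, cursor, pageSize);
            if (page.isEmpty()) {
                break;
            }
            for (Hit hit : page) {
                ids.add(hit.id);
            }
            cursor = page.get(page.size() - 1);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> index = List.of(
                new Hit(3, "c"), new Hit(1, "a"), new Hit(2, "b"),
                new Hit(2, "d"), new Hit(5, "e"));
        System.out.println(collectAllIds(index, 2)); // prints [a, b, d, c, e]
    }
}
```

Because the caller holds the position, nothing has to stay open on the server between pages, which is the property that avoids the open-cursor limit. The unique tiebreaker matters: without it, hits with equal sort values could be skipped or repeated across pages.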
Have you added/updated Unit Tests and Integration Tests?
- We replicated the unit tests from `ScrollCoreQueryServiceImplTest.java` in `SearchAfterQueryServiceImplTest.java` to demonstrate the compatibility between these two implementations.
- We created a way to share the integration tests between `query with cursor` and `query with search_after` to further demonstrate the compatibility between these two implementations.
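One way to picture shared tests is a single contract check that runs unchanged against both implementations. The sketch below is only an assumption about the general shape, not the test code in this MR: the `CursorQuery` interface, the `Result` type, and the two toy implementations (`offsetBased` standing in for the existing cursor, `keyBased` for `search_after`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract shared by two paging implementations; not the MR's test code.
final class SharedCursorContract {

    // One page of ids plus the cursor for the next call (null when there is no more data).
    static final class Result {
        final List<String> ids;
        final String cursor;

        Result(List<String> ids, String cursor) {
            this.ids = ids;
            this.cursor = cursor;
        }
    }

    // Same signature for both implementations: pass null to start, then the returned cursor.
    interface CursorQuery {
        Result next(String cursor);
    }

    // Toy stand-in for the existing cursor: the cursor encodes an offset.
    static CursorQuery offsetBased(List<String> sortedIds, int pageSize) {
        return cursor -> {
            int from = cursor == null ? 0 : Integer.parseInt(cursor);
            int to = Math.min(from + pageSize, sortedIds.size());
            String next = to < sortedIds.size() ? Integer.toString(to) : null;
            return new Result(new ArrayList<>(sortedIds.subList(from, to)), next);
        };
    }

    // Toy stand-in for search_after: the cursor is the last id already returned.
    static CursorQuery keyBased(List<String> sortedIds, int pageSize) {
        return cursor -> {
            List<String> page = new ArrayList<>();
            for (String id : sortedIds) {
                if (cursor != null && id.compareTo(cursor) <= 0) {
                    continue; // already returned
                }
                if (page.size() == pageSize) {
                    // More data remains, so hand back the last id as the next cursor.
                    return new Result(page, page.get(page.size() - 1));
                }
                page.add(id);
            }
            return new Result(page, null);
        };
    }

    // The shared check: drain any implementation through the common interface.
    static List<String> drain(CursorQuery query) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        do {
            Result result = query.next(cursor);
            all.addAll(result.ids);
            cursor = result.cursor;
        } while (cursor != null);
        return all;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c", "d", "e");
        boolean same = drain(offsetBased(ids, 2)).equals(drain(keyBased(ids, 2)));
        System.out.println("implementations agree: " + same);
    }
}
```

Since `drain` only depends on the common interface, any change to the shared check applies to both implementations at once.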
Any other useful information
- The revised `query with search_after` is implemented on top of the `java client`. Thanks to @Stanislav_Riabokon for making that happen.
- We created a way for `query with search_after` to reuse the integration tests of `query with cursor`. Any change to the integration tests of `query with cursor` will also apply to `query with search_after`.
- If `query with search_after` works as expected, both functionally and in overcoming the limitation of `query with cursor`, we should replace the current `query with cursor` with this implementation, with the following considerations:
  - The adoption of `query with search_after` should be transparent to the existing applications of OSDU search.
  - The adoption of `query with search_after` should be transparent to developers, with no change to the API or documentation.
  - We should be able to conduct large-scale tests in a production environment using existing applications of OSDU services before the replacement.
The goal of this MR is to find a feasible solution to replace the current implementation of `query with cursor` without changing the existing API or documentation. Although we ran functional tests and basic performance tests, the `query with cursor` API is critical, so we prefer to run tests in a real environment to ensure that the new solution is scalable and reliable. To support tests in real environments, we added two options for enabling the new implementation:
- Query parameter. This is used by test scripts or applications that are aware of the query parameter to run small-scale tests.
- Feature flag. This lets all applications of a given data partition participate in the tests transparently at large scale.
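The two options above could be combined along the following lines. This is a rough sketch under stated assumptions, not the routing code in this MR: the class name, the `useSearchAfter` parameter name, and the rule that either switch alone enables the new implementation are all hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical router that picks the implementation behind the query with cursor API.
final class CursorQueryRouter {

    enum Implementation { SCROLL, SEARCH_AFTER }

    // Assumed query parameter name; the real name is not given in this description.
    static final String QUERY_PARAM = "useSearchAfter";

    // Data partitions for which the (assumed) feature flag is switched on.
    private final Set<String> flaggedPartitions;

    CursorQueryRouter(Set<String> flaggedPartitions) {
        this.flaggedPartitions = new HashSet<>(flaggedPartitions);
    }

    // Assumption: either the query parameter or the partition's feature flag is sufficient.
    Implementation choose(String dataPartitionId, Map<String, String> queryParams) {
        // parseBoolean(null) is false, so a missing parameter leaves the default in place.
        boolean requestedByCaller = Boolean.parseBoolean(queryParams.get(QUERY_PARAM));
        boolean enabledForPartition =
                dataPartitionId != null && flaggedPartitions.contains(dataPartitionId);
        return (requestedByCaller || enabledForPartition)
                ? Implementation.SEARCH_AFTER
                : Implementation.SCROLL;
    }

    public static void main(String[] args) {
        CursorQueryRouter router = new CursorQueryRouter(Set.of("partition-a"));
        System.out.println(router.choose("partition-a", Map.of()));                       // SEARCH_AFTER
        System.out.println(router.choose("partition-b", Map.of()));                       // SCROLL
        System.out.println(router.choose("partition-b", Map.of(QUERY_PARAM, "true")));    // SEARCH_AFTER
    }
}
```

With this shape, a test script opts in per request, while the flag moves every caller of a partition to the new implementation without any client change.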