Use ElasticSearch search_after to replace the implementation of query with cursor

Zhibin Mai requested to merge search_after into master

Type of change

  • Bug Fix
  • Feature

Please provide link to gitlab issue or ADR(Architecture Decision Record)
ADR Provide a new implementation for pagination with cursor to replace the existing implementation of the query with cursor

Does this introduce a change in the core logic?

  • [YES]

Does this introduce a change in the cloud provider implementation, if so which cloud?

  • AWS
  • Azure
  • Google Cloud
  • IBM

Does this introduce a breaking change?

  • [NO]

What is the current behavior?

The existing query with cursor implementation is limited by the maximum number of open cursors (500 by default). ElasticSearch throws an exception once that limit is reached, and applications using the current query with cursor API have no control over this kind of error.

What is the new/expected behavior?

The new feature (using ElasticSearch search_after to support pagination) shares the same input and output signature as the current query with cursor. It uses a different ElasticSearch feature, search_after, to implement the same functionality, which avoids the limit on open cursors and requires fewer resources.
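The idea behind search_after pagination can be illustrated with a minimal in-memory sketch. It does not call ElasticSearch and is not the code in this MR; the `Doc`, `Page`, and `search` names are hypothetical, and the sort on a timestamp with the document id as a tiebreaker is an assumption made for illustration. The point it shows is that each page is requested with the sort values of the previous page's last hit, so no cursor has to be kept open on the server between requests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SearchAfterSketch {
    // Hypothetical document: a sort field plus a unique id used as a tiebreaker.
    static final class Doc {
        final long timestamp;
        final String id;
        Doc(long timestamp, String id) { this.timestamp = timestamp; this.id = id; }
    }

    // One page of hits plus the last hit, whose sort values seed the next request.
    static final class Page {
        final List<Doc> hits;
        final Doc lastHit; // null when the page is empty
        Page(List<Doc> hits, Doc lastHit) { this.hits = hits; this.lastHit = lastHit; }
    }

    // Total order over documents; the unique id makes the order deterministic.
    static final Comparator<Doc> ORDER =
            Comparator.comparingLong((Doc d) -> d.timestamp).thenComparing(d -> d.id);

    // Returns up to `size` documents that sort strictly after `searchAfter`
    // (pass null for the first page). No state is kept between calls.
    static Page search(List<Doc> index, Doc searchAfter, int size) {
        List<Doc> sorted = new ArrayList<>(index);
        sorted.sort(ORDER);
        List<Doc> hits = new ArrayList<>();
        for (Doc d : sorted) {
            if (searchAfter != null && ORDER.compare(d, searchAfter) <= 0) {
                continue; // already returned on an earlier page
            }
            hits.add(d);
            if (hits.size() == size) {
                break;
            }
        }
        Doc last = hits.isEmpty() ? null : hits.get(hits.size() - 1);
        return new Page(hits, last);
    }

    public static void main(String[] args) {
        List<Doc> index = List.of(
                new Doc(3, "c"), new Doc(1, "a"), new Doc(2, "b"),
                new Doc(2, "d"), new Doc(5, "e"));
        Doc after = null;
        while (true) {
            Page page = search(index, after, 2);
            if (page.hits.isEmpty()) {
                break; // no more results
            }
            for (Doc d : page.hits) {
                System.out.println(d.timestamp + " " + d.id);
            }
            after = page.lastHit; // the client carries this value, not the server
        }
    }
}
```

In a real service the value carried between requests would presumably be encoded into the cursor string returned to the caller, so that the API signature stays unchanged; how this MR does that is not shown here.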

Have you added/updated Unit Tests and Integration Tests?

  1. We replicated the unit tests from ScrollCoreQueryServiceImplTest.java in SearchAfterQueryServiceImplTest.java to demonstrate the compatibility between the two implementations.
  2. We created a way to share the integration tests between query with cursor and query with search_after to further demonstrate the compatibility between the two implementations.

Any other useful information

  1. The revised query with search_after is implemented on top of the Java client. Thanks to @Stanislav_Riabokon for making that happen.
  2. We created a way for query with search_after to reuse the integration tests of query with cursor. Any change to the integration tests of query with cursor will also apply to query with search_after.
  3. If query with search_after works as expected, both in functionality and in overcoming the limitation of query with cursor, we should replace the current query with cursor with this implementation, with the following considerations:
  • The adoption of query with search_after should be transparent to the existing applications of OSDU search.
  • The adoption of query with search_after should be transparent to developers, with no change to the API or documentation.
  • We should be able to run large-scale tests in a production environment using existing applications of OSDU services before the replacement.

The goal of this MR is to find a feasible solution to replace the current implementation of query with cursor without changing the existing API or documentation. We have done functional tests and basic performance tests, but because the query with cursor API is so critical, we prefer to also test in real environments to ensure that the new solution is scalable and reliable. To support tests in real environments, we added two options for enabling the new implementation:

  1. Query parameter. This is used by test scripts or applications that are aware of the query parameter to run tests on a small scale.
  2. Feature flag. This lets all applications of a given data partition participate in the tests transparently on a large scale.
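
How the two options could combine is sketched below. This is only an illustration under stated assumptions, not the code in this MR: the class name, the `useSearchAfter` parameter, and the per-partition set standing in for the feature flag are all hypothetical, and the rule that either option alone enables search_after is an assumption.

```java
import java.util.Set;

public class CursorImplementationSelector {
    enum Implementation { SCROLL, SEARCH_AFTER }

    // Hypothetical stand-in for a feature flag store: the data partitions
    // for which the search_after implementation is switched on.
    private final Set<String> enabledPartitions;

    CursorImplementationSelector(Set<String> enabledPartitions) {
        this.enabledPartitions = enabledPartitions;
    }

    // useSearchAfter models an optional query parameter (null when absent).
    // Assumption: either the parameter or the partition flag is enough to opt in.
    Implementation select(String dataPartitionId, Boolean useSearchAfter) {
        if (Boolean.TRUE.equals(useSearchAfter)) {
            return Implementation.SEARCH_AFTER; // small-scale, per-request opt-in
        }
        if (enabledPartitions.contains(dataPartitionId)) {
            return Implementation.SEARCH_AFTER; // large-scale, per-partition opt-in
        }
        return Implementation.SCROLL; // existing behavior stays the default
    }

    public static void main(String[] args) {
        CursorImplementationSelector selector =
                new CursorImplementationSelector(Set.of("partition-a"));
        System.out.println(selector.select("partition-a", null));
        System.out.println(selector.select("partition-b", true));
        System.out.println(selector.select("partition-b", null));
    }
}
```

Keeping the existing implementation as the default means applications that neither pass the parameter nor belong to a flagged partition see no change.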
Edited by Zhibin Mai
