ADR: Pagination Query API
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Background
Paginating over large query results is a common discovery workflow. The Search service query API can return at most 10K records; anything beyond that requires the Search service's query_with_cursor API (POST /api/search/v2/query_with_cursor). As OSDU Data Platform adoption has increased over milestone releases, users have repeatedly reported problems (Issues: 130, 156 etc.) with the reliability and performance of the query_with_cursor API. Some of the most common issues reported:
- During deep pagination over a large result set, the API may throw an error midway, and users have to start over. This can be a very time-consuming and costly exercise.
- By default, each data-partition can have a maximum of 500 active cursors; if this limit is reached, the API throws an exception. Users have repeatedly complained that even with light usage this quota gets exhausted and they cannot make a new cursor API call.
- The calculation of cursor count per Search service request is opaque. One Search service cursor request can potentially consume many cursors on the Search backend (Elasticsearch). It is very hard to give users any guidance on how many concurrent cursor requests can be made on a data-partition.
- The cursor quota is a soft limit and can potentially be increased to mitigate the issue. A quota increase will affect Search backend resource usage, which can then degrade Search and Indexing latencies. Any resolution to the latency requires scaling Search backend resources, thus increasing infrastructure and licensing cost.
Context & Scope
We looked at solutions to the issues reported in the previous section and found only two choices:
- Accept that we cannot reliably scroll over large result sets and drop support for scrolling beyond 10K records.
- Provide a new Search service API that utilizes the search_after API from the Search backend (Elasticsearch).
We cannot limit the maximum number of records that can be fetched from the Search service, as it may break existing consumer workflows. The Search service must provide a reliable and performant API that allows scrolling over all records in a response, irrespective of their count.
The search_after API does not suffer from the reliability issues that users have reported, and Elasticsearch recommends it in place of the cursor/scroll API. The Search service should add a new API that makes use of the search_after API from Elasticsearch.
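To make the backend call concrete, the request body that a search_after based endpoint would send to Elasticsearch can be sketched as below. This is a minimal sketch and not the Search service implementation: the function name `build_search_after_body`, the default `keep_alive` of one minute, and the default page size are assumptions made for illustration. The `pit`, `sort`, and `search_after` fields are the ones the Elasticsearch search API uses for this style of paging.

```python
from typing import Any, Optional


def build_search_after_body(
    query: dict[str, Any],
    pit_id: str,
    sort: list[dict[str, str]],
    search_after: Optional[list[Any]] = None,
    size: int = 1000,
    keep_alive: str = "1m",
) -> dict[str, Any]:
    """Build an Elasticsearch search body that pages with a PIT and search_after.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    body: dict[str, Any] = {
        "size": size,
        "query": query,
        # The point-in-time id pins the page sequence to one consistent view of the index.
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        # search_after requires a sort; its values identify where the next page starts.
        "sort": sort,
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page; omitted on the first page.
        body["search_after"] = search_after
    return body
```

The first page omits `search_after`; every later page passes the sort values of the last hit of the previous page, so no server-side scroll context has to be held open per request.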
Proposed solution
Search service should add two new endpoints to support pagination:
- A new endpoint to paginate via the search_after API from Elasticsearch.
- A new endpoint to free up pagination resources if the next page is not needed.
API specification
openapi: 3.0.0
info:
  description: Search service
  version: 2.0.0
  title: Search Service APIs
tags:
  - name: Search
    description: Service endpoints to search data in OSDU Data Platform
security:
  - bearer: []
paths:
  /pagination-query:
    post:
      tags:
        - Search
      summary: Queries using the input request criteria.
      description: "The API supports full text search on string fields, range queries on date, numeric or string fields, along with geo-spatial search. Required
        roles: 'users.datalake.viewers' or 'users.datalake.editors' or 'users.datalake.admins'. In addition, users must be a member of data
        groups to access the data. It can be used to retrieve large numbers of results (or even all results) from a single search request, in much the
        same way as you would use a cursor on a traditional database. The API responds with `nextCursor` if the result count exceeds the maximum page size (1K). To request
        the next page, send another request to the same API with the `nextCursor` value from the last response supplied as `cursor`. All other fields on the next pagination-query
        request must be the same, and the request must be received by the service before the cursor expires (defaults to 60s)."
      operationId: Pagination query
      parameters:
        - $ref: "#/components/parameters/data-partition-id"
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/PaginationQueryRequest"
      responses:
        "200":
          description: Success
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PaginationQueryResponse"
        "400":
          description: Invalid parameters were given on request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "401":
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "403":
          description: User not authorized to perform the action
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "502":
          description: Search service scale-up is taking longer than expected. Wait 10
            seconds and retry.
          content:
            application/json:
              schema:
                type: string
      security:
        - bearer: []
  /pagination-query-cursor:
    delete:
      tags:
        - Search
      summary: Pagination resources should be freed up if not used anymore. Deletes pagination query cursor and frees up resources.
      description: "Required roles: 'users.datalake.viewers' or 'users.datalake.editors' or 'users.datalake.admins'."
      operationId: Delete pagination query cursor
      parameters:
        - $ref: "#/components/parameters/data-partition-id"
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/PaginationQueryCursorDeleteRequest"
      responses:
        "200":
          description: Success
        "400":
          description: Invalid parameters were given on request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "401":
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "403":
          description: User not authorized to perform the action
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "404":
          description: Pagination query cursor not found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "502":
          description: Search service scale-up is taking longer than expected. Wait 10
            seconds and retry.
          content:
            application/json:
              schema:
                type: string
      security:
        - bearer: []
components:
  parameters:
    data-partition-id:
      name: data-partition-id
      in: header
      description: desired data partition id
      required: true
      schema:
        type: string
  securitySchemes:
    bearer:
      type: apiKey
      name: Authorization
      in: header
  schemas:
    PaginationQueryRequest:
      type: object
      required:
        - kind
      properties:
        kind:
          type: object
          example: The kind of the record to query e.g. "tenant1:test:well:1.0.0" or ["tenant1:test:well:1.0.0", "tenant1:test:well:2.0.0"].
          description: "'kind' to search"
        query:
          type: string
          description: The query string in Lucene query string syntax.
        returnedFields:
          type: array
          description: The fields on which to project the results.
          items:
            type: string
        sort:
          $ref: "#/components/schemas/SortQuery"
        queryAsOwner:
          type: boolean
          example: false
          description: The queryAsOwner switches between viewer and owner to return results
            that you are entitled to view or results you are the owner of.
        spatialFilter:
          $ref: "#/components/schemas/SpatialFilter"
        cursor:
          type: string
          description: Search context to retrieve next batch of results. It must be empty for the first request and subsequent requests must provide valid 'cursor'.
        trackTotalCount:
          type: boolean
          description: Tracks accurate record count matching the query if 'true', partial count otherwise. Partial count queries are more performant. Default is 'false' and returns 10000 if more than 10000 records match.
      example:
        kind: osdu:welldb:wellbore:1.0.0
        limit: 30
        query: data.Basin:"Ft. Worth"
        returnedFields:
          - data.kind
        queryAsOwner: false
        cursor: <put a valid cursor or leave it blank for the first request>
    PaginationQueryResponse:
      type: object
      properties:
        nextCursor:
          type: string
          description: Search context to retrieve next batch of results. It's valid for 60s. Next pagination request must be received before it expires.
        results:
          type: array
          items:
            type: object
            additionalProperties:
              type: object
        totalCount:
          type: integer
          format: int64
          description: Returns accurate count if 'trackTotalCount' is 'true', partial count otherwise. When partial count is requested, returns 10000 if more than 10000 records match.
    PaginationQueryCursorDeleteRequest:
      type: object
      properties:
        cursor:
          type: string
          description: Valid cursor for clean-up. Request must be received before cursor expiration.
    ByBoundingBox:
      type: object
      required:
        - bottomRight
        - topLeft
      properties:
        topLeft:
          $ref: "#/components/schemas/Point"
        bottomRight:
          $ref: "#/components/schemas/Point"
    ByDistance:
      type: object
      required:
        - point
      properties:
        distance:
          type: number
          format: double
          example: 1500
          description: The radius of the circle centered on the specified location. Points
            which fall into this circle are considered to be matches.
          minimum: 0
          maximum: 9223372036854776000
        point:
          $ref: "#/components/schemas/Point"
    ByGeoPolygon:
      type: object
      properties:
        points:
          type: array
          description: Polygon defined by a set of points.
          items:
            $ref: "#/components/schemas/Point"
    Point:
      type: object
      properties:
        latitude:
          type: number
          format: double
          example: 37.450727
          description: Latitude of point.
          minimum: -90
          maximum: 90
        longitude:
          type: number
          format: double
          example: -122.174762
          description: Longitude of point.
          minimum: -180
          maximum: 180
    SortQuery:
      type: object
      properties:
        field:
          type: array
          description: The list of fields to sort the results.
          items:
            type: string
        order:
          type: array
          description: The list of orders to sort the results. The element must be either
            ASC or DESC.
          items:
            type: string
    SpatialFilter:
      type: object
      properties:
        field:
          type: string
          description: geo-point field in the index on which filtering will be performed.
            Use GET schema API to find which fields supports spatial search.
        byBoundingBox:
          $ref: "#/components/schemas/ByBoundingBox"
        byDistance:
          $ref: "#/components/schemas/ByDistance"
        byGeoPolygon:
          $ref: "#/components/schemas/ByGeoPolygon"
    AppError:
      type: object
      properties:
        code:
          type: integer
          format: int32
        reason:
          type: string
        message:
          type: string
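From a consumer's point of view, the specification above implies a simple loop: send the request without a cursor, then resend the same request with `cursor` set to the last `nextCursor` until no `nextCursor` comes back. The sketch below illustrates that loop under stated assumptions: `iter_pages` and the `post` callable are hypothetical names, and `post` stands in for whatever HTTP client sends the body to the pagination-query endpoint with the Authorization and data-partition-id headers and returns the parsed JSON response.

```python
from typing import Any, Callable, Iterator

# Hypothetical transport: takes the JSON request body, returns the parsed JSON response.
PostFn = Callable[[dict[str, Any]], dict[str, Any]]


def iter_pages(post: PostFn, request: dict[str, Any]) -> Iterator[list[dict[str, Any]]]:
    """Yield one page of results at a time until the service stops returning nextCursor."""
    body = dict(request)      # copy so the caller's request is left untouched
    body.pop("cursor", None)  # the first request must not carry a cursor
    while True:
        response = post(body)
        yield response.get("results", [])
        next_cursor = response.get("nextCursor")
        if not next_cursor:
            return
        # All other fields stay identical between pages; only the cursor changes.
        body["cursor"] = next_cursor
```

Because the cursor is valid for a limited time (60s by default per the specification), a consumer built this way should request the next page promptly instead of holding a cursor while doing slow per-page work.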
Implementation details on Pagination Query API
The first search_after API call requires a PIT (point in time) id to be created ahead of time and supplied on the search_after call to the Elasticsearch cluster. The Pagination Query API should wrap both of these calls in the first pagination request.
If there is more than one page, the search_after call will respond with the PIT id for the next page and the sort values, along with the results. The PIT id and sort values are required to fetch the next page. The nextCursor attribute of the Pagination Query API response should be set to a value that combines both. The PIT id is pretty long; it can be shortened and cached using the existing hashing function before the response is returned to the end user. The nextCursor attribute can then be set to: shortened(PIT id) + base64.encode(sort values).
When Search receives a next-page request, the pagination-query API will break the cursor down into the PIT id and sort values by the mechanism described above and make the next search_after call.
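The cursor scheme described above can be sketched as an encode/decode pair. This is a minimal sketch under assumptions, not the service's implementation: SHA-256 stands in for the "existing hashing function" (which the ADR does not name), an in-process dict stands in for the cache, and the names `encode_cursor`, `decode_cursor`, and `_pit_cache` are hypothetical. A fixed-length hash prefix is used so that the cursor can be split unambiguously.

```python
import base64
import hashlib
import json
from typing import Any

# Stand-in for the service cache (assumption); a real cache would be shared and expire entries.
_pit_cache: dict[str, str] = {}

_KEY_LENGTH = 64  # length of a SHA-256 hex digest


def encode_cursor(pit_id: str, sort_values: list[Any]) -> str:
    """Return shortened(PIT id) + base64(sort values) and remember the full PIT id."""
    key = hashlib.sha256(pit_id.encode("utf-8")).hexdigest()
    _pit_cache[key] = pit_id
    encoded_sort = base64.urlsafe_b64encode(
        json.dumps(sort_values).encode("utf-8")
    ).decode("ascii")
    return key + encoded_sort


def decode_cursor(cursor: str) -> tuple[str, list[Any]]:
    """Split a cursor back into the full PIT id and the sort values."""
    key, encoded_sort = cursor[:_KEY_LENGTH], cursor[_KEY_LENGTH:]
    pit_id = _pit_cache.get(key)
    if pit_id is None:
        # Unknown or expired cursor; the API would map this to an error response.
        raise KeyError("cursor is unknown or has expired")
    sort_values = json.loads(base64.urlsafe_b64decode(encoded_sort))
    return pit_id, sort_values
```

Keeping the full PIT id on the server side keeps the cursor short, and it also means a cursor whose cache entry has been evicted cannot be replayed against Elasticsearch.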
Consequences
- The existing query_with_cursor API (POST /api/search/v2/query_with_cursor) should be deprecated.
- A new Pagination Query API using the search_after API on Elasticsearch should be introduced.
- A new Delete Pagination Query Cursor API should be implemented.
- The Search service tutorial should be updated with:
  - New APIs documentation
  - Introduction of a 'Best Practices' section with the following suggestions:
    - Migrate users from the query_with_cursor API to the new pagination-query API
    - Remind users to call the DELETE /api/search/v2/pagination-query-cursor API to avoid overloading the system if a cursor is no longer in use or the next page is not needed.
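The clean-up best practice can be illustrated with a consumer that stops paging early and releases the cursor it no longer needs. This is a sketch under assumptions: `find_first`, `post`, and `delete` are hypothetical names, where `post` sends a body to the pagination-query endpoint and `delete` sends a PaginationQueryCursorDeleteRequest body to DELETE /api/search/v2/pagination-query-cursor.

```python
from typing import Any, Callable, Optional

# Hypothetical transports for the two endpoints (assumptions, not a real client library).
PostFn = Callable[[dict[str, Any]], dict[str, Any]]
DeleteFn = Callable[[dict[str, Any]], None]


def find_first(
    post: PostFn,
    delete: DeleteFn,
    request: dict[str, Any],
    predicate: Callable[[dict[str, Any]], bool],
) -> Optional[dict[str, Any]]:
    """Page through results until a record matches, then free the unused cursor."""
    body = dict(request)
    body.pop("cursor", None)  # the first request carries no cursor
    while True:
        response = post(body)
        next_cursor = response.get("nextCursor")
        for record in response.get("results", []):
            if predicate(record):
                if next_cursor:
                    # The next page is not needed, so release the pagination resources.
                    delete({"cursor": next_cursor})
                return record
        if not next_cursor:
            return None  # all pages consumed; there is no cursor left to delete
        body["cursor"] = next_cursor
```

When every page is consumed the last response carries no `nextCursor`, so the delete call is only needed on early exit.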