# Search issues

---

**Issue 160: [GCP] Local running of unit tests is unsuccessful** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/160, 2024-03-12, Riabokon Stanislav (EPAM))

When attempting to execute JUnit tests in the local environment for the core part of the search service, we observe 11 unsuccessful tests:
```
Results :
Tests in error:
testQueryIndex_whenNoCursorInSearchQueryAndSearchHitsIsEmpty(org.opengroup.osdu.search.provider.impl.ScrollCoreQueryServiceImplTest): Error processing search request
testQueryIndex_whenSearchGives500_thenThrowException(org.opengroup.osdu.search.provider.impl.ScrollCoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<org.junit.ComparisonFailure>
testQueryBase_whenClientSearchResultsInElasticsearchStatusException_statusServiceUnavailable_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_whenClientSearchResultsInElasticsearchStatusException_statusNotFound_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_whenClientSearchResultsInElasticsearchStatusException_statusBadRequest_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_whenClientSearchResultsInElasticsearchStatusException_statusTooManyRequests_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_SocketTimeoutException_ListenerTimeout_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_whenUnsupportedSortRequested_statusBadRequest_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_IOException_ListenerTimeout_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
testQueryBase_IOException_RespopnseTooLong_throwsException(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Unexpected exception, expected<org.opengroup.osdu.core.common.model.http.AppException> but was<java.lang.AssertionError>
should_return_CorrectQueryResponseforIntersectionSpatialFilter(org.opengroup.osdu.search.provider.impl.CoreQueryServiceImplTest): Error processing search request
Tests run: 200, Failures: 0, Errors: 11, Skipped: 10
```
It seems likely that these issues share a common underlying cause:
```java
if (!autocompleteFeatureFlag.isFeatureEnabled(AUTOCOMPLETE_FEATURE_NAME) || suggestPhrase == null || suggestPhrase == "") {
    return null;
}
```
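As a side note (likely unrelated to the missing stub reported here), the quoted guard compares `suggestPhrase` to `""` with `==`, which in Java is reference equality rather than content equality, so a runtime-built empty string slips past it. A small stdlib-only demonstration (class and method names are illustrative, not the service's code):

```java
public class StringGuardDemo {
    // Mirrors the quoted guard: reference comparison against the "" literal.
    static boolean skipWithReferenceEquality(String s) {
        return s == null || s == "";
    }

    // Content-based check that catches any empty string.
    static boolean skipWithIsEmpty(String s) {
        return s == null || s.isEmpty();
    }

    public static void main(String[] args) {
        String built = new String("");                         // empty, but a distinct object
        System.out.println(skipWithReferenceEquality(built));  // false: == misses it
        System.out.println(skipWithIsEmpty(built));            // true
    }
}
```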
For example, 'should_return_CorrectQueryResponseforIntersectionSpatialFilter'
```
@Test(expected = AppException.class)
public void testQueryBase_IOException_RespopnseTooLong_throwsException() throws IOException {
    IOException exception = mock(IOException.class);
    doReturn(new ContentTooLongException(null)).when(exception).getCause();
    doReturn("dummyMessage").when(exception).getMessage();
    doThrow(exception).when(client).search(any(), any(RequestOptions.class));
    try {
        sut.queryIndex(searchRequest);
    } catch (AppException e) {
        int errorCode = 413;
        String errorMessage = "Elasticsearch response is too long, max is 100Mb";
        validateAppException(e, errorCode, errorMessage);
        throw (e);
    }
}
```
It appears that there is a missing property or stub for the 'AUTOCOMPLETE_FEATURE_NAME' feature.
Besides, I could not find tests in https://community.opengroup.org/osdu/platform/system/search-service/-/merge_requests/624

Milestone: M23 - Release 0.26

---

**Issue 157: ADR: Pagination Query API** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/157, 2024-02-27, Neelesh Thakur)

<a name="TOC"></a>
[[_TOC_]]
# Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
# Background
Paginating over a large query result is a common discovery workflow. The Search service query API can return a maximum of 10K records; anything higher than this requires the Search service's `query_with_cursor` API (`POST /api/search/v2/query_with_cursor`). As OSDU Data Platform adoption has increased over milestone releases, users have repeatedly complained (Issues: [130](https://community.opengroup.org/osdu/platform/system/search-service/-/issues/130), [156](https://community.opengroup.org/osdu/platform/system/search-service/-/issues/156), etc.) about the reliability and performance of the `query_with_cursor` API. The most commonly reported issues:
- During deep pagination over a large result set, the API may throw an error midway, and users have to start over. This can be a very time-consuming and costly exercise.
- By default, each data-partition can have a maximum of `500` active cursors; if this limit is reached, the API throws an exception. Users have repeatedly complained that even with light usage this quota gets exhausted and they cannot make new cursor API calls.
- The cursor count consumed per Search service request is opaque. One Search service cursor request can potentially consume many cursors on the Search backend (Elasticsearch), so it is very hard to give users guidance on how many concurrent cursor requests can be made on a data-partition.
- The cursor quota is a soft limit and could potentially be increased to mitigate the issue, but a quota increase impacts Search backend resource usage, which can then degrade Search and Indexing latencies. Any resolution to that latency requires scaling Search backend resources, thus increasing infrastructure and licensing cost.
# Context & Scope
In looking at solutions to the issues reported in the earlier section, we found there are only two choices:
- Accept that we cannot reliably scroll over a large result set, and drop support for scrolling over more than 10K records.
- Provide a new Search service API that utilizes the [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API from the Search backend (Elasticsearch).

We cannot limit the maximum number of records that can be fetched from the Search service, as that may break existing consumer workflows. The Search service must provide a reliable and performant API that allows scrolling over all records in a response, irrespective of their count.
The [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API does not suffer from the reliability issues that users have reported, and Elasticsearch recommends it in place of the cursor/scroll API. The Search service should add a new API that makes use of the [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API from Elasticsearch.
[Back to TOC](#TOC)
# Proposed solution
The Search service should add two new endpoints to support pagination:
- New endpoint to paginate via [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API from Elasticsearch.
- New endpoint to free up pagination resources if next page is not needed.
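The request contract of the proposed pagination endpoint can be illustrated with a minimal sketch: the first call carries no `cursor`, and each follow-up repeats the same query fields plus the `nextCursor` returned by the previous response. This is only an illustration of the shape defined in the API specification below; the field values and the hand-built JSON are made up for the example.

```java
public class PaginationRequests {
    // First request: no cursor.
    static String firstPage(String kind, int limit) {
        return "{\"kind\":\"" + kind + "\",\"limit\":" + limit + "}";
    }

    // Follow-up requests: identical query fields plus the returned nextCursor.
    static String nextPage(String kind, int limit, String nextCursor) {
        return "{\"kind\":\"" + kind + "\",\"limit\":" + limit +
                ",\"cursor\":\"" + nextCursor + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(firstPage("osdu:welldb:wellbore:1.0.0", 30));
        System.out.println(nextPage("osdu:welldb:wellbore:1.0.0", 30, "abc123"));
    }
}
```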
<details>
<summary>API specification</summary>
```yaml
openapi: 3.0.0
info:
  description: Search service
  version: 2.0.0
  title: Search Service APIs
tags:
  - name: Search
    description: Service endpoints to search data in OSDU Data Platform
security:
  - bearer: []
paths:
  /pagination-query:
    post:
      tags:
        - Search
      summary: Queries using the input request criteria.
      description: "The API supports full text search on string fields, range
        queries on date, numeric or string fields, along with geo-spatial search.
        Required roles: 'users.datalake.viewers' or 'users.datalake.editors' or
        'users.datalake.admins'. In addition, users must be a member of data
        groups to access the data. It can be used to retrieve large numbers of
        results (or even all results) from a single search request, in much the
        same way as you would use a cursor on a traditional database. The API
        will respond with `nextCursor` if results are higher than the maximum
        page size (1K). To request the next page, another request to the same
        API that includes the `nextCursor` value from the last response must be
        supplied. All other fields on the next pagination-query request must be
        the same, and the request should be received by the service before the
        cursor expires (defaults to 60s)."
      operationId: Pagination query
      parameters:
        - $ref: "#/components/parameters/data-partition-id"
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/PaginationQueryRequest"
      responses:
        "200":
          description: Success
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PaginationQueryResponse"
        "400":
          description: Invalid parameters were given on request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "401":
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "403":
          description: User not authorized to perform the action
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "502":
          description: Search service scale-up is taking longer than expected. Wait 10
            seconds and retry.
          content:
            application/json:
              schema:
                type: string
      security:
        - bearer: []
  /pagination-query-cursor:
    delete:
      tags:
        - Search
      summary: Deletes the pagination query cursor and frees up resources. Pagination
        resources should be freed up when they are no longer used.
      description: "Required roles: 'users.datalake.viewers' or 'users.datalake.editors'
        or 'users.datalake.admins'."
      operationId: Delete pagination query cursor
      parameters:
        - $ref: "#/components/parameters/data-partition-id"
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/PaginationQueryCursorDeleteRequest"
      responses:
        "200":
          description: Success
        "400":
          description: Invalid parameters were given on request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "401":
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "403":
          description: User not authorized to perform the action
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "404":
          description: Pagination query cursor not found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/AppError"
        "502":
          description: Search service scale-up is taking longer than expected. Wait 10
            seconds and retry.
          content:
            application/json:
              schema:
                type: string
      security:
        - bearer: []
components:
  parameters:
    data-partition-id:
      name: data-partition-id
      in: header
      description: desired data partition id
      required: true
      schema:
        type: string
  securitySchemes:
    bearer:
      type: apiKey
      name: Authorization
      in: header
  schemas:
    PaginationQueryRequest:
      type: object
      required:
        - kind
      properties:
        kind:
          type: object
          example: The kind of the record to query e.g. "tenant1:test:well:1.0.0"
            or ["tenant1:test:well:1.0.0", "tenant1:test:well:2.0.0"].
          description: "'kind' to search"
        query:
          type: string
          description: The query string in Lucene query string syntax.
        returnedFields:
          type: array
          description: The fields on which to project the results.
          items:
            type: string
        sort:
          $ref: "#/components/schemas/SortQuery"
        queryAsOwner:
          type: boolean
          example: false
          description: The queryAsOwner switches between viewer and owner to return
            results that you are entitled to view or results you are the owner of.
        spatialFilter:
          $ref: "#/components/schemas/SpatialFilter"
        cursor:
          type: string
          description: Search context to retrieve the next batch of results. It must
            be empty for the first request, and subsequent requests must provide a
            valid 'cursor'.
        trackTotalCount:
          type: boolean
          description: Tracks the accurate record count matching the query if 'true',
            a partial count otherwise. Partial-count queries are more performant.
            Defaults to 'false', which returns 10000 if more than 10000 records match.
      example:
        kind: osdu:welldb:wellbore:1.0.0
        limit: 30
        query: data.Basin:"Ft. Worth"
        returnedFields:
          - data.kind
        queryAsOwner: false
        cursor: <put a valid cursor or leave it blank for the first request>
    PaginationQueryResponse:
      type: object
      properties:
        nextCursor:
          type: string
          description: Search context to retrieve the next batch of results. It is
            valid for 60s; the next pagination request must be received before it
            expires.
        results:
          type: array
          items:
            type: object
            additionalProperties:
              type: object
        totalCount:
          type: integer
          format: int64
          description: Returns the accurate count if 'trackTotalCount' is 'true', a
            partial count otherwise. If a partial count is requested, returns 10000
            when more than 10000 records match.
    PaginationQueryCursorDeleteRequest:
      type: object
      properties:
        cursor:
          type: string
          description: Valid cursor for clean-up. The request must be received before
            the cursor expires.
    ByBoundingBox:
      type: object
      required:
        - bottomRight
        - topLeft
      properties:
        topLeft:
          $ref: "#/components/schemas/Point"
        bottomRight:
          $ref: "#/components/schemas/Point"
    ByDistance:
      type: object
      required:
        - point
      properties:
        distance:
          type: number
          format: double
          example: 1500
          description: The radius of the circle centered on the specified location.
            Points which fall into this circle are considered to be matches.
          minimum: 0
          maximum: 9223372036854776000
        point:
          $ref: "#/components/schemas/Point"
    ByGeoPolygon:
      type: object
      properties:
        points:
          type: array
          description: Polygon defined by a set of points.
          items:
            $ref: "#/components/schemas/Point"
    Point:
      type: object
      properties:
        latitude:
          type: number
          format: double
          example: 37.450727
          description: Latitude of point.
          minimum: -90
          maximum: 90
        longitude:
          type: number
          format: double
          example: -122.174762
          description: Longitude of point.
          minimum: -180
          maximum: 180
    SortQuery:
      type: object
      properties:
        field:
          type: array
          description: The list of fields to sort the results.
          items:
            type: string
        order:
          type: array
          description: The list of orders to sort the results. The element must be
            either ASC or DESC.
          items:
            type: string
    SpatialFilter:
      type: object
      properties:
        field:
          type: string
          description: The geo-point field in the index on which filtering will be
            performed. Use the GET schema API to find which fields support spatial
            search.
        byBoundingBox:
          $ref: "#/components/schemas/ByBoundingBox"
        byDistance:
          $ref: "#/components/schemas/ByDistance"
        byGeoPolygon:
          $ref: "#/components/schemas/ByGeoPolygon"
    AppError:
      type: object
      properties:
        code:
          type: integer
          format: int32
        reason:
          type: string
        message:
          type: string
```
</details>
### Implementation details on Pagination Query API
The first use of the [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API requires a [PIT](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/point-in-time-api.html) id to be created ahead of time and supplied on the [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) call to the Elasticsearch cluster. The Pagination Query API should wrap both of these calls in the first pagination request.
If there is more than one page, the [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) call will respond with the PIT id of the next page and the sort values, along with the results. The PIT id and sort values are required to fetch the next page, so the Pagination Query API response's `nextCursor` attribute should be set to a value that combines both. The PIT id is quite long; it can be shortened and cached using the [existing hashing function](https://community.opengroup.org/osdu/platform/system/search-service/-/blame/7b522a79df7b4c23fabe61e5026671c31fae876a/provider/search-azure/src/main/java/org/opengroup/osdu/search/provider/azure/provider/impl/ScrollQueryServiceImpl.java#L190) before returning the response to the end user. The `nextCursor` attribute can then be set to: shortened(PIT id) + base64.encode(sort values).
When Search receives the next page request, the pagination-query API will break the cursor back down into the PIT id and sort values by the above-mentioned mechanism and make the next [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) call.
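The cursor packing described above can be sketched with the stdlib alone. This is a hypothetical illustration, not the service's actual implementation: the class name, the `:` delimiter, and the use of URL-safe base64 are all assumptions.

```java
import java.util.Base64;

public class CursorCodec {
    // Pack a shortened PIT id and the raw sort values into one opaque cursor.
    static String encode(String shortenedPitId, String sortValues) {
        return shortenedPitId + ":" +
                Base64.getUrlEncoder().encodeToString(sortValues.getBytes());
    }

    // Split the cursor back into { pitId, sortValues } for the next search_after call.
    static String[] decode(String cursor) {
        int sep = cursor.indexOf(':');
        String pitId = cursor.substring(0, sep);
        String sortValues = new String(Base64.getUrlDecoder().decode(cursor.substring(sep + 1)));
        return new String[] { pitId, sortValues };
    }

    public static void main(String[] args) {
        String cursor = encode("a1b2c3", "[1703067759404129,\"rec-42\"]");
        String[] parts = decode(cursor);
        System.out.println(parts[0] + " | " + parts[1]); // a1b2c3 | [1703067759404129,"rec-42"]
    }
}
```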
[Back to TOC](#TOC)
# Consequences
- Existing `query_with_cursor` API (POST /api/search/v2/query_with_cursor) should be deprecated.
- New Pagination Query API using [search_after](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html) API on Elasticsearch should be introduced.
- New Delete Pagination Query Cursor API should be implemented.
- Search service tutorial should be updated with:
  - New APIs documentation
  - Introduction of a 'Best Practices' section with the following suggestions:
    - Migrate users from the query_with_cursor API to the new pagination-query API
    - Remind users to call the `DELETE /api/search/v2/pagination-query-cursor` API to avoid overloading the system if a cursor is no longer in use or the next page is not needed.
[Back to TOC](#TOC)

---

**Issue 156: query_with_cursor quota exhausts too easily** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/156, 2024-02-22, An Ngo)

When making a few query_with_cursor requests to the search service, it was too easy to reach the ES scroll-contexts quota (500 scroll contexts).
```shell
curl -X POST \
  '/search/v2/query_with_cursor' \
  --header 'accept: */*' \
  --header 'data-partition-id: <partitionid>' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{"kind": "*:*:*:*", "limit": 1}'
```
The above request returns a 429 error code on the third call.
```
{
  "code": 429,
  "reason": "Too many requests",
  "message": "Too many cursor requests, please re-try after some time."
}
```

---

**Issue 154: Align the Search Service Code Base with OSDU Platform Development Principles** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/154, 2024-01-28, Rustam Lotsmanenko (EPAM))

# ADR: Move Code Duplicates from CSP Modules to the Core Module

Enhance Search service maintenance, align with the future ElasticSearch 8 migration, and minimize the effort needed for introducing the Community implementation by reducing code duplication in CSP modules.
## Status
- [x] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
The Search service contains duplicated code for constructing Elasticsearch queries within the CSP modules, in classes such as QueryBase.java and QueryServiceImpl.java. These redundancies add complexity to code maintenance without offering visible benefits. The query builders contain no CSP-specific code; moreover, differences have emerged in these classes over time:
https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/provider/search-azure/src/main/java/org/opengroup/osdu/search/provider/azure/provider/impl/QueryBase.java
https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/provider/search-aws/src/main/java/org/opengroup/osdu/search/provider/aws/provider/impl/QueryBase.java
https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/provider/search-gc/src/main/java/org/opengroup/osdu/search/provider/gcp/provider/impl/QueryBase.java
https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/provider/search-ibm/src/main/java/org/opengroup/osdu/search/provider/ibm/provider/impl/QueryBase.java?ref_type=heads
## Decision
Identify the delta in the search service across different providers. If significant differences exist, prioritize the most advanced version and move it to the Core module. For instance, we previously migrated the optimized geo query builders from the Azure provider to the core: https://community.opengroup.org/osdu/platform/system/search-service/-/merge_requests/556 Following the same principle, we can eliminate the other existing duplicates.
## Rationale
Aside from the current increased cost and complexity of maintenance, we have at least two major tasks pending in the Search service:
- The migration to ElasticSearch 8 will require migrating all Elasticsearch query builders. Currently, the required effort will increase proportionally with the number of providers. https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/111
- For the Community Implementation of the Search service, selecting a version of Query Builders will be required. It could be a copy of the GC provider, which diverges from OSDU Development principles (like all providers currently), or a robust, reusable solution. https://gitlab.opengroup.org/osdu/pmc/community-implementation/-/issues/9
## Consequences
* Removal of code duplicates in provider modules.
* Introduction of a consolidated ElasticSearch query builder in the core module.
* Potential impact on features currently in development due to substantial codebase changes.
## Tradeoff Analysis
While this won't break API behavior, it could be seen as disruptive in the development process due to significant codebase changes. Tweaks and improvements made by CSPs to their modules might be overlooked during refactoring if not captured through integration testing.
## Alternatives and implications
An alternative to the current ADR involves relocating the code duplicates to the Community Implementation module rather than the Core, designated for use in the Community implementation of the Search service. However, this would require developers to support five modules if new features are introduced or if the migration to ElasticSearch 8 begins.

Milestone: M23 - Release 0.26. Assignee: Rustam Lotsmanenko (EPAM)

---

**Issue 152: ADR: Ability to get all the records of a given Persisted Collection from search** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/152, 2024-01-11, Juilee Paluskar)

## Status
* [x] Proposed
* [ ] Trialing
* [ ] Under review
* [ ] Approved
* [ ] Retired
## Context & Scope
A persisted collection can aggregate objects of different natures, including master data, work-product-component, and reference data; it can contain a collection of records of heterogeneous kinds. At a given point, the MemberIDs field of a PersistedCollection maintains the list of objects which are part of the collection.
More can be read in this [schema](https://community.opengroup.org/osdu/data/data-definitions/-/blob/master/Generated/work-product-component/PersistedCollection.1.2.0.json).
Problem:
Today, there is no way to get all the records which belong to a particular Persisted Collection; the user has to perform at least 2 search queries to get them.
1st Query - Get the Persisted Collection record and retrieve the record IDs from the MemberIDs field.
2nd Query - Get the actual records from the record IDs retrieved in the 1st query.
For the 2nd query, to get multiple records in 1 search query, the user has to form a query with the **OR** operator. E.g.
{
"Query" : recordId-1 **OR** recordId-2 **OR** recordId-3 .. recordId-1000
}
Here Elasticsearch has a limit on the maximum number of **OR** conditions in 1 query.
So if a PersistedCollection contains more than 1000 records, the user has to invoke multiple search queries to get all the records.
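The multi-query workaround described above can be sketched as follows. The batch size of 1000 and the `id:("..." OR "...")` query shape are illustrative assumptions, not confirmed limits or syntax of the service.

```java
import java.util.ArrayList;
import java.util.List;

public class MemberIdBatcher {
    // Split member IDs into batches below the backend's OR-clause limit and
    // build one query string per batch.
    static List<String> buildQueries(List<String> memberIds, int batchSize) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < memberIds.size(); i += batchSize) {
            List<String> batch = memberIds.subList(i, Math.min(i + batchSize, memberIds.size()));
            queries.add("id:(\"" + String.join("\" OR \"", batch) + "\")");
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) ids.add("osdu:record:" + i);
        System.out.println(buildQueries(ids, 1000).size()); // 3 queries for 2500 ids
    }
}
```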
## Possible Solution
One of the possible solution to address this requirement could be adding Persisted Collection record id in the record's data. So whenever records get added to the Persisted Collection , record's data should be updated with the information of Persisted Collection id.
This could be done by listening to record change event for PersistedCollection kind .
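The listener idea above can be sketched like this. Everything here is a hypothetical illustration of the flow, not an actual OSDU interface: the in-memory record store stands in for the Storage service, and the `PersistedCollectionID` field name is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionMembershipUpdater {
    // Stand-in for Storage service records: record id -> its data map.
    static Map<String, Map<String, Object>> records = new HashMap<>();

    // On a PersistedCollection change event, stamp each member record's data
    // with the collection id so Search could index and filter on it.
    static void onCollectionChanged(String collectionId, List<String> memberIds) {
        for (String id : memberIds) {
            records.computeIfAbsent(id, k -> new HashMap<>())
                   .put("PersistedCollectionID", collectionId);
        }
    }

    public static void main(String[] args) {
        onCollectionChanged("osdu:wpc--PersistedCollection:pc-1",
                List.of("osdu:rec:1", "osdu:rec:2"));
        System.out.println(records.get("osdu:rec:1").get("PersistedCollectionID"));
    }
}
```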
## Consequences
* This will help users get the records of a Persisted Collection in a single go.
* This will help users get the records to which they have access.
* This will help users form queries to get desired records from a PersistedCollection, such as "Give all the records of a persisted collection where data.\<someproperty\> is \<xyz\>", in one go.
* This will help users make filters based on different objects in the collection.

---

**Issue 142: Search service does not return all null objects when the key object datatype has value not equal to string** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/142, 2023-12-20, Naufal Mohamed Noori)

The Search service should return all object keys, returning them as null if the object is not populated during ingestion/storage insertion. However, we found that keys with a non-string value type in the schema definition are not returned from the search at all, as opposed to keys with a string value type, where a non-populated key is returned as 'null'.
For example, I have ingested the following payload in R3M20 AWS preship:
```json
{
  "runId": "{{$guid}}",
  "executionContext": {
    "acl": {
      "viewers": [
        "data.default.viewers@osdu.example.com"
      ],
      "owners": [
        "data.default.owners@osdu.example.com"
      ]
    },
    "legal": {
      "legaltags": [
        "osdu-public-usa-dataset"
      ],
      "otherRelevantDataCountries": [
        "US"
      ]
    },
    "Payload": {
      "AppKey": "test-app",
      "data-partition-id": "osdu"
    },
    "manifest": {
      "kind": "osdu:wks:Manifest:1.0.0",
      "Data": {
        "WorkProductComponents": [
          {
            "id": "osdu:work-product-component--SeismicTraceData:TEST_ISSUE_1",
            "kind": "osdu:wks:work-product-component--SeismicTraceData:1.4.0",
            "acl": {
              "viewers": [
                "data.default.viewers@osdu.example.com"
              ],
              "owners": [
                "data.default.owners@osdu.example.com"
              ]
            },
            "legal": {
              "legaltags": [
                "osdu-public-usa-dataset"
              ],
              "otherRelevantDataCountries": [
                "US"
              ]
            },
            "data": {
              "Name": "TESTT_ISSUE",
              "StartTime": 0,
              "EndTime": 10000
            },
            "meta": [
              {
                "kind": "Unit",
                "name": "ms",
                "persistableReference": "{\"abcd\":{\"a\":0.0,\"b\":0.001,\"c\":1.0,\"d\":0.0},\"symbol\":\"ms\",\"baseMeasurement\":{\"ancestry\":\"T\",\"type\":\"UM\"},\"type\":\"UAD\"}",
                "propertyNames": [
                  "StartTime",
                  "EndTime",
                  "SampleCount"
                ],
                "unitOfMeasureID": "osdu:reference-data--UnitOfMeasure:ms:"
              }
            ]
          }
        ]
      }
    }
  }
}
```
In the search query response, data.SampleCount and other non-string keys do not show up. Look at the sample below: traceDomainUoM (type: string) is visible as null, but traceLength (type: number) is not visible:
```json
{
  "results": [
    {
      "data": {
        "SpatialArea.QuantitativeAccuracyBandID": null,
        "VirtualProperties.DefaultLocation.QuantitativeAccuracyBandID": null,
        "LiveTraceOutline.CoordinateQualityCheckPerformedBy": null,
        "SpatialArea.SpatialParameterTypeID": null,
        "Difference": null,
        "ResourceCurationStatus": null,
        "SpatialArea.SpatialGeometryTypeID": null,
        "SortOrderID": null,
        "IsExtendedLoad": null,
        "Name": "TESTT_ISSUE",
        "SeismicFilteringTypeID": null,
        "VirtualProperties.DefaultName": "TESTT_ISSUE",
        "VirtualProperties.DefaultLocation.CoordinateQualityCheckPerformedBy": null,
        "ResourceSecurityClassification": null,
        "VerticalMeasurementTypeID": null,
        "SeismicStackingTypeID": null,
        "ExistenceKind": null,
        "ProcessingProjectID": null,
        "SeismicDomainTypeID": null,
        "Preferred2DInterpretationSetID": null,
        "HorizontalCRSID": null,
        "SeismicAttributeTypeID": null,
        "BinGridID": null,
        "SeismicProcessingStageTypeID": null,
        "StartTime": 0.0,
        "LiveTraceOutline.SpatialParameterTypeID": null,
        "SpatialArea.QualitativeSpatialAccuracyTypeID": null,
        "SpatialPoint.SpatialGeometryTypeID": null,
        "LiveTraceOutline.QuantitativeAccuracyBandID": null,
        "LiveTraceOutline.SpatialGeometryTypeID": null,
        "IsDiscoverable": null,
        "SeismicWaveTypeID": null,
        "Precision.WordFormat": null,
        "VirtualProperties.DefaultLocation.QualitativeSpatialAccuracyTypeID": null,
        "SubmitterName": null,
        "TraceDomainUOM": null,
        "SeismicPhaseID": null,
        "SpatialPoint.QualitativeSpatialAccuracyTypeID": null,
        "PrincipalAcquisitionProjectID": null,
        "GatherTypeID": null,
        "Description": null,
        "Phase": null,
        "EndTime": 10.0,
        "TimeLapse.TimeSeriesID": null,
        "ResourceLifecycleStatus": null,
        "SeismicTraceDataDimensionalityTypeID": null,
        "TechnicalAssuranceID": null,
        "VirtualProperties.DefaultLocation.SpatialGeometryTypeID": null,
        "Source": null,
        "SeismicLineGeometryID": null,
        "LiveTraceOutline.QualitativeSpatialAccuracyTypeID": null,
        "SpatialPoint.CoordinateQualityCheckPerformedBy": null,
        "TraceRelationFileID": null,
        "Polarity": null,
        "SpatialPoint.SpatialParameterTypeID": null,
        "SpatialPoint.QuantitativeAccuracyBandID": null,
        "SpatialArea.CoordinateQualityCheckPerformedBy": null,
        "SeismicPolarityID": null,
        "Seismic2DName": null,
        "VirtualProperties.DefaultLocation.SpatialParameterTypeID": null,
        "ResourceHomeRegionID": null,
        "Preferred3DInterpretationSetID": null,
        "SeismicMigrationTypeID": null
      },
      "kind": "osdu:wks:work-product-component--SeismicTraceData:1.4.0",
      "source": "wks",
      "acl": {
        "viewers": [
          "data.default.viewers@osdu.example.com"
        ],
        "owners": [
          "data.default.owners@osdu.example.com"
        ]
      },
      "type": "work-product-component--SeismicTraceData",
      "version": 1703067759404129,
      "tags": {
        "normalizedKind": "osdu:wks:work-product-component--SeismicTraceData:1"
      },
      "modifyUser": "admin-main@testing.com",
      "modifyTime": "2023-12-20T10:22:40.387Z",
      "createTime": "2023-12-20T10:17:21.256Z",
      "authority": "osdu",
      "namespace": "osdu:wks",
      "legal": {
        "legaltags": [
          "osdu-public-usa-dataset"
        ],
        "otherRelevantDataCountries": [
          "US"
        ],
        "status": "compliant"
      },
      "createUser": "serviceprincipal-main@testing.com",
      "id": "osdu:work-product-component--SeismicTraceData:TEST_ISSUE_1"
    }
  ],
  "aggregations": null,
  "totalCount": 1
}
```
We hope the search index will return keys consistently regardless of their value type, whether string, boolean, number, or integer.
cc @debasisc

Milestone: M20 - Release 0.23

---

**Issue 141: What is the best way to figure out "unit of measure" of any specific field from Search response?** (https://community.opengroup.org/osdu/platform/system/search-service/-/issues/141, 2023-12-13, Debasis Chatterjee)

Please see this test case from an earlier release.
https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M16/Test_Plan_Results_M16/Manifest_Ingestion/M16-AWS-Manifest-Ingestion-Unit-convert-Debasis.txt
Here the field CableLength had the value 6000 ft in the ingestion payload (JSON file).
We added this field in the meta block and wanted conversion from ft to meter.
Hence the Search response shows the correctly converted value, 1828.8 meters (as expected).
This becomes clear when we see initial JSON payload by using Storage service and later when we retrieve the record by using Search service.
But the question is - if someone looks at Search service alone, what is the clue?
- To know that a field has undergone conversion?
- And to know what it has been converted to and what is the matching unit of measure for its current value?
Checking Schema service, I can find out this field is using "length" unit of measure.
[schema-SAS.txt](/uploads/6552b357ddc855c7120e3abc63d7c155/schema-SAS.txt)
```
"CableLength": {
"description": "Total length of receiver array",
"x-osdu-frame-of-reference": "UOM:length",
"type": "number"
},
```
From the Reference data UnitOfMeasure, I can also find out that the base unit of measure for "length" is "metre":
IsBaseUnit=true, UnitDimensionName="length".
From Search, I may include the field "index" to get more information.
That trick is handy when something goes wrong with indexing.
But otherwise it shows status=200.
How do we find out if the user opted to leave values in original unit (foot) and did not bother to convert to SI unit (meter)?
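One partial answer lives in the Storage record rather than in Search: OSDU Storage records carry a `meta[]` block that records frame-of-reference annotations such as units, and a Search response normally does not include it. The sketch below is a hedged illustration, not service code; the record shape is a minimal assumption based on OSDU Storage conventions, and the values are invented.

```python
# Sketch: inspect a Storage record's `meta` block to see which data
# properties carry a unit frame-of-reference. Search responses do not
# normally include `meta`, which is why this is hard to answer from
# Search alone. Record shape and values are assumptions for illustration.
def unit_annotations(record: dict) -> dict:
    """Map each annotated property name to its declared unit symbol."""
    out = {}
    for item in record.get("meta") or []:
        if item.get("kind") == "Unit":
            for prop in item.get("propertyNames", []):
                out[prop] = item.get("name")
    return out

record = {  # hypothetical record, values invented for illustration
    "data": {"CableLength": 1828.8},
    "meta": [{"kind": "Unit", "name": "m", "propertyNames": ["CableLength"]}],
}
print(unit_annotations(record))  # {'CableLength': 'm'}
```

A client that needs the unit of a Search result would therefore have to fetch the same record via Storage and correlate `meta.propertyNames` with the field names it got from Search.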
Copying to Mark Chance ( @Java1Guy ) as he is currently working on Search service enhancements
Also copying to @nthakur and @gehrmann for their inputs.

---
**Issue 140: Tutorial: Search by kind Guidance**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/140 (Thomas Gehrmann [slb], 2024-01-08)

# [Query by kind](https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/docs/tutorial/SearchService.md?ref_type=heads#query-by-kind)
The tutorial promotes searching for specific **_versions_** of schemas, which is not a good idea. In recent milestones the number of minor schema versions as well as patch versions have grown considerably.
* The tutorial should recommend wildcards for minor and patch versions.
* Using specific schema versions in query by `kind` will cause serious trouble when data records are schema-migrated, updated or newly ingested using, e.g., the [preferred schema version recommendation (Schema Usage Guide)](https://community.opengroup.org/osdu/data/data-definitions/-/blob/master/Guides/Chapters/93-OSDU-Schemas.md#appendix-d5-schema-version-managementconfiguration).
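The recommendation can be illustrated with two hypothetical query bodies (kind values invented): the pinned form silently stops matching once records migrate to a newer schema version, while the wildcard form keeps working.

```python
import fnmatch

# Two hypothetical query bodies contrasting the styles. Pinning the full
# version breaks once records move to a newer minor/patch schema version;
# wildcarding minor and patch keeps matching them.
pinned = {"kind": "osdu:wks:master-data--Well:1.0.0", "query": "*"}
wildcard = {"kind": "osdu:wks:master-data--Well:1.*.*", "query": "*"}

def matches(kind_pattern: str, record_kind: str) -> bool:
    """Tiny fnmatch-style stand-in for the server-side kind match."""
    return fnmatch.fnmatch(record_kind, kind_pattern)

print(matches(pinned["kind"], "osdu:wks:master-data--Well:1.1.0"))    # False
print(matches(wildcard["kind"], "osdu:wks:master-data--Well:1.1.0"))  # True
```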
CC @nthakur, @chad

Milestone: M23 - Release 0.26. Assignee: Thomas Gehrmann [slb].

---
**Issue 138: Search APIs don't return content-type in response headers**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/138 (Shane Hutchins, 2023-11-08)

Minor issue:
Most search APIs don't return the Content-Type in the response headers
If it does return it, it doesn't match or is missing in the openapi.json.

---
**Issue 137: Search with offset returns duplicates**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/137 (Bert Kampes, 2023-10-05)

See attached [Search_query_API_issue.docx](/uploads/976b5aca3cdd625f1fa59822b6319415/Search_query_API_issue.docx).
(from attachment)
Search query API issue:
The Search query API has a limit of retrieving only 1000 records. To retrieve all the records from the DB we need to call the search API multiple times until we get all the records.
When the Search API is invoked for the first time (the first iteration), it returns 1000 records.
For the second iteration we pass the offset value, indicating that retrieval should continue from that point.
The iterations continue until all the records are retrieved, with the offset value increased on every subsequent iteration.
But we observed that the Search query API does not work as intended with the offset value: the order of retrieval of records is not maintained.
It returns records which were already returned, so this gives duplicate records.
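The reported behaviour can be reproduced with a toy simulation (record names invented): when the backend's ordering is not stable between requests, offset paging can return some records twice and skip others, while a deterministic total order pages exactly.

```python
import random

# Toy simulation of offset paging over an index whose ordering is not
# guaranteed stable between requests (names and sizes are invented).
RECORDS = [f"rec-{i}" for i in range(10)]

def search(offset, limit, seed):
    """One simulated request; `seed` stands in for the nondeterministic
    merge order a distributed index may produce without a unique sort."""
    view = RECORDS[:]
    random.Random(seed).shuffle(view)
    return view[offset:offset + limit]

# Pages fetched by two *different* requests may overlap and leave gaps:
page1, page2 = search(0, 5, seed=1), search(5, 5, seed=2)
dups = set(page1) & set(page2)
missed = set(RECORDS) - set(page1) - set(page2)
# Every duplicated record implies an equally sized set never returned:
assert len(dups) == len(missed)

# With a stable total order (e.g. sort on a unique id), paging is exact:
stable = sorted(RECORDS)
assert stable[0:5] + stable[5:10] == stable
```

This is why a cursor or a deterministic sort key, rather than a bare offset, is needed for exhaustive retrieval.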
For example, below, when we are trying to retrieve the CRS data using the Search query API...

---
**Issue 136: OpenAPI documentation should specify array of string or string instead of typeless schema for `kind`**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/136 (Håkon Tønnessen, 2023-10-05)

Current documentation does not specify the possible types for the property `kind` in `CursorQueryRequest` and `QueryRequests`:
Currently:
```
CursorQueryRequest:
description: Json object to query the Search API
type: object
required:
- kind
properties:
cursor:
.....
kind:
type: object
description: The kind of the record to query e.g. "tenant1:test:well:1.0.0" or "tenant1:test:well:1.0.0,tenant1:test:well:2.0.0" or ["tenant1:test:well:1.0.0", "tenant1:test:well:2.0.0"].
```
This causes issues when creating data models based on the OpenAPI documentation, as the typeless schema will be interpreted as a dictionary type, while the description says that both string and array of strings are valid.
Specifying the types correctly using `oneOf` instead gives a correct specification:
```
CursorQueryRequest:
description: Json object to query the Search API
type: object
required:
- kind
properties:
cursor:
.....
kind:
oneOf:
  - type: string
  - type: array
    items:
      type: string
description: The kind of the record to query e.g. "tenant1:test:well:1.0.0" or "tenant1:test:well:1.0.0,tenant1:test:well:2.0.0" or ["tenant1:test:well:1.0.0", "tenant1:test:well:2.0.0"].
```
This makes the models unambiguous.
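The three documented `kind` forms can then be handled uniformly by generated clients. A hedged sketch of what such a client-side normalizer could do (the function name is invented, not part of any OSDU SDK):

```python
# Normalize the three documented `kind` forms to a list of kind strings,
# as a client could do once the schema declares oneOf[string, string[]].
def normalize_kind(kind):
    if isinstance(kind, str):
        return [k.strip() for k in kind.split(",")]
    if isinstance(kind, list) and all(isinstance(k, str) for k in kind):
        return list(kind)
    raise TypeError("kind must be a string or a list of strings")

assert normalize_kind("tenant1:test:well:1.0.0") == ["tenant1:test:well:1.0.0"]
assert normalize_kind("tenant1:test:well:1.0.0,tenant1:test:well:2.0.0") == [
    "tenant1:test:well:1.0.0", "tenant1:test:well:2.0.0"]
assert normalize_kind(["tenant1:test:well:1.0.0"]) == ["tenant1:test:well:1.0.0"]
```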
Affects `CursorQueryRequest` and `QueryRequests`, as these need the `kind` parameter.

---
**Issue 135: ADR Provide suggestions for auto-complete of input**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/135 (Mark Chance, 2024-01-15)

# ADR: Autocomplete
<a name="TOC"></a>
[[_TOC_]]
# Status
- [x] Proposed
- [x] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
# Background
Shell application developer stakeholders want to offer their users auto-complete suggestions based on partial input.
# Context & Scope
Based on words occurring in OSDU platform records, a comparison is made to all text tokens occurring in all fields of a record. For this case we propose using the bagOfWords approach described in the indexer [ADR](https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/113)
[Back to TOC](#TOC)
## Requirements
The partial input is passed to the search service and a list of suggestions is returned.
To be useful, the response time must be under 2 seconds.
[Back to TOC](#TOC)
# Tradeoff Analysis
[Back to TOC](#TOC)
# Proposed solution
The search query json will support this syntax:
```json
{
"suggestPhrase": "united"
}
```
Which would return something of the form:
```json
{
"phraseSuggestions": [
"United States",
"United States therm",
"United Kingdom",
"United Kingdom British thermal unit",
"United Kingdom term",
    "United Kingdom nautical mile"
]
}
```
[Back to TOC](#TOC)
# Change Management
* Operators may need to execute reindex with force_clean=true action on indices to enable this feature.
# Decision
# Consequences
* The search code changes will not impact any existing queries or functionality since this is a new field.
[Back to TOC](#TOC)
#EOF.

Milestone: M23 - Release 0.26. Assignee: Mark Chance.

---
**Issue 134: Search should not return 404 in case there are no matching data in Elasticsearch**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/134 (Denis Karpenok (EPAM), 2023-11-08)

**The expected result:**
- When no data matches the query response is 200 OK with an empty list.
**Actual results are:**
- Inconsistent, sometimes it's 200 OK sometimes it's 400.
**Reason:**
- Not all requests to ElasticSearch have parameters to ignore user errors, usually, those are preliminary requests to get details for further search queries, for example: https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/search-core/src/main/java/org/opengroup/osdu/search/service/FieldMappingTypeService.java#L49
**Solution:**
- Suppress all 400 errors from Elasticsearch and respond to the end user only with 200 OK.
**Pros:**
- More consistent workflow for client applications.
- Reduced error handling for client applications.
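The proposed behaviour can be sketched as follows. This is an illustration of the pattern, not the actual Java service code; the class and function names are invented.

```python
# Sketch of the proposal: user-level 4xx errors from preliminary
# Elasticsearch calls are swallowed and the caller gets an empty,
# well-formed 200 payload; 5xx errors still surface.
class BackendError(Exception):
    def __init__(self, status, msg=""):
        super().__init__(msg)
        self.status = status

EMPTY = {"results": [], "aggregations": None, "totalCount": 0}

def run_query(backend_call):
    try:
        return backend_call()
    except BackendError as e:
        if 400 <= e.status < 500:   # user error: respond 200 + empty list
            return dict(EMPTY)
        raise                       # server error: propagate to the caller

def no_mapping():
    raise BackendError(400, "no such field mapping")

print(run_query(no_mapping))  # {'results': [], 'aggregations': None, 'totalCount': 0}
```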
More details are in the attached CSV files:
[test_results_2023-08-29_11-34-31.csv](/uploads/03bf18c852387f4da493aa13b97ad5d3/test_results_2023-08-29_11-34-31.csv)
[test_results_2023-08-29_11-51-20.csv](/uploads/6071b35ea688e57bdf24112198a9ddd7/test_results_2023-08-29_11-51-20.csv)

---
**Issue 133: Elasticsearch licensing**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/133 (Chad Leong, 2024-01-18)

# Problem Statement
Currently, Search service is using Elasticsearch [7.8.1](https://community.opengroup.org/osdu/platform/system/search-service/-/blob/master/pom.xml?ref_type=heads#L33). There is a need to upgrade the version to provide stability, features and performance improvement.
Specifically following the release of version 7.10.2 https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-core/7.10.2 , Elastic has since transitioned its licensing from the Apache 2.0 license to the Server Side Public License (SSPL) for any future versions https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-core.
OSDU software needs to be licensed using Apache 2.0. This change has raised concerns, particularly regarding compatibility issues with client bindings and providing future updates to Elasticsearch.
## Impact
There are 2 components to the search - Elastic client bindings and server.
- Client bindings are integral components of applications that facilitate seamless communication with our Elastic Search Service. These bindings have traditionally been Apache 2.0 compatible. The shift to SSPL raises compatibility concerns, potentially preventing the upgrade of client bindings.
- The Elastic server itself is used as a tool, so we don’t need to worry about Apache compatibility. Server-side upgrades are possible but may encounter a future technical barrier without client-binding upgrades.
## Objective
We need to address this licensing challenge and find an alternative that allows for a smooth transition. We are actively exploring options for an elastic alternative that can bridge the gap between client bindings and server upgrades.
**Option №1** is https://opensearch.org/docs/latest/clients/java/
- https://aws.amazon.com/blogs/opensource/keeping-clients-of-opensearch-and-elasticsearch-compatible-with-open-source/
Pros:
- OpenSearch is an ElasticSearch fork, and fully compatible with v 7.10 see https://opensearch.org/faq/#q1.8. Thus refactoring should be more or less straightforward.
- Easier to preserve existing features.
- It's possible to change clients in Services and keep Elasticsearch as a backend server.
Cons:
- Following-up releases do not guarantee compatibility with ElasticSearch API: https://opensearch.org/faq/#q1.9
Action items:
- Potentially could bind CSPs to ElasticSearch server v 7.10 or force them to switch to OpenSearch server.
- Switch Indexer and Search to use OpenSearch clients.
**Option №2** is an Elasticsearch client with an Apache license https://github.com/elastic/elasticsearch-java/
Pros:
- Possible to keep Elasticsearch as a backend.
- Later we could migrate to Elasticsearch 8.
Cons:
- Could require a bit more thorough migration for Search and Indexer, unlike OpenSearch. Since it's a different lib with different interfaces, we may need to rewrite a lot of code. In the meantime, OpenSearch has a fork of High-level-rest-client https://opensearch.org/docs/latest/clients/java-rest-high-level/ which could simplify migration to just swapping imports.
- Additionally, we should be aware that the Elasticsearch server's licensing could still pose an issue.
Action items:
- Migrate Indexer and Search to use Elasticsearch Apache client.
## Decisions
Option 2 seems to be a better long-term solution, with the possibility of keeping Elasticsearch as a backend. A separate migration strategy has been written here: https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/111

---
**Issue 132: Follow-up from "Add filter to nested sort"**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/132 (Mark Chance, 2023-08-17)

The following discussion from !535 should be addressed:
- [ ] @nthakur started a [discussion](https://community.opengroup.org/osdu/platform/system/search-service/-/merge_requests/535#note_241458): (+2 comments)
> Will filter context work for non-nested scenario? If it does, can you please update non-nested section as well? If it does not, then we should add this in limitation documentation.

Milestone: M20 - Release 0.23. Assignee: Mark Chance.

---
**Issue 130: Search Service does not work with cursors**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/130 (Riabokon Stanislav (EPAM) [GCP], 2024-02-22)

This issue was observed when the GC team was running various requests on Search Service.
for example,
```
curl --location 'https://community.gcp.gnrg-osdu.projects.epam.com/api/search/v2/query_with_cursor' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: osdu' \
--header 'accept: application/json' \
--header 'Authorization: Bearer ey' \
--data '{
"kind": "*:*:*:*",
"query": "data.DatasetProperties.FileSourceInfo.PreloadFilePath: (\"s3://osdu-seismic-test-data*\")",
"trackTotalCount": true
}'
```
with an answer
```
{
"cursor": null,
"results": [],
"totalCount": 0
}
```
Investigation:
**Search Service** will create a request on ElasticSearch:
`POST /*-*-*-*,-.*/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=true&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&scroll=90s&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true`
The parameter **scroll=90s** means we use the Scroll API with a cursor **lifetime of 90 seconds**.
However, a cursor is created for every index, and we can get this error for an index:
`Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.`
After a while, we decided to investigate this issue deeper.
When we check node stats with the next request:
https://gc_elastic_search:9243/_nodes/stats/indices/search
answer was
```
"indices": {
"search": {
"open_contexts": 1,
"query_total": 15271488,
"query_time_in_millis": 9385974,
"query_current": 0,
"fetch_total": 9567767,
"fetch_time_in_millis": 590770,
"fetch_current": 0,
"scroll_total": 4399252,
"scroll_time_in_millis": 7768131243,
"scroll_current": 1,
"suggest_total": 0,
"suggest_time_in_millis": 0,
"suggest_current": 0
}
}
```
As we can see, ElasticSearch has **1 scroll_current**.
Let's run our request again
```
{
"kind": "*:*:*:*",
"query": "data.DatasetProperties.FileSourceInfo.PreloadFilePath: (\"s3://osdu-seismic-test-data*\")",
"trackTotalCount": true
}
```
The answer from node stats was
```
"indices": {
"search": {
"open_contexts": 1193,
"query_total": 15272901,
"query_time_in_millis": 9386238,
"query_current": 0,
"fetch_total": 9567932,
"fetch_time_in_millis": 590779,
"fetch_current": 0,
"scroll_total": 4399329,
"scroll_time_in_millis": 7768231132,
"scroll_current": 1193,
"suggest_total": 0,
"suggest_time_in_millis": 0,
"suggest_current": 0
}
}
```
We will get **"scroll_current": 1193**. Thus, our single request created 1193 cursors with the same ID, one for every index.
Solution:
- try to avoid such requests when we want to use search_with_cursors.
- According to the official Elasticsearch documentation, we have to use
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/paginate-search-results.html#search-after instead of Scroll API.
`We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).`
More details: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/scroll-api.html.
In our case, we would have to support the **search_after** parameter in the Search Service.
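A sketch of what `search_after` pagination could look like. The field names (`createTime`, `id`) are assumptions for illustration; the essential point is a deterministic sort with a unique tie-breaker, carrying the last hit's sort values into the next request.

```python
# Build successive Elasticsearch request bodies for search_after
# pagination. Field names are illustrative; any total, deterministic
# sort with a unique tie-breaker works.
def next_page_body(base_query, page_size=1000, after=None):
    body = {
        "query": base_query,
        "size": page_size,
        "sort": [{"createTime": "asc"}, {"id": "asc"}],
    }
    if after is not None:
        body["search_after"] = after  # sort values of the last hit seen
    return body

first = next_page_body({"match_all": {}})
# pretend the last hit of page 1 had these sort values:
second = next_page_body({"match_all": {}},
                        after=["2023-12-20T10:17:21Z", "osdu:wpc:123"])
assert "search_after" not in first and "search_after" in second
```

Unlike scroll, this keeps no per-request server-side context (a point-in-time can be added when a frozen view of the index is required).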
- Run request
```
curl -i -X PUT \
-H "Authorization:Basic ****" \
-H "data-partition-id:osdu" \
-H "Content-Type:application/json" \
-d \
'{
  "persistent": { "search.max_open_scroll_context": 5000 },
  "transient": { "search.max_open_scroll_context": 5000 }
}' \
'https://elastic_search:9243/_cluster/settings'
```
```
to increase `max_open_scroll_context`.

---
**Issue 123: Search service does not ignore unmapped fields (records without spatial attributes are returned regardless)**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/123 (An Ngo, 2023-03-13)

The following request returns all records in those kinds I can access, but none of them actually has a SpatialLocation attribute.
```
curl --location '<baseUrl>/search/v2/query' \
--header 'data-partition-id: partitionID' \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"kind": "osdu:test:Hello:1.0.0",
"query": "*",
"spatialFilter": {
"field": "data.SpatialLocation.Wgs84Coordinates",
"byIntersection": {
"polygons": [
{
"points": [
{
"longitude": -180,
"latitude": 90
},
{
"longitude": 180,
"latitude": 90
},
{
"longitude": 180,
"latitude": -90
},
{
"longitude": -180,
"latitude": -90
},
{
"longitude": -180,
"latitude": 90
}
]
}
]
}
}
}'
```
However, the following request returns 0 records, which is expected.
```
curl --location '<baseUrl>/search/v2/query' \
--header 'data-partition-id: partitionID' \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"kind": "osdu:test:Hello:1.0.0",
"query": "_exists_:data.SpatialLocation"
}'
```
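Elasticsearch's geo queries accept an `ignore_unmapped` option, which makes indices lacking the spatial mapping contribute no hits instead of failing or matching everything. A hedged sketch of the filter body such a fix implies, mirroring the whole-world polygon from the request above (the builder function is invented for illustration):

```python
# Sketch of an Elasticsearch geo_shape filter with ignore_unmapped: true,
# so records/indices without the spatial field simply do not match.
def geo_intersects(field, points, ignore_unmapped=True):
    ring = [[p["longitude"], p["latitude"]] for p in points]
    return {
        "geo_shape": {
            field: {
                "shape": {"type": "polygon", "coordinates": [ring]},
                "relation": "intersects",
            },
            "ignore_unmapped": ignore_unmapped,
        }
    }

q = geo_intersects(
    "data.SpatialLocation.Wgs84Coordinates",
    [{"longitude": -180, "latitude": 90}, {"longitude": 180, "latitude": 90},
     {"longitude": 180, "latitude": -90}, {"longitude": -180, "latitude": -90},
     {"longitude": -180, "latitude": 90}],
)
assert q["geo_shape"]["ignore_unmapped"] is True
```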
**Fix**: Ignore unmapped fields in Elastic Search.

Milestone: M17 - Release 0.20. Assignee: Neelesh Thakur.

---
**Issue 124: OSDU Search Endpoint Response**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/124 (Rex Von Brixon Apa-ap, 2023-03-13)

If the ingested property is of type "string" and no record is ingested, search returns a response of **None** on that property.
However if the ingested property is non-string and no record is ingested, search does not return anything for that property.
We are expecting the property to reflect with response of **None** even for non-strings.
Documentation of the test: https://community.opengroup.org/osdu/platform/pre-shipping/-/blob/main/R3-M15/Test_Plan_Results_M15/Manifest_Ingestion/M15_AWS_Manifest_Ingestion_custom-schema_Rex.docx

---
**Issue 121: Change default behaviour of spatial queries to use 'intersects' operator**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/121 (Renaud Petit, 2023-03-02)

While I agree with the need to precisely control the type of operator used in spatial queries (see https://community.opengroup.org/osdu/platform/system/search-service/-/issues/80), I realise it could require a large amount of work. I therefore suggest that we change the current implementation to use the 'intersects' operator (rather than 'contains', as seems to be the case).
Intersection appears to be a more useful and natural operator considering the nature of the data we are dealing with in OSDU.
Consider the case where a user requests geopolitical entities, seismic lines, seismic surveys, wells, bore trajectories, etc. within a polygon for a map display. One would expect objects that are only partially inside the polygon to be displayed. Otherwise, it presupposes that the user already knows the geographical nature of what is being looked for.

Assignee: Chad Leong.

---
**Issue 119: Improve bad response error messages**
https://community.opengroup.org/osdu/platform/system/search-service/-/issues/119 (Michael, 2023-05-25)

A generic "Invalid parameters were given on search request" is given when the records requested exceed the 10,000 limit. For example the following request:
```
curl --location --request POST 'https://r3m15.preshiptesting.osdu.aws/api/search/v2/query' \
--header 'Content-Type: application/json' \
--header 'data-partition-id: osdu' \
--header 'Authorization: Bearer ...' \
--data-raw '{
"kind": "osdu:wks:master-data--Well:*",
"returnedFields": [
"id",
"kind",
"data.FacilityID",
"data.FacilityName",
"data.SpatialLocation.Wgs84Coordinates"
],
"sort": {
"field": [
"id"
],
"order": [
"ASC"
]
},
"limit": 1000,
"offset": 9001
}'
```
Response:
```
{
    "code": 400,
    "reason": "Bad Request",
    "message": "Invalid parameters were given on search request"
}
```
For this case it would be good to have a specific error message like "limit and offset parameters exceed the maximum result window".
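The suggested check can be sketched as a small validation step. The 10,000 figure corresponds to Elasticsearch's default `index.max_result_window`; the function name and message wording below are illustrative, not the service's actual code.

```python
# Sketch of the suggested validation: fail fast with an explicit message
# when limit + offset exceeds the backend's result window, instead of the
# generic "Invalid parameters were given on search request".
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default index.max_result_window

def validate_paging(limit: int, offset: int) -> None:
    if limit + offset > MAX_RESULT_WINDOW:
        raise ValueError(
            f"limit ({limit}) plus offset ({offset}) exceeds the maximum "
            f"result window of {MAX_RESULT_WINDOW}; use cursor-based "
            "pagination for deeper result sets")

validate_paging(1000, 9000)       # ok: exactly at the window
try:
    validate_paging(1000, 9001)   # the failing request from above
except ValueError as e:
    print(e)
```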