Schema issues
https://community.opengroup.org/osdu/platform/system/schema-service/-/issues
2022-09-29T13:41:56Z

https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/39
Add APIs to perform syntax and content validation
2022-09-29T13:41:56Z
Alan Henson

Ingestion workflows perform two types of validation as it relates to schema definitions:
- Is the content syntactically correct according to the specified `kind`'s schema definition?
- Is the intent of the content correct (e.g., does cited data exist)?
These validations are currently implemented in Python, as libraries or Airflow DAG operators. The logic would be better exposed as a set of API endpoints, giving external toolsets easy access to it. The types of validation performed are described in this [guide](https://community.opengroup.org/osdu/platform/testing/-/blob/master/R3%20Pre%20Ship/Manifest%20Ingestion/R3%20Manifest%20Ingestion%20Quick%20Test%20Guide.docx) under the `How the Manifest Ingestion workflow functions` section.
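As a rough illustration of the two checks, the sketch below separates a syntactic check (required fields and their types) from a content check (cited record IDs exist). The schema layout, the `validate_syntax` and `validate_content` functions, and the sample record are all hypothetical; this is not the logic used by the ingestion DAGs or by any OSDU library.

```python
from typing import Any

# Hypothetical, highly simplified schema: field name -> expected Python type.
WELL_SCHEMA: dict[str, type] = {"id": str, "kind": str, "parent_ids": list}


def validate_syntax(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return syntax errors: missing fields or fields of the wrong type."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors


def validate_content(record: dict[str, Any], known_ids: set[str]) -> list[str]:
    """Return content errors: cited record IDs that do not exist."""
    return [
        f"unknown reference: {ref}"
        for ref in record.get("parent_ids", [])
        if ref not in known_ids
    ]


# Sample record made up for this example.
record = {
    "id": "well-2",
    "kind": "osdu:wks:master-data--Well:1.0.0",
    "parent_ids": ["field-9"],
}
print(validate_syntax(record, WELL_SCHEMA))   # [] (syntactically valid)
print(validate_content(record, {"field-1"}))  # ['unknown reference: field-9']
```

A record can pass the first check and still fail the second, which is why the two are kept as separate functions here.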
The validation logic for content adhering to schema definitions should exist as synchronous API endpoints.

https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/96
Add /validate endpoint to help non-core-platform developers
2022-09-29T13:41:07Z
Eric Schoen

OSDU application developers working outside of the OSDU core platform do not necessarily have access to OSDU's schema toolchain to ensure that their schemas are correct, consistent with OSDU abstract types, and usable by the indexer.
- When working with a schema in DEVELOPMENT status, the schema service enforces no validation in the POST or PUT /schema endpoint. This allows an invalid schema to be installed into OSDU.
- Subsequently, the storage service PUT /records endpoint will accept data for that (possibly invalid) schema kind without complaint.
- The application is not aware that the indexer can't index the data because the schema is invalid, and an attempt to reindex data of the invalid schema kind will produce a 500 error from the indexer with no explanation.
If the schema service provided a /validate endpoint that is effectively a dry run for the POST/PUT /schema endpoint, much of this uncertainty could be removed, and development time lost to simple errors could be saved.

https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/99
Poor performance of schema info list endpoint and uniqueness check
2022-08-24T11:00:46Z
Rustam Lotsmanenko (EPAM) rustam_lotsmanenko@epam.com

There is a performance issue with the get schema info list request and with the uniqueness check in the schema creation process:
~~~
curl --location --request GET 'localhost:8080/api/schema-service/v1/schema?authority=SchemaSanityTest' \
--header 'Data-Partition-Id: osdu'
~~~
This request accepts the `offset` and `limit` parameters. That is fine when the parameters are applied at the data access layer,
but in the Schema service they are applied at the core level by design:
https://community.opengroup.org/osdu/platform/system/schema-service/-/blob/master/schema-core/src/main/java/org/opengroup/osdu/schema/service/serviceimpl/SchemaService.java#L305
This logic is also used during schema creation: the same methods verify schema uniqueness and check whether breaking changes are present.
This leads to loading a lot of unwanted data. For example, the query shown in the example fetches information for over 6500 schemas from the GCP dev environment, but by default all except 100 records are discarded in the core service before the response is returned.
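The cost difference can be sketched as follows. `InMemoryProvider` is a hypothetical stand-in for a provider-level data store that counts how many records it hands back, and neither function reproduces the actual `SchemaService` code.

```python
class InMemoryProvider:
    """Hypothetical stand-in for a provider-level data store."""

    def __init__(self, records):
        self._records = list(records)
        self.fetched = 0  # how many records left the data layer

    def find_all(self):
        self.fetched += len(self._records)
        return list(self._records)

    def find_page(self, offset, limit):
        page = self._records[offset:offset + limit]
        self.fetched += len(page)
        return page


def list_paging_in_core(provider, offset=0, limit=100):
    # Pattern described above: load everything, then discard most of it.
    return provider.find_all()[offset:offset + limit]


def list_paging_in_provider(provider, offset=0, limit=100):
    # Alternative: let the data layer apply offset and limit.
    return provider.find_page(offset, limit)


schemas = [f"schema-{i}" for i in range(6500)]

core = InMemoryProvider(schemas)
list_paging_in_core(core)
print(core.fetched)    # 6500

pushed = InMemoryProvider(schemas)
list_paging_in_provider(pushed)
print(pushed.fetched)  # 100
```

Both functions return the same page; only the amount of data pulled out of the store differs.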
These issues were previously spotted in the GCP and Azure environments. To fix GCP, we manually delete schemas created by ITs; my guess is that the Azure team does the same:
https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/76
https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/70
The suggested fix is to pass the `limit` and `offset` parameters to the provider level and use them there directly.
Andrei Dalhikh [EPAM/GC]

https://community.opengroup.org/osdu/platform/system/schema-service/-/issues/108
Schema service must not allow creation of schema with different case
2022-08-23T15:53:06Z
Neelesh Thakur

Schema service uses `kind` as the identifier of the schema for a data set. Kinds that differ only in casing usually belong to the same data set, and the casing carries no other meaning. This confuses end users who consume records with differently cased kinds from the Search service. Moreover, consumption services like Search use the kind as the index name. Its backend (Elasticsearch) does not honor casing for index names, which causes problems during index creation.
Data Definition does not provide any rules for kind casing and delegates this governance to the Schema service. The Schema service should therefore not allow creation of a schema whose kind differs from an existing kind only by case.
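One way to picture such a rule is a uniqueness check on a case-folded key. The sketch below is only an illustration: `SchemaRegistry` and `DuplicateKindError` are hypothetical names, the store is an in-memory dict, and the sample kinds are made up for the example.

```python
class DuplicateKindError(ValueError):
    """Raised when a kind collides with an existing one, ignoring case."""


class SchemaRegistry:
    """Hypothetical in-memory registry keyed by a case-folded kind."""

    def __init__(self):
        self._by_folded_kind = {}

    def create(self, kind, schema):
        # Compare kinds case-insensitively, but keep the original spelling.
        key = kind.casefold()
        existing = self._by_folded_kind.get(key)
        if existing is not None:
            raise DuplicateKindError(
                f"kind {kind!r} conflicts with existing kind {existing[0]!r}"
            )
        self._by_folded_kind[key] = (kind, schema)


registry = SchemaRegistry()
registry.create("osdu:wks:master-data--Well:1.0.0", {"type": "object"})
try:
    registry.create("OSDU:wks:Master-Data--Well:1.0.0", {"type": "object"})
except DuplicateKindError as err:
    print(err)  # the second kind is rejected as a case-only variant
```

Storing the original spelling alongside the folded key keeps the error message readable while the comparison itself ignores case.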
Related Search service issue: [94](https://community.opengroup.org/osdu/platform/system/search-service/-/issues/94)