Commit 3df66f9d authored 8 months ago by Zhibin Mai
Update preview feature doc
parent 20c6bb26
1 merge request: !808 Search text with special characters '_' and '.'
Pipeline #278967 failed 8 months ago
docs/docs/PreviewFeatures.md (33 additions, 0 deletions)
@@ -36,6 +36,39 @@ the Partition Service is applied to the solution. Here is an example to enable t
If the property "index-augmenter-enabled" is not created or the property value is set to "false" (String type) in the
given data partition, the configurations defined as type IndexPropertyPathConfiguration will be ignored and index extension will be disabled.
## Search text with special characters '_' and '.'
The OSDU indexer and search services use the Elasticsearch default analyzer (also called the standard analyzer) to analyze
unstructured text when it is indexed and searched. Because of the way the standard analyzer tokenizes unstructured text,
certain high-value searches on unstructured content are very difficult, if not impossible. For example, when
users want to find a file named `1-ABC_Seismic_Report.pdf`, they cannot search with one or two keywords
from the file name, such as "abc", "seismic", or "report", nor can they search on the pdf extension to find all pdf files.
Users can't even use a wildcard such as `*seismic*`, because a leading wildcard is not supported. They would
have to search with an exact match or, at a minimum, `ABC_Seismic*` if they want to use wildcards.
In the [ADR](doc:https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/186), we propose a
change to extend the Elasticsearch standard analyzer to treat two additional special characters as word delimiters:
- underscore `_`
- dot `.`. It will be handled like the character `,`. Note that the Elasticsearch standard analyzer does not treat `,`
as a word delimiter when it is part of a number string, e.g. `1,663m`. In this proposal, `.` will be processed in a
similar way: in `-999.25` or `10.88`, the `.` won't be treated as a word delimiter.
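
The delimiter rules described above can be approximated with a short Python sketch. This is only an illustration of the intended tokenization, not the analyzer configuration used by the indexer: the function name `tokenize`, the regular expressions, and the simplified handling of whitespace and hyphens are all assumptions made for this example.

```python
import re

# Hypothetical approximation of the proposed rules (not Elasticsearch code):
#   - '_' always separates words
#   - '.' and ',' separate words unless they sit between two digits
_EXTRA_DELIMITERS = re.compile(r"_|(?<!\d)[.,]|[.,](?!\d)")

# Simplification: only whitespace and hyphens are modelled as the
# delimiters the analyzer already applies.
_BASE_DELIMITERS = re.compile(r"[\s\-]+")


def tokenize(text: str) -> list[str]:
    """Return lowercase tokens under the approximated delimiter rules."""
    spaced = _EXTRA_DELIMITERS.sub(" ", text)
    return [token.lower() for token in _BASE_DELIMITERS.split(spaced) if token]


if __name__ == "__main__":
    print(tokenize("1-ABC_Seismic_Report.pdf"))
    print(tokenize("-999.25"))
    print(tokenize("1,663m"))
```

Under these assumptions the file name breaks into the separate keywords "abc", "seismic", "report" and "pdf", while number strings such as `-999.25` and `1,663m` keep their `.` and `,`.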
To reduce the risks of re-indexing (e.g. work interruption), this solution is managed with a feature flag that
is set through the Partition Service. The following example enables the feature by setting the property
"custom-index-analyzer-enabled" in a given data partition:
```
{
    "custom-index-analyzer-enabled": {
        "sensitive": false,
        "value": "true"
    }
}
```
If the property "custom-index-analyzer-enabled" is not created or the property value is set to "false" (String type) in the
given data partition, the default index analyzer will be applied to indexing and search.
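
The flag semantics described above can be sketched as follows. This is a minimal illustration, assuming the partition properties are already available as a Python dictionary shaped like the JSON example; the helper name `is_custom_analyzer_enabled` is hypothetical and is not part of the indexer or the Partition Service API. Treating any value other than the string "true" as disabled is also an assumption.

```python
FLAG_NAME = "custom-index-analyzer-enabled"


def is_custom_analyzer_enabled(partition_properties: dict) -> bool:
    """Return True only when the flag exists and its value is the string "true".

    Hypothetical helper: a missing property, or a value of "false",
    selects the default index analyzer.
    """
    prop = partition_properties.get(FLAG_NAME)
    if not isinstance(prop, dict):
        return False  # property not created in this data partition
    return prop.get("value") == "true"  # the value is a String, not a boolean


if __name__ == "__main__":
    enabled = {FLAG_NAME: {"sensitive": False, "value": "true"}}
    print(is_custom_analyzer_enabled(enabled))
    print(is_custom_analyzer_enabled({}))
```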
## Index AsIngestedCoordinates
Source: [issue 95](https://community.opengroup.org/osdu/platform/system/indexer-service/-/issues/95)
...
...