Investigate use of whitespace analyzer for id fields
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Searchlight | New | Medium | Steve McLellan |
Bug Description
We currently mark a large number of fields as not_analyzed so that they can be used for exact matches in 'term' queries. In retrospect, it would have made more sense to leave them as analyzed (so that query_string and match queries work more reliably) but change the analyzer so that dashes aren't treated as end-of-token characters [1].
The default 'Standard' analyzer [2] splits text at word boundaries suited to European languages, which is what breaks an id apart at its dashes. There's a whitespace analyzer [3] that tokenizes only on whitespace and is more suitable for our UUID fields; since it does not lowercase tokens, we might combine its tokenizer with a lowercase filter.
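As a rough illustration of the idea, the sketch below shows what a custom analyzer built from the whitespace tokenizer and the lowercase filter could look like in index settings, next to two small pure-Python functions that only approximate the tokenization. The analyzer name `id_analyzer`, the field mapping and the sample id are made up for this example, the `string` field type assumes an older Elasticsearch release, and the regex split is a crude stand-in for the standard analyzer, not its real algorithm.

```python
import re

# Hypothetical index settings: a custom analyzer (the name "id_analyzer" is
# an assumption) combining the whitespace tokenizer with a lowercase filter.
ID_ANALYSIS_SETTINGS = {
    "analysis": {
        "analyzer": {
            "id_analyzer": {
                "type": "custom",
                "tokenizer": "whitespace",
                "filter": ["lowercase"],
            }
        }
    }
}

# Hypothetical mapping fragment for an id field. "string" assumes an older
# Elasticsearch release; newer releases use "text" for analyzed fields.
ID_FIELD_MAPPING = {"type": "string", "analyzer": "id_analyzer"}


def standard_like_tokens(text):
    """Crude approximation of the standard analyzer: split on anything
    that is not an ASCII letter or digit, then lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def whitespace_lowercase_tokens(text):
    """Approximation of a whitespace tokenizer plus lowercase filter."""
    return [t.lower() for t in text.split()]


if __name__ == "__main__":
    sample_id = "1E2B3C4D-AAAA-bbbb-CCCC-1234567890ab"  # made-up id
    print(standard_like_tokens(sample_id))
    print(whitespace_lowercase_tokens(sample_id))
```

With the approximations above, the made-up id comes out as five separate tokens under the standard-like split and as a single lowercased token under the whitespace split, which is the behaviour we'd want for id fields.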
Switching analyzers should make searching less confusing, since it will abstract some of the indexing details away. Right now you can get unexpected partial matches when searching for IDs: try a query string for an id, but change some of the characters.
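One plausible way to picture the partial-match behaviour is sketched below. It assumes that the id is split at its dashes somewhere along the way and that a document matches when any one query token matches (OR semantics); both are assumptions for illustration, not a trace of what Searchlight or Elasticsearch actually does. The ids and function names are made up.

```python
import re


def split_on_punctuation(text):
    """Stand-in for an analyzer that ends tokens at dashes."""
    return {t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t}


def split_on_whitespace(text):
    """Stand-in for a whitespace tokenizer plus lowercase filter."""
    return {t.lower() for t in text.split()}


def any_token_matches(tokenize, indexed_value, query):
    """True if the query shares at least one token with the indexed value,
    i.e. a match under OR semantics."""
    return bool(tokenize(indexed_value) & tokenize(query))


stored_id = "1e2b3c4d-aaaa-bbbb-cccc-1234567890ab"    # made-up id
mistyped_id = "1e2b3c4d-aaaa-bbbb-cccc-ffffffffffff"  # last group changed

if __name__ == "__main__":
    # Split at dashes: the wrong id still shares four fragments, so it matches.
    print(any_token_matches(split_on_punctuation, stored_id, mistyped_id))
    # Whole-id tokens: the wrong id shares nothing, so it does not match.
    print(any_token_matches(split_on_whitespace, stored_id, mistyped_id))
```

Under these assumptions the mistyped id matches when tokens end at dashes and stops matching once the whole id is kept as one token, while an exact id still matches in both cases.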
[1] https:/
[2] https:/
[3] https:/
Changed in searchlight:
importance: Undecided → Medium
Changed in searchlight:
assignee: nobody → Steve McLellan (sjmc7)