Task #3164 (new)
Opened 14 years ago
Last modified 5 years ago
BUG: Searching returns no results for leading wildcard term — at Version 3
Reported by: | atarkowska | Owned by: | jamoore |
---|---|---|---|
Priority: | blocker | Milestone: | OMERO-Beta4.2.1 |
Component: | Services | Keywords: | n.a. |
Cc: | jrswedlow, java@…, wmoore | Story Points: | n.a. |
Sprint: | 2010-10-28 (18) | Importance: | n.a. |
Total Remaining Time: | n.a. | Estimated Remaining Time: | n.a. |
Description (last modified by jmoore)
This is caused by the FullTextAnalyzer (ticket:1010) not being used for wildcard searches:
- http://wiki.apache.org/lucene-java/LuceneFAQ#What_wildcard_search_support_is_available_from_Lucene.3F
- http://jira.atlassian.com/browse/JRA-15006
- http://stackoverflow.com/questions/2432486/lucene-wildcard-queries
- http://markmail.org/message/nfpoofjlyu4fkgcl#query:+page:1+mid:bkbjgtjvp5xz5igo+state:results
There's a proposed workaround in lucene-misc-2.4.1.jar called the AnalyzingQueryParser, which does pass the search string to the analyzer, even with wildcards. Using it, however, may need more investigation; the JIRA link above in particular illustrates some of the issues one can run into.
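To make the mismatch concrete, here is a simplified model in Python. It is not Lucene's code. It assumes the index-time analyzer lowercases and splits on anything that is not a letter or digit, and that a wildcard term is compared unanalyzed against each indexed token. The helpers `analyze` and `wildcard_hits` are hypothetical names used only for illustration.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Hypothetical stand-in for the index-time analyzer (assumption):
    lowercase the text, then split on any run of non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def wildcard_hits(pattern, indexed_tokens):
    """Compare the raw, unanalyzed wildcard pattern with each indexed token."""
    return [t for t in indexed_tokens if fnmatchcase(t, pattern.lower())]


# The file name from this ticket, as the assumed analyzer would index it.
tokens = analyze("search_test_1.tif")

for pattern in ("*earch", "*tif", "*.tif", "s*.tif"):
    print(pattern, wildcard_hits(pattern, tokens))
```

Under these assumptions, a pattern such as "*.tif" can never match, because the "." was dropped at index time but is still present in the query, while "*tif" and "*earch" each match a single token.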
Extending Ola's test with the following tests:
texts = ("*earch", "*h", "search tif", "search",
         "test", "tag", "t*", "search_test",
         "*test*.tif", "search*tif", "s .tif",
         ".tif", "tif", "*tif",
         "s*.tif", "*.tif")
I see the following terms fail for the default QueryParser:
- *.tif
- search*tif
- s*.tif
- *test*.tif
For the new AnalyzingQueryParser:
- *earch
- *h
- search*tif
- s*.tif
- *test*.tif
So, we can get "*.tif" back, but at the cost of "*earch" and "*h". With further investigation, we can probably come up with something that makes each of these cases pass, but other searches may then start to fail.
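The trade-off can be read directly from the two failure lists above with a few set operations. This is only a restatement of the results reported in this ticket; the variable names are illustrative.

```python
# Failing terms as reported above for each parser.
default_failures = {"*.tif", "search*tif", "s*.tif", "*test*.tif"}
analyzing_failures = {"*earch", "*h", "search*tif", "s*.tif", "*test*.tif"}

# Terms that fail with the default QueryParser but pass with AnalyzingQueryParser.
fixed = default_failures - analyzing_failures
# Terms that pass with the default QueryParser but fail with AnalyzingQueryParser.
regressed = analyzing_failures - default_failures
# Terms that fail with both parsers.
still_failing = default_failures & analyzing_failures

print("fixed:", sorted(fixed))
print("regressed:", sorted(regressed))
print("still failing:", sorted(still_failing))
```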
Possibly related is #1011, which would skip the analyzer entirely on some fields such as Image.name, so that the underscores in "search_test_1.tif" are not removed.
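As a rough sketch of what skipping the analyzer could mean for matching, assume the whole field value is stored as one lowercased term and wildcard patterns are compared against that single term. This is a simplified model, not the behaviour of Lucene or of #1011, and `index_untokenized` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def index_untokenized(value):
    """Hypothetical: keep the whole value as a single lowercased term,
    with no analyzer involved (assumption for illustration)."""
    return [value.lower()]


terms = index_untokenized("search_test_1.tif")
patterns = ("*.tif", "search*tif", "s*.tif", "*test*.tif", "*earch", "tif")

# For each pattern, does it match the single stored term?
matches = {p: any(fnmatchcase(t, p) for t in terms) for p in patterns}

for pattern, hit in matches.items():
    print(pattern, hit)
```

In this model the four patterns that fail with the default QueryParser would match, but "*earch" and a plain "tif" would not, since they no longer line up with the whole stored name. That is consistent with the caution above that fixing some cases may break others.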
I'll commit the extended test and the lucene-misc jars and we can discuss further.
Change History (4)
comment:1 Changed 14 years ago by atarkowska
- Component changed from General to Services
- Priority changed from minor to blocker
comment:2 Changed 14 years ago by atarkowska
comment:3 Changed 14 years ago by jmoore
- Description modified (diff)
- Summary changed from BUG: Searching returns no results to BUG: Searching returns no results for leading wildcard term
(In [8385]) test, see #3164