
Task #3164 (new)

Opened 14 years ago

Last modified 5 years ago

BUG: Searching returns no results for wildcard searches — at Version 12

Reported by: atarkowska Owned by: jamoore
Priority: blocker Milestone: OMERO-Beta4.3
Component: Services Keywords: n.a.
Cc: jrswedlow, java@…, wmoore Story Points: n.a.
Sprint: n.a. Importance: n.a.
Total Remaining Time: n.a. Estimated Remaining Time: n.a.

Description (last modified by jmoore)

This is caused by the FullTextAnalyzer (ticket:1010) not being used for wildcard searches:

A proposed workaround is the AnalyzingQueryParser in lucene-misc-2.4.1.jar, which does pass the search string to the analyzer, even when it contains wildcards. Using it, however, may need more investigation. (The JIRA link above in particular illustrates some of the issues one can run into.)

Extending Ola's test with the following tests:

        # Query strings exercised by the extended search test
        texts = ("*earch", "*h", "search tif", "search",
                 "test", "tag", "t*", "search_test",
                 "*test*.tif", "search*tif", "s .tif",
                 ".tif", "tif", "*tif",
                 "s*.tif", "*.tif")

I see the following terms fail for the default QueryParser:

  • *.tif
  • search*tif
  • s*.tif
  • *test*.tif
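
One way to see why these four terms fail is a toy model of the indexing step. The sketch below is not Lucene and not the FullTextAnalyzer: it assumes an analyzer that lowercases and splits on every non-alphanumeric character, that wildcard terms skip the analyzer and are compared against single indexed tokens, and that the image is named "search_test_1.tif". The helper names (analyze, term_matches, search) are made up for illustration.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Toy stand-in for the analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def term_matches(term, indexed_tokens):
    """Match a single query term against the tokens stored in the index."""
    if "*" in term or "?" in term:
        # Assumption: wildcard terms bypass the analyzer, so punctuation stays
        # in the pattern and is compared against already-analyzed tokens.
        return any(fnmatchcase(tok, term) for tok in indexed_tokens)
    # Plain terms are analyzed first; any resulting token may hit.
    return any(tok in indexed_tokens for tok in analyze(term))


def search(query, name):
    """Return True if any whitespace-separated term of the query matches."""
    tokens = analyze(name)
    return any(term_matches(term, tokens) for term in query.split())


name = "search_test_1.tif"  # assumed name of the test image
queries = ["*earch", "*h", "t*", "*tif",
           "*.tif", "search*tif", "s*.tif", "*test*.tif"]
failing = [q for q in queries if not search(q, name)]
print(failing)
```

Under these assumptions the name is indexed as the tokens search, test, 1 and tif. No token contains a "." and no single token spans both "search" and "tif", so exactly the four terms listed above come back as failing, while "*earch", "*h", "t*" and "*tif" still match.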


For the new AnalyzingQueryParser:

  • *earch
  • *h
  • search*tif
  • s*.tif
  • *test*.tif

So, we can get "*.tif" back, but at the cost of "*earch" and "*h". With further investigation, we can probably come up with something that makes each of these cases pass, but other searches may then start to fail.
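
The idea of analyzing a wildcard term can be sketched in the same toy model. This is only an illustration of the approach, not the AnalyzingQueryParser implementation: it assumes the literal chunks between wildcards are run through a simple lowercase/split analyzer and glued back together, and the names rewrite_wildcard_term and wildcard_matches are hypothetical. It shows how "*.tif" can start matching, but it does not reproduce the "*earch" and "*h" failures reported above, whose cause is still open.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Toy stand-in for the analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def rewrite_wildcard_term(term):
    """Analyze the literal chunks of a wildcard term, keeping * and ? in place."""
    pieces = re.split(r"([*?])", term)
    # Simplification: if a chunk analyzes to several tokens they are joined.
    return "".join(p if p in ("*", "?") else "".join(analyze(p)) for p in pieces)


def wildcard_matches(term, name):
    """Match the rewritten pattern against each indexed token of the name."""
    pattern = rewrite_wildcard_term(term)
    return any(fnmatchcase(tok, pattern) for tok in analyze(name))


name = "search_test_1.tif"  # assumed name of the test image
for term in ["*.tif", "s*.tif", "search*tif", "*test*.tif"]:
    print(term, "->", rewrite_wildcard_term(term), wildcard_matches(term, name))
```

In this sketch "*.tif" is rewritten to "*tif" and matches the token tif. The other three terms still fail because their patterns would have to span more than one indexed token.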

Possibly related is #1011, which would not use an analyzer at all on some fields such as Image.name, so that the underscores in "search_test_1.tif" are not removed.

I'll commit the extended test and the lucene-misc jars and we can discuss further.

Update

This issue is apparently not restricted to leading wildcards; it also affects other forms of wildcard search. Moving to 4.3 for review.

Matching: test-project-a-b-c
=============================================
                         Query Found  Ok?
                          test    21 GOOD
                  test-project    21 GOOD
                 test\-project    21 GOOD
                         test-    21 GOOD
                 test-project-    21 GOOD
               test\-project\-    21 GOOD
                         test*    21 GOOD
                 test-project*     0 FAIL
                test\-project*     0 FAIL
                        test-*     0 FAIL
                test-project-*     0 FAIL
              test\-project\-*     0 FAIL
                    name:test*    21 GOOD
            name:test-project*     0 FAIL
           name:test\-project*     0 FAIL
                    name:test*    21 GOOD
        name:test name:project    21 GOOD
                  test project    21 GOOD
                test* project*    21 GOOD
                test- project-    21 GOOD
              test-* project-*     0 FAIL
            test-project-a-b-c    21 GOOD
                         a-b-c    21 GOOD
                         a b c    21 GOOD
                            t*    21 GOOD
                            p*    21 GOOD
                            a*    21 GOOD
                            b*    21 GOOD
                            c*    21 GOOD
                         t* p*    21 GOOD
                         proj*    21 GOOD
                    tes* proj*    21 GOOD
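
The GOOD/FAIL pattern in this table is consistent with hyphens being dropped at index time while wildcard terms are left un-analyzed. The sketch below is a toy model under that assumption, not the actual search service: the analyzer, the handling of the "name:" prefix and of backslash escapes, and the helper names (normalize, status) are all simplifications made up for illustration.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Toy stand-in for the analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def normalize(term):
    """Drop an optional field prefix such as name: and any backslash escapes."""
    return term.split(":", 1)[-1].replace("\\", "")


def term_matches(term, tokens):
    term = normalize(term)
    if "*" in term:
        # Assumption: the wildcard term is not analyzed, so its "-" survives
        # and can never match a token that had its hyphens split away.
        return any(fnmatchcase(tok, term) for tok in tokens)
    return any(tok in tokens for tok in analyze(term))


def status(query, name="test-project-a-b-c"):
    """Return GOOD if any term of the query matches the indexed name."""
    tokens = analyze(name)
    hit = any(term_matches(term, tokens) for term in query.split())
    return "GOOD" if hit else "FAIL"


for q in ["test*", "test-project*", r"test\-project*", "test-*",
          "name:test-project*", "test-* project-*", "tes* proj*"]:
    print(f"{q:>30} {status(q)}")
```

In this model every wildcard term that still contains a hyphen fails, whether or not the hyphen is escaped, and the same text without the wildcard succeeds because it is analyzed into test and project first.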

Change History (12)

comment:1 Changed 14 years ago by atarkowska

  • Component changed from General to Services
  • Priority changed from minor to blocker

comment:2 Changed 14 years ago by atarkowska

(In [8385]) test, see #3164

comment:3 Changed 14 years ago by jmoore

  • Description modified (diff)
  • Summary changed from BUG: Searching returns no results to BUG: Searching returns no results for leading wildcard term

comment:3 Changed 14 years ago by jmoore

comment:4 Changed 14 years ago by jmoore

(In [8386]) More tests of wildcard searching; and lucene-misc (See #3164)

comment:6 Changed 14 years ago by atarkowska

I'm not sure if this was known, but in a collaborative group I am able to use "*tif", etc. If I switch to a read-only or private group, no results are returned.

comment:7 Changed 14 years ago by jmoore

(In [8393]) Adding search.py to integration_suite.py (See #3164)

comment:8 Changed 14 years ago by jmoore

(In [8396]) Testing various groups and disabling broken strings (See #3164)

comment:9 Changed 14 years ago by jburel

  • Sprint changed from 2010-10-28 (18) to 2010-11-11 (19)

Moved from sprint 2010-10-28 (18)

comment:10 Changed 14 years ago by jmoore

  • Milestone changed from OMERO-Beta4.2.1 to Unscheduled
  • Sprint 2010-11-11 (19) deleted

Not making any code modifications to support this for 4.2.1. I've linked this under #2097 (4.2+ search fixes) and am moving to "unscheduled". Hopefully, we will have a large search review after big images.

comment:11 Changed 13 years ago by jmoore

  • Description modified (diff)
  • Milestone changed from Unscheduled to OMERO-Beta4.3

comment:12 Changed 13 years ago by jmoore

  • Description modified (diff)
  • Summary changed from BUG: Searching returns no results for leading wildcard term to BUG: Searching returns no results for wildcard searches