Task #10930 (new)
Opened 11 years ago
Last modified 9 years ago
Arrange Lucene index better for multiply hashed files
Reported by: | mtbcarroll | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | Unscheduled |
Component: | Search | Version: | n.a. |
Keywords: | n.a. | Cc: | server@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
In the longer term for OMERO 5.x, hash algorithms for files may be changed and new hashes thus generated. A Lucene index might end up with multiple hashes noted for the same file.
The current approach in FullTextBridge.handleFileAnnotation won't work for multiple algorithms, because at search time the corresponding file.hasher and file.hash values can't be paired up. A better approach to indexing might be something like:
    if (file.getHasher() != null && file.getHash() != null) {
        add(document, "file.hash." + file.getHasher().getValue(), file.getHash(), opts);
    }
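To illustrate why a per-algorithm field name keeps each hash tied to its algorithm, here is a minimal sketch in plain Java. It does not use the Lucene or OMERO APIs: a Map stands in for the index document, and the class HashIndexSketch, its methods, the hasher names and the hash strings are all hypothetical placeholders chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for an index document (field name -> values). Not the Lucene API. */
class HashIndexSketch {

    /** Field name in the style proposed above: "file.hash." plus the hasher value. */
    static String fieldName(String hasher) {
        return "file.hash." + hasher;
    }

    /** Appends a value to a multi-valued field of the stand-in document. */
    static void add(Map<String, List<String>> document, String field, String value) {
        document.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }

    /** Indexes one (hasher, hash) pair; incomplete pairs are skipped. */
    static void indexHash(Map<String, List<String>> document, String hasher, String hash) {
        if (hasher != null && hash != null) {
            add(document, fieldName(hasher), hash);
        }
    }

    /** True if the document records this hash under this specific algorithm. */
    static boolean matches(Map<String, List<String>> document, String hasher, String hash) {
        List<String> values = document.get(fieldName(hasher));
        return values != null && values.contains(hash);
    }

    public static void main(String[] args) {
        Map<String, List<String>> document = new LinkedHashMap<>();
        // Hasher names and hash values are illustrative placeholders.
        indexHash(document, "SHA1-160", "aaaa");
        indexHash(document, "MD5-128", "bbbb");

        System.out.println(document.keySet());
        System.out.println(matches(document, "SHA1-160", "aaaa"));
        System.out.println(matches(document, "SHA1-160", "bbbb"));
    }
}
```

Because the algorithm is part of the field name, a query for a hash under one algorithm cannot accidentally match a value that was produced by another algorithm for the same file.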
Change History (3)
comment:1 Changed 9 years ago by mtbcarroll
- Cc server@… added; omero-team@… niko@… removed
- Version set to OMERO-5.1.3
comment:2 Changed 9 years ago by mtbcarroll
- Version OMERO-5.1.3 deleted
The Lucene document is rewritten each time it is indexed, so only the one hash stored in the DB will be searchable until there's a DB upgrade.