Task #5526 (closed)
Bug: on first run, Lucene fails when no FullText directory exists
| Reported by: | jamoore | Owned by: | jamoore |
|---|---|---|---|
| Priority: | blocker | Milestone: | OMERO-Beta4.3.2 |
| Component: | Deployment | Version: | n.a. |
| Keywords: | n.a. | Cc: | dzmacdonald, jburel, cxallan |
| Resources: | n.a. | Referenced By: | n.a. |
| References: | n.a. | Remaining Time: | 0.0d |
| Sprint: | 2011-08-04 (2) |
Description
Testing on Windows, but I think independent of that, after:
mkdir C:\OMERO
bin\omero admin start
the indexer fails with "Cannot find segment_N". After a restart, it works fine. There needs to be a forced creation of that directory, either via bin/omero admin start or via a Spring dependency.
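A minimal sketch of what such a forced creation could look like, assuming the index lives in a `FullText` subdirectory of the data directory. The class name `FullTextDirectoryCheck` and the method `ensureFullTextDir` are hypothetical and are not taken from the OMERO code base; only standard `java.nio.file` calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical startup check; an illustration, not the actual OMERO code. */
public class FullTextDirectoryCheck {

    /**
     * Ensures that dataDir/FullText exists before anything tries to open
     * a Lucene index there. Calling it on an existing directory is a no-op.
     */
    public static Path ensureFullTextDir(String dataDir) {
        Path fullText = Paths.get(dataDir, "FullText");
        try {
            // createDirectories also creates missing parents and does not
            // fail when the directory is already present.
            return Files.createDirectories(fullText);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create " + fullText, e);
        }
    }

    public static void main(String[] args) {
        // The data directory argument is an assumption made for this demo.
        String dataDir = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        System.out.println("Index directory ready: " + ensureFullTextDir(dataDir));
    }
}
```

If this ran as a startup bean, the beans that open the index would need to depend on it so that the directory exists before the first index access.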
See attached file.
Attachments (1)
Change History (14)
Changed 8 years ago by jmoore
comment:1 Changed 8 years ago by jmoore
- Sprint set to 2011-06-02 (13)
- Status changed from new to accepted
comment:2 Changed 8 years ago by jmoore <josh@…>
- Remaining Time changed from 0.1 to 0
- Resolution set to fixed
- Status changed from accepted to closed
(In [9b88e15e7febb11e85dfbba2e614b27b660bb8a8/ome.git] on branch develop) Adding ServerDirectoryCheck to create FullText (Fix #5526)
comment:3 Changed 8 years ago by jmoore <josh@…>
(In [d363c4a41b24d8570d5ccaab0688b0ff4c69a906/ome.git] on branch develop) Adding serverDirectoryCheck to indexer.xml (See #5526)
comment:4 Changed 8 years ago by jmoore <josh@…>
(In [0caac15131082abd78e0a393dbd283801dc8811f/ome.git] on branch develop) Add adding serverDirectoryCheck to pixeldata (See #5526, Fix #5625)
comment:5 Changed 8 years ago by jmoore
- Milestone changed from OMERO-Beta4.3 to OMERO-Beta4.3.1
- Resolution fixed deleted
- Sprint 2011-06-02 (13) deleted
- Status changed from closed to reopened
Seems to have arisen again: https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=723&p=2589#p2589
comment:6 Changed 8 years ago by jmoore
- Sprint set to 2011-07-07 (1)
comment:7 Changed 8 years ago by jmoore
- Cc cxallan added
- Remaining Time changed from 0 to 0.25
After initial testing, switching to the org.apache.lucene.store.NativeFSLockFactory implementation of org.apache.lucene.store.LockFactory, as opposed to SimpleFSLockFactory, seems to have gotten rid of the issue. The one caveat listed for SimpleFSLockFactory is that its lock file can remain after an abnormal JVM exit, which could be related. The one caveat for NativeFSLockFactory is that it behaves oddly on NFS due to the use of java.nio classes. However, we already rely (heavily) on java.nio locking, and so this seems to be a safe switch. Nevertheless, I plan to make the value configurable, so that if need be we can also introduce our own factory which retries the lock, etc.
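The difference between the two strategies can be illustrated without Lucene. The sketch below is not Lucene's implementation: `LockStrategyDemo` and its methods are hypothetical names, and it only mimics the two ideas with the standard library. A marker-file lock treats the mere existence of `write.lock` as "locked", so a file left behind by a crashed JVM keeps blocking. A `java.nio` lock is held by the operating system on behalf of the process, so a leftover file alone does not block anyone.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

/** Hypothetical demo of marker-file locking versus OS-level locking. */
public class LockStrategyDemo {

    /** Marker-file style: the lock counts as held whenever the file exists. */
    public static boolean tryMarkerLock(File lockFile) {
        try {
            // createNewFile is atomic and returns false if the file already
            // exists, including a stale file from an abnormal JVM exit.
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** OS-level style: probes whether a java.nio lock can be taken right now. */
    public static boolean canAcquireNativeLock(File lockFile) {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
             FileChannel channel = raf.getChannel()) {
            // tryLock returns null if another process holds the lock.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = java.nio.file.Files.createTempDirectory("fulltext").toFile();
        File lock = new File(dir, "write.lock");
        // Simulate a lock file left over from a crashed process.
        lock.createNewFile();
        System.out.println("marker lock acquired: " + tryMarkerLock(lock));
        System.out.println("native lock acquired: " + canAcquireNativeLock(lock));
    }
}
```

A retrying factory, as mentioned above, would wrap the acquire step in a bounded loop with a short sleep between attempts; that part is left out here.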
comment:8 Changed 8 years ago by jmoore <josh@…>
- Remaining Time changed from 0.25 to 0
- Resolution set to fixed
- Status changed from reopened to closed
(In [2834d93ea4388a6be0098586d2e49b3f96fecb85/ome.git] on branch develop) Setting default locking_strategy to 'native' (Fix #5526)
comment:9 Changed 8 years ago by jmoore
- Milestone changed from OMERO-Beta4.3.1 to OMERO-Beta4.3.2
- Remaining Time changed from 0 to 0.5
- Resolution fixed deleted
- Sprint changed from 2011-07-07 (1) to 2011-08-04 (2)
- Status changed from closed to reopened
Looks like this is back again. I created my data dir and started up 4.3.1 on Mac 10.6 to find in the blitz log:
Caused by: org.hibernate.search.SearchException: Unable to initialize index: FullText
at org.hibernate.search.store.FSDirectoryProvider.initialize(FSDirectoryProvider.java:47)
at org.hibernate.search.store.DirectoryProviderFactory.createDirectoryProvider(DirectoryProviderFactory.java:129)
... 94 more
Caused by: org.apache.lucene.index.CorruptIndexException: checksum mismatch in segments file
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:248)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:175)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1109)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:686)
at org.hibernate.search.store.DirectoryProviderHelper.createFSIndex(DirectoryProviderHelper.java:81)
at org.hibernate.search.store.FSDirectoryProvider.initialize(FSDirectoryProvider.java:44)
... 95 more
and in the indexer log:
Caused by: org.hibernate.search.SearchException: Unable to initialize: FullText
at org.hibernate.search.store.DirectoryProviderFactory.createDirectoryProvider(DirectoryProviderFactory.java:132)
at org.hibernate.search.store.DirectoryProviderFactory.createDirectoryProviders(DirectoryProviderFactory.java:63)
at org.hibernate.search.impl.SearchFactoryImpl.initDocumentBuilders(SearchFactoryImpl.java:404)
at org.hibernate.search.impl.SearchFactoryImpl.<init>(SearchFactoryImpl.java:119)
at org.hibernate.search.event.ContextHolder.getOrBuildSearchFactory(ContextHolder.java:30)
at org.hibernate.search.event.FullTextIndexEventListener.initialize(FullTextIndexEventListener.java:79)
at org.hibernate.event.EventListeners$1.processListener(EventListeners.java:198)
at org.hibernate.event.EventListeners.processListeners(EventListeners.java:181)
at org.hibernate.event.EventListeners.initializeListeners(EventListeners.java:194)
... 106 more
Caused by: org.hibernate.search.SearchException: Unable to initialize index: FullText
at org.hibernate.search.store.FSDirectoryProvider.initialize(FSDirectoryProvider.java:47)
at org.hibernate.search.store.DirectoryProviderFactory.createDirectoryProvider(DirectoryProviderFactory.java:129)
... 114 more
Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /private/tmp/data2/FullText/write.lock
at org.apache.lucene.store.SimpleFSLock.release(SimpleFSLockFactory.java:149)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1121)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:686)
at org.hibernate.search.store.DirectoryProviderHelper.createFSIndex(DirectoryProviderHelper.java:81)
at org.hibernate.search.store.FSDirectoryProvider.initialize(FSDirectoryProvider.java:44)
... 115 more
comment:10 Changed 8 years ago by jmoore
- Owner jmoore deleted
- Status changed from reopened to new
comment:11 Changed 8 years ago by jmoore
- Owner set to jmoore
comment:12 Changed 8 years ago by jmoore
- Remaining Time changed from 0.5 to 0
- Resolution set to fixed
- Status changed from new to closed
I'm going to guess that this is a misconfiguration issue caused by not using a clean build. In the latest indexer stack trace there's this:
Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /private/tmp/data2/FullText/write.lock
at org.apache.lucene.store.SimpleFSLock.release(SimpleFSLockFactory.java:149)
i.e. not the native lock implementation.
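The same check can be done mechanically when going through logs. The following is a rough sketch that pulls the lock factory name out of a Lucene stack frame; the class `LockImplFromTrace` and its regular expression are hypothetical and are only matched against the frame format shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reports which Lucene lock factory appears in a trace. */
public class LockImplFromTrace {

    // Matches frames such as
    // "at org.apache.lucene.store.SimpleFSLock.release(SimpleFSLockFactory.java:149)"
    private static final Pattern LOCK_FRAME = Pattern.compile(
        "at org\\.apache\\.lucene\\.store\\.(\\w+Lock)\\.\\w+\\((\\w+LockFactory)\\.java:\\d+\\)");

    /** Returns the source file name (minus ".java") of the first lock frame found. */
    public static Optional<String> lockFactoryIn(String stackTrace) {
        Matcher m = LOCK_FRAME.matcher(stackTrace);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        String trace =
            "Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /private/tmp/data2/FullText/write.lock\n"
            + "at org.apache.lucene.store.SimpleFSLock.release(SimpleFSLockFactory.java:149)\n";
        System.out.println(lockFactoryIn(trace).orElse("no lock frame found"));
    }
}
```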
comment:13 Changed 8 years ago by jmoore
See also https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=776 -- possible issue when using openjdk.
This was first implemented as part of admin.py, but as Chris pointed out, it's odd to have bin/omero admin ... creating directories. Instead, it is now implemented as a new startup bean in ome/services/startup.xml.