User Story #860 (new)
Opened 16 years ago
Last modified 14 years ago
Add ServerErrorEvent subsystem for notification of internal errors — at Initial Version
Reported by: | jamoore | Owned by: | josh |
---|---|---|---|
Priority: | minor | Milestone: | Future |
Component: | Services | Keywords: | errors, exceptions, logging, asynchronous |
Cc: | atarkowska, cxallan, jburel, jrswedlow | Story Points: | n.a. |
Sprint: | n.a. | Importance: | n.a. |
Total Remaining Time: | 4.0d | Estimated Remaining Time: | n.a. |
Description
With more asynchronous logic in the server (full-text search processing, job processing, etc.), it is difficult for server administrators to find problems when they show up only in the rather bloated logs.
All asynchronous processing subsystems should start raising a ServerErrorEvent in addition to logging an exception. The event can be handled by multiple listeners. E.g.:
- A simple LoggingServerErrorEventListener can write a special log file
- An EmailingServerErrorEventListener can send an email to a specified admin (emails are disabled if the configuration property is set to "", e.g. omero.servererror.email=)
- A WebAdminServerErrorEventListener could pass the information on to the WebAdmin console, which administrators could check periodically.
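A minimal sketch of the dispatch described above, in plain Java. Only the name ServerErrorEvent and the idea of multiple listeners come from this ticket; the interface, the publisher class, the field names, and the in-memory RecordingListener (a stand-in for a listener that writes a special log file) are hypothetical and only illustrate one possible shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ServerErrorEventSketch {

    /** Hypothetical event payload; the field names are assumptions. */
    static final class ServerErrorEvent {
        final String source;   // subsystem that raised the event
        final Throwable cause; // the exception that was also logged

        ServerErrorEvent(String source, Throwable cause) {
            this.source = source;
            this.cause = cause;
        }
    }

    /** Hypothetical listener contract: one callback per event. */
    interface ServerErrorEventListener {
        void onServerError(ServerErrorEvent event);
    }

    /** Hypothetical publisher that fans one event out to every listener. */
    static final class ServerErrorEventPublisher {
        private final List<ServerErrorEventListener> listeners =
                new CopyOnWriteArrayList<>();

        void addListener(ServerErrorEventListener listener) {
            listeners.add(listener);
        }

        void publish(ServerErrorEvent event) {
            for (ServerErrorEventListener listener : listeners) {
                try {
                    listener.onServerError(event);
                } catch (RuntimeException e) {
                    // One failing listener should not stop the others.
                    System.err.println("Listener failed: " + e);
                }
            }
        }
    }

    /** Stand-in for a logging listener; keeps lines in memory, not in a file. */
    static final class RecordingListener implements ServerErrorEventListener {
        final List<String> lines = new ArrayList<>();

        @Override
        public void onServerError(ServerErrorEvent event) {
            lines.add(event.source + ": " + event.cause.getMessage());
        }
    }

    public static void main(String[] args) {
        ServerErrorEventPublisher publisher = new ServerErrorEventPublisher();
        RecordingListener recorder = new RecordingListener();
        publisher.addListener(recorder);

        // An asynchronous subsystem would call this next to its usual log call.
        publisher.publish(new ServerErrorEvent(
                "JobHandler", new RuntimeException("no processor accepted the job")));

        recorder.lines.forEach(System.out::println);
    }
}
```

Catching exceptions per listener is a design choice in this sketch: a broken email setup, for instance, would then not prevent the log-file listener from recording the same event.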
Events which are of importance include:
- CorruptedFileServerError - when the SHA-1 of a Pixels or an OriginalFile does not match the value in the DB.
- LuceneLockedServerError - some forms of exceptions can leave Lucene in a locked state, making search mostly unusable.
- NoJobProcessorServerError - if all jobs are failing or not being accepted, then JobHandler is essentially useless. The problem may be that all compute nodes are down.
Perhaps an "error level" can determine, for example, whether or not an email will be sent.
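One way the "error level" idea could combine with the email switch is sketched below. The property name omero.servererror.email is the one mentioned in this ticket; the ErrorLevel enum, its constants, and the EmailPolicy class are hypothetical, and the threshold comparison is only an assumption about how levels might be ordered.

```java
import java.util.Properties;

public class ErrorLevelSketch {

    /** Hypothetical levels, declared from least to most severe. */
    enum ErrorLevel { INFO, WARN, CRITICAL }

    /** Hypothetical policy deciding whether an event triggers an email. */
    static final class EmailPolicy {
        private final String recipient;
        private final ErrorLevel threshold;

        EmailPolicy(Properties config, ErrorLevel threshold) {
            // A missing or empty property disables emails entirely.
            this.recipient = config.getProperty("omero.servererror.email", "").trim();
            this.threshold = threshold;
        }

        boolean shouldEmail(ErrorLevel level) {
            // Enum.compareTo follows declaration order, so this means
            // "at least as severe as the threshold".
            return !recipient.isEmpty() && level.compareTo(threshold) >= 0;
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("omero.servererror.email", "admin@example.org");

        EmailPolicy policy = new EmailPolicy(config, ErrorLevel.WARN);
        for (ErrorLevel level : ErrorLevel.values()) {
            System.out.println(level + " -> email? " + policy.shouldEmail(level));
        }
    }
}
```

Under these assumptions, something like LuceneLockedServerError could be raised at the most severe level so that it always reaches the admin, while less urgent events would only go to the log-file listener.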