
Task #7338 (closed)

Opened 8 years ago

Closed 8 years ago

Bug: mage: too many open files

Reported by: wmoore Owned by: wmoore
Priority: critical Milestone: OMERO-4.4
Component: Services Version: n.a.
Keywords: n.a. Cc: cxallan, jamoore
Resources: n.a. Referenced By: n.a.
References: n.a. Remaining Time: n.a.
Sprint: 2011-12-13 (4)

Description

Too many Pixels files open.

1136 files open in total this time:

$ lsof
...
java    27169 jboss 1009u   REG               8,17     82944     3122 /OMERO/Pixels/Dir-001/1055
java    27169 jboss 1010u   REG               8,17     82944     1271 /OMERO/Pixels/325
java    27169 jboss 1011u   REG               8,17     82944     7368 /OMERO/Pixels/Dir-002/2124
java    27169 jboss 1012u   REG               8,17     82944     1187 /OMERO/Pixels/285
java    27169 jboss 1013u   REG               8,17     82944     1047 /OMERO/Pixels/220
java    27169 jboss 1014u   REG               8,17     82944     7147 /OMERO/Pixels/Dir-002/2043
java    27169 jboss 1015u   REG               8,17     82944     7407 /OMERO/Pixels/Dir-002/2136
java    27169 jboss 1016u   REG               8,17     82944     1011 /OMERO/Pixels/204
java    27169 jboss 1018u   REG               8,17     82944     7290 /OMERO/Pixels/Dir-002/2096
java    27169 jboss 1019u   REG               8,17     82944     3122 /OMERO/Pixels/Dir-001/1055
java    27169 jboss 1020u   REG               8,17     82944     6907 /OMERO/Pixels/Dir-001/1947
java    27169 jboss 1021u   REG               8,17     82944     6494 /OMERO/Pixels/Dir-001/1922
java    27169 jboss 1022u   REG               8,17     82944     3119 /OMERO/Pixels/Dir-001/1040
java    27169 jboss 1023u   REG               8,17     82944     5833 /OMERO/Pixels/Dir-001/1912

Logs and 'full lsof' attached below.

NB: the open-file limit on mage is still quite low (1024, see below). Would raising it be enough to fix this?

jboss@mage ~/OMERO-CURRENT $ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 24176
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 24176
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
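For reference, the "open files" limit above means the process can hold descriptors 0 to 1023 at most, which matches the highest descriptor (1023u) in the lsof listing. A minimal Python sketch of checking the limit and the remaining headroom from inside a process is shown here; the helper names `nofile_limits` and `headroom` are made up for illustration, and the in-use count passed in the usage line is just an example value.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def headroom(soft_limit, in_use):
    """Descriptors still available before 'too many open files' is hit.

    Hypothetical helper: soft_limit is what `ulimit -n` reports,
    in_use is the number of descriptors currently open.
    """
    return max(soft_limit - in_use, 0)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # With the limit of 1024 shown above and 1000 descriptors in use,
    # only 24 more files could be opened.
    print("headroom:", headroom(1024, 1000))
```

Raising the limit only postpones the failure if handles are never released, so this is a diagnostic aid rather than a fix.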

Attachments (3)

open_files_logs.zip (38.2 MB) - added by wmoore 8 years ago.
log files with "too many open files"
lsof.txt (113.2 KB) - added by wmoore 8 years ago.
ps_auxwf.txt (15.1 KB) - added by wmoore 8 years ago.

Change History (8)

Changed 8 years ago by wmoore

log files with "too many open files"

comment:1 Changed 8 years ago by wmoore

Mage was restarted yesterday morning. By yesterday afternoon:

jboss@mage ~/OMERO-CURRENT $ ps auxwf
...
jboss      421  0.9 14.3 2535828 445212 ?      Sl   10:08   2:06  \_ /usr/lib/jvm/sun-jdk-1.6/bin/java -Xmx2048M -XX:MaxPermSize=128M -XX:MaxPermSize=128m -Djava.awt.headless=true -Dlog4j.configuration=etc
...
# when refreshing page with lots of thumbnails
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
370
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
367
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
374
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
374
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
385

Today, 30 hours after the restart:

jrs-macbookpro-25107:~ will$ ssh jboss@mage.openmicroscopy.org.uk
Last login: Mon Nov 28 10:23:15 GMT 2011 from jrs-macbookpro-25107.dyn.lifesci.dundee.ac.uk on ssh
-bash: /home/jboss/EMAN2/eman2.bashrc: No such file or directory
jboss@mage ~ $ lsof -p 421 | wc -l
662

comment:2 Changed 8 years ago by jmoore

  • Sprint changed from 2011-11-29 (3) to 2011-12-13 (4)

Moved from sprint 2011-11-29 (3)

comment:3 Changed 8 years ago by wmoore

jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | grep Pixels | wc -l
561
jboss@mage ~/OMERO-CURRENT $ lsof -p 421 | wc -l
845
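The `grep Pixels | wc -l` count above can also be broken down per file, which shows whether the same Pixels file is being held open more than once (as with /OMERO/Pixels/Dir-001/1055 in the description). The following is a rough Python sketch that works on saved lsof output; the function name `pixels_handles` and the default prefix are assumptions for illustration, and it assumes the file name is the last whitespace-separated column with no spaces in the path.

```python
from collections import Counter


def pixels_handles(lsof_text, prefix="/OMERO/Pixels/"):
    """Count lsof entries per file whose NAME column starts with prefix."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip blank lines; the lsof header line never matches the prefix.
        if fields and fields[-1].startswith(prefix):
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the listing in the ticket description.
    sample = """\
java    27169 jboss 1009u   REG   8,17   82944   3122 /OMERO/Pixels/Dir-001/1055
java    27169 jboss 1010u   REG   8,17   82944   1271 /OMERO/Pixels/325
java    27169 jboss 1019u   REG   8,17   82944   3122 /OMERO/Pixels/Dir-001/1055
"""
    counts = pixels_handles(sample)
    print("total Pixels entries:", sum(counts.values()))
    for name, n in counts.items():
        if n > 1:
            print("held open", n, "times:", name)
```

Like the grep, this also counts memory-mapped entries (FD column `mem`), which are not file descriptors.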

comment:4 Changed 8 years ago by wmoore

Wrapped every use of rawPixelsStore in webemdb views.py in a try / finally block so that the store is always closed: http://github.com/will-moore/openmicroscopy/commit/62eb0596cc6224ac2c270f5a230706357b4c187d
Copied these changes over to mage, stopped the server, created a clean Blitz log and restarted:

$ mv var/log/Blitz-0.log var/log/Blitz-0.log.3
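The try / finally pattern mentioned in this comment can be sketched roughly as follows. This is not the code from the linked commit: `FakePixelsStore` and `read_plane` are hypothetical stand-ins so the example runs without an OMERO server, and only the shape of the fix (always call close() in a finally block) is the point.

```python
class FakePixelsStore:
    """Hypothetical stand-in for a pixels store; not the OMERO API."""

    open_count = 0  # number of stores currently holding a "file handle"

    def __init__(self):
        FakePixelsStore.open_count += 1
        self.closed = False

    def get_plane(self, z, c, t):
        if z < 0:
            raise ValueError("invalid plane index")
        return b"\x00" * 16  # dummy pixel data

    def close(self):
        if not self.closed:
            self.closed = True
            FakePixelsStore.open_count -= 1


def read_plane(z, c=0, t=0):
    """Read one plane and release the store even if reading fails."""
    store = FakePixelsStore()
    try:
        return store.get_plane(z, c, t)
    finally:
        # Runs on both the normal and the error path, so no handle leaks.
        store.close()


if __name__ == "__main__":
    read_plane(0)
    try:
        read_plane(-1)
    except ValueError:
        pass
    print("stores still open:", FakePixelsStore.open_count)
```

Without the finally block, the failing call would leave its store open, which is the kind of leak that accumulates over many requests.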

Projections, EMAN2 filtering and downloading a dataset as a stack no longer seem to accumulate file handles beyond a small number:

jboss@mage ~/OMERO-CURRENT $ lsof -p 15264 | grep Pixels | wc -l
22

jboss@mage ~/OMERO-CURRENT $ lsof -p 15264 | grep Pixels
java    15264 jboss  mem    REG               8,17    171500    11452 /OMERO/Pixels/Dir-002/2564
java    15264 jboss  147u   REG               8,17      5184     9680 /OMERO/Pixels/Dir-002/2302
java    15264 jboss  150u   REG               8,17     82944     2512 /OMERO/Pixels/822
java    15264 jboss  151u   REG               8,17     82944     7204 /OMERO/Pixels/Dir-002/2063
java    15264 jboss  154u   REG               8,17    171500    11460 /OMERO/Pixels/Dir-002/2565
java    15264 jboss  158u   REG               8,17    171500    11460 /OMERO/Pixels/Dir-002/2565
java    15264 jboss  159u   REG               8,17     82944     7287 /OMERO/Pixels/Dir-002/2097
java    15264 jboss  160u   REG               8,17  67108864     9696 /OMERO/Pixels/Dir-002/2296
java    15264 jboss  161u   REG               8,17      5184     9628 /OMERO/Pixels/Dir-002/2283
java    15264 jboss  162u   REG               8,17     82944     7299 /OMERO/Pixels/Dir-002/2100
java    15264 jboss  163u   REG               8,17     82944     7459 /OMERO/Pixels/Dir-002/2152
java    15264 jboss  164u   REG               8,17     82944     7464 /OMERO/Pixels/Dir-002/2155
java    15264 jboss  165u   REG               8,17     82944     2481 /OMERO/Pixels/824
java    15264 jboss  166u   REG               8,17     82944     7037 /OMERO/Pixels/Dir-002/2002
java    15264 jboss  167u   REG               8,17     82944     2862 /OMERO/Pixels/963
java    15264 jboss  168u   REG               8,17     82944     7287 /OMERO/Pixels/Dir-002/2097
java    15264 jboss  171u   REG               8,17 137312500      582 /OMERO/Pixels/83
java    15264 jboss  173u   REG               8,17    171500    11452 /OMERO/Pixels/Dir-002/2564
java    15264 jboss  176u   REG               8,17    171500    11452 /OMERO/Pixels/Dir-002/2564
java    15264 jboss  177u   REG               8,17    171500    11452 /OMERO/Pixels/Dir-002/2564
java    15264 jboss  178u   REG               8,17     82944     2485 /OMERO/Pixels/827

Will keep an eye on this as before...

comment:5 Changed 8 years ago by wmoore

  • Resolution set to fixed
  • Status changed from new to closed

Mage seems fine now, closing...
