Task #11675 (closed)

Opened 11 years ago

Closed 10 years ago

Last modified 9 years ago

BUG: java.io.IOException: Map failed

Reported by: spli Owned by:
Priority: critical Milestone: 5.0.0-rc1
Component: General Version: 4.4.9
Keywords: n.a. Cc: java@…
Resources: n.a. Referenced By: n.a.
References: n.a. Remaining Time: n.a.
Sprint: n.a.

Description (last modified by mtbcarroll)

See https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=7351&start=10#p13116

Java may not free MappedByteBuffers, which can lead to java.io.IOException: Map failed being thrown in
ome.io.nio.RomioPixelBuffer.getRegion(RomioPixelBuffer.java:343)

See for example

The last link suggests the following workaround

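// READ_ONLY is assumed to be a static import of FileChannel.MapMode.READ_ONLY.
ByteBuffer buffer;
try {
    buffer = channel.map(READ_ONLY, ofs, n);
} catch (java.io.IOException e) {
    // Mapping failed: ask the JVM to collect unreachable mapped buffers, then retry once.
    System.gc();
    System.runFinalization();
    buffer = channel.map(READ_ONLY, ofs, n);
}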

Note other uses of MappedByteBuffer may need to be checked.
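
For anyone wanting to try the workaround in isolation, here is a minimal self-contained sketch. The class name MapRetrySketch and the helper mapWithRetry are hypothetical and not part of the OMERO codebase; it also assumes Java 7 or later for java.nio.file, whereas the report above concerns Java 6. System.gc() is only a hint to the JVM, so the retry is not guaranteed to succeed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating the retry-after-GC workaround.
class MapRetrySketch {

    // Maps a read-only region; on failure, requests GC and finalization and retries once.
    static MappedByteBuffer mapWithRetry(FileChannel channel, long offset, long size)
            throws IOException {
        try {
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, size);
        } catch (IOException e) {
            // Unreachable MappedByteBuffers are only unmapped once they are collected.
            System.gc();
            System.runFinalization(); // deprecated in recent JDKs but still present
            // A second failure propagates to the caller.
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, size);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("map-retry", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {10, 20, 30, 40});
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map two bytes starting at offset 1 and print them.
            MappedByteBuffer region = mapWithRetry(channel, 1, 2);
            System.out.println(region.get(0) + ", " + region.get(1));
        }
    }
}
```

Keeping the retry in one helper means other uses of MappedByteBuffer could share it, though it does nothing about the memory-pressure concern raised in comment:4.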

Java version as reported by Douglas:

$ java -version
java version "1.6.0_33"
Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)

Change History (26)

comment:1 Changed 11 years ago by jamoore

  • Milestone changed from Unscheduled to 5.0.0-beta2
  • Priority changed from minor to critical

comment:2 Changed 11 years ago by mtbcarroll

  • Owner set to mtbcarroll

comment:3 Changed 10 years ago by mtbcarroll

  • Description modified (diff)
  • Status changed from new to accepted

From the point of view of adjusting existing code that uses mapped byte buffers, the cleanest-looking adjustment would involve writing another ByteBuffer subclass that offers a close() method that nulls a wrapped byte buffer and then gets GC and finalization run in a separate thread. However, the new subclass would have to be in the java.nio package because none of the constructors are public: is that acceptable?
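
A rough sketch of the close() idea, using composition rather than a ByteBuffer subclass so that nothing has to live in java.nio. This is a swapped-in alternative, not what the comment above proposes verbatim; the class name DisposableBufferSketch is hypothetical. It only helps if callers do not keep their own references to the wrapped buffer, because the mapping stays alive for as long as the buffer is reachable.

```java
import java.io.Closeable;
import java.nio.ByteBuffer;

// Hypothetical holder: drops its buffer reference on close() and requests GC off-thread.
class DisposableBufferSketch implements Closeable {

    private ByteBuffer buffer;

    DisposableBufferSketch(ByteBuffer buffer) {
        if (buffer == null) {
            throw new IllegalArgumentException("buffer must not be null");
        }
        this.buffer = buffer;
    }

    // Returns the wrapped buffer; callers should not retain it beyond close().
    synchronized ByteBuffer get() {
        if (buffer == null) {
            throw new IllegalStateException("buffer already closed");
        }
        return buffer;
    }

    synchronized boolean isClosed() {
        return buffer == null;
    }

    @Override
    public synchronized void close() {
        if (buffer == null) {
            return; // already closed; closing twice is harmless
        }
        buffer = null;
        // Run GC and finalization in a daemon thread so close() returns promptly.
        Thread collector = new Thread(new Runnable() {
            @Override
            public void run() {
                System.gc();
                System.runFinalization();
            }
        }, "buffer-disposal");
        collector.setDaemon(true);
        collector.start();
    }
}
```

Because the holder implements Closeable, it can be used in a try-with-resources block on Java 7 and later.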

comment:4 Changed 10 years ago by mtbcarroll

(I don't want to just wrap the buffer-using code with exception-catching because it may consume a bunch of memory and successfully complete only to cause an OOM from elsewhere in the codebase.)

comment:5 Changed 10 years ago by jamoore

Mark: was there any resolution on which Java versions suffer from this? Adding calls to gc and finalization seems less than ideal if we could prevent it.

comment:6 Changed 10 years ago by spli

https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=7351&start=20#p13123 suggests it might not be a problem with Java7. Then again since it's to do with memory allocation/GC I wouldn't be surprised if it's intermittent. Can we reproduce this with one of our test files on Java6?

comment:7 Changed 10 years ago by mtbcarroll

http://bugs.sun.com/view_bug.do?bug_id=4724038 suggests that the general issue is not fixed, though there is a rather incomplete-looking stab at relief at http://bugs.sun.com/view_bug.do?bug_id=6417205 in what looks to be 1.6b86.

Another option would be to simply refactor away from using MappedByteBuffers at all for this.

comment:8 Changed 10 years ago by mtbcarroll

The reports of problems that I've seen have been particular to 32-bit JREs. Perhaps with 64-bit the GC and finalization always occur before address space is depleted.

comment:9 Changed 10 years ago by spli

According to Douglas, the original server error occurred with 64-bit Java 1.6.

comment:10 Changed 10 years ago by mtbcarroll

I will see if I can write a failing test.

(Update: no luck so far!)

Last edited 10 years ago by mtbcarroll (previous) (diff)

comment:11 Changed 10 years ago by mtbcarroll

He wouldn't be running the server on Windows, right?

comment:12 Changed 10 years ago by jamoore

Server log certainly used /home-style paths.

comment:13 Changed 10 years ago by mtbcarroll

Okay. Still no luck reproducing this, with OpenJDK 6 and Oracle Java SDK 1.6.

comment:14 Changed 10 years ago by mtbcarroll

Do we have the actual problem images, or some idea of which of ours may behave similarly? (Especially if the problem can be reliably reproduced.)

Last edited 10 years ago by mtbcarroll (previous) (diff)

comment:15 Changed 10 years ago by mtbcarroll

Note: whatever fixes are made for this ticket should be tested also on Windows, as its native memory allocation and management isn't necessarily as good.

comment:16 Changed 10 years ago by mtbcarroll

It should be noted that something like ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean() may be of use even though it is horrifying and should be avoided. In particular, though it seems presently okay for both OpenJDK and Oracle's SDK, there is no official guarantee that the class will remain available.
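
For illustration only, a reflective sketch of that cleaner call, written so that it compiles without a direct dependency on the internal classes. The class name BufferCleanerSketch and the method tryClean are hypothetical. It first tries sun.misc.Unsafe.invokeCleaner, which exists from Java 9 onwards, and then falls back to the sun.nio.ch.DirectBuffer route mentioned above. Both are unsupported internals and may be inaccessible or removed, in which case the method simply returns false. The buffer must never be accessed after a successful clean, as doing so can crash the JVM.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical helper: best-effort explicit release of a direct or mapped buffer.
class BufferCleanerSketch {

    // Returns true if a cleaner was invoked, false if unsupported or unavailable.
    static boolean tryClean(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return false; // heap buffers have nothing to clean
        }
        try {
            // Java 9+: sun.misc.Unsafe.invokeCleaner(ByteBuffer)
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Object unsafe = field.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
            return true;
        } catch (Exception | LinkageError modernPathUnavailable) {
            // Fall through to the older route.
        }
        try {
            // Java 8 and earlier: ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean()
            Method cleanerMethod = Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner");
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner == null) {
                return false; // e.g. slices and duplicates carry no cleaner
            }
            cleaner.getClass().getMethod("clean").invoke(cleaner);
            return true;
        } catch (Exception | LinkageError legacyPathUnavailable) {
            return false;
        }
    }
}
```

Returning a boolean rather than throwing lets calling code treat explicit cleaning as an optimisation and carry on when the internals are not available.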

Last edited 10 years ago by mtbcarroll (previous) (diff)

comment:17 Changed 10 years ago by mtbcarroll

"may be of use": https://github.com/openmicroscopy/openmicroscopy/pull/1865 opened accordingly. It affects the code quoted in this ticket's description, i.e. in calculating the checksum.

comment:18 Changed 10 years ago by jamoore

  • Owner mtbcarroll deleted

With Mark away, this will need to get looked into by someone else.

comment:19 Changed 10 years ago by jamoore

One Sun error report also mentions http://stackoverflow.com/questions/3773775/default-for-xxmaxdirectmemorysize which, if useful (assuming we can ever reproduce this), would allow us to give users something to try without needing to recompile.
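
Whether a JVM flag of that kind helps depends on which buffer pool is actually filling up. As a diagnostic aid, assuming Java 7 or later, the standard java.lang.management.BufferPoolMXBean interface can list the JVM's buffer pools with their counts and sizes. The class name BufferPoolReportSketch is hypothetical, and the pool names reported are JVM-specific.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: one summary line per JVM buffer pool.
class BufferPoolReportSketch {

    static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // getCount() is the number of buffers; sizes are in bytes.
            lines.add(pool.getName() + ": count=" + pool.getCount()
                    + ", used=" + pool.getMemoryUsed()
                    + ", capacity=" + pool.getTotalCapacity());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Logging this before and after a heavy pixel-data operation might show whether mapped buffers accumulate, which could help in reproducing the problem.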

comment:20 Changed 10 years ago by jamoore

Suggestion from Chris: see #6083 which was the same condition. Solution was to use a 64bit JVM.

comment:21 Changed 10 years ago by jamoore

  • Resolution set to fixed
  • Status changed from accepted to closed

I've added a static JVM configuration setting: omero.pixeldata.dispose in https://github.com/openmicroscopy/openmicroscopy/pull/1884

With that, we've probably done all we can do to contain this issue while giving people options, and I'm closing. If we find a reproducible test case, we can consider extending this to test for a 32-bit JVM, check for low memory, etc.

comment:22 Changed 10 years ago by jmoore <josh@…>

(In [da313a48b014f5cba64a6075c6dac5aeba9ad00e/ome.git] on branch develop) Add omero.pixeldata.dispose for static config (See #11675)

Since the cleaning should only be necessary under certain conditions
(most likely a 32-bit JVM), the disposing of the ByteBuffers held by
PixelData instances can be en-/disabled with omero.pixeldata.dispose
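
A minimal sketch of how a boolean switch like this could gate the cleanup. It is an illustration under assumptions, not the code from the pull request: only the property name omero.pixeldata.dispose comes from this ticket, while the class DisposeSettingSketch, its methods, and the use of java.util.Properties are hypothetical.

```java
import java.util.Properties;

// Hypothetical illustration of gating buffer disposal on a configuration property.
class DisposeSettingSketch {

    static final String KEY = "omero.pixeldata.dispose";

    // Reads the flag, falling back to the supplied default when it is unset.
    static boolean isDisposeEnabled(Properties config, boolean defaultValue) {
        String raw = config.getProperty(KEY);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    // Runs the cleanup action only when disposal is enabled; reports whether it ran.
    static boolean disposeIfEnabled(Properties config, boolean defaultValue, Runnable cleanup) {
        if (!isDisposeEnabled(config, defaultValue)) {
            return false;
        }
        cleanup.run();
        return true;
    }
}
```

Passing the default in as a parameter keeps the sketch neutral about which value ships by default.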

comment:23 Changed 10 years ago by Josh Moore <josh@…>

(In [60666baced537e3d42efbc6aa9dbabc2f39147b7/ome.git] on branch develop) Merge pull request #1884 from joshmoore/11675-config-dispose

Add omero.pixeldata.dispose for static config (See #11675)

comment:25 Changed 9 years ago by jmoore <josh@…>

(In [d0d85da25ca5a14bd7c427c795adc43a53d03077/ome.git] on branch develop) Set omero.pixeldata.dispose=true by default (See #11675)

Regularly, the long_running Python tests cause my JVM to
segfault locally when out of memory conditions occur.
Setting dispose to true prevents this from happening.

This proposes setting true as the default. If issues
arise, sysadmins can manually set it back to false.

comment:26 Changed 9 years ago by Josh Moore <josh@…>

(In [da0b2239db83cef3b9aea8ee0bea6027872956df/ome.git] on branch develop) Merge pull request #3371 from joshmoore/dispose-true

Set omero.pixeldata.dispose=true by default (See #11675)
