Task #11675 (closed)
BUG: java.io.IOException: Map failed
Reported by: | spli | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 5.0.0-rc1 |
Component: | General | Version: | 4.4.9 |
Keywords: | n.a. | Cc: | java@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description (last modified by mtbcarroll)
See https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=7351&start=10#p13116
Java may not free MappedByteBuffers, which can lead to java.io.IOException: Map failed being thrown in
ome.io.nio.RomioPixelBuffer.getRegion(RomioPixelBuffer.java:343)
See for example
- http://stackoverflow.com/questions/4666616/error-with-nio-while-trying-to-copy-large-file
- https://issues.apache.org/bugzilla/show_bug.cgi?id=49326
- http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6417205
The last link suggests the following workaround:

    ByteBuffer buffer;
    try {
        buffer = channel.map(READ_ONLY, ofs, n);
    } catch (java.io.IOException e) {
        System.gc();
        System.runFinalization();
        buffer = channel.map(READ_ONLY, ofs, n);
    }
Note other uses of MappedByteBuffer may need to be checked.
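A self-contained version of the retry workaround might look like the sketch below. The names RetryingMapper and mapReadOnly are made up for illustration and are not part of OMERO. System.gc() is only a hint to the JVM and System.runFinalization() is deprecated in recent JDKs, so this is a best-effort retry rather than a guarantee.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

// Hypothetical helper, not an OMERO class.
final class RetryingMapper {

    private RetryingMapper() {
    }

    /**
     * Maps a read-only region of the channel. If the first attempt fails,
     * asks the JVM to collect garbage and run finalizers, then tries once more.
     */
    static MappedByteBuffer mapReadOnly(FileChannel channel, long offset, long size)
            throws IOException {
        try {
            return channel.map(MapMode.READ_ONLY, offset, size);
        } catch (IOException e) {
            // Mappings held by unreachable buffers are only released once
            // those buffers are collected, so request a collection first.
            System.gc();
            System.runFinalization();
            // A second failure propagates to the caller.
            return channel.map(MapMode.READ_ONLY, offset, size);
        }
    }
}
```

The retry is attempted only once so that a genuine I/O problem still surfaces as an IOException.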
Java version as reported by Douglas:
    $ java -version
    java version "1.6.0_33"
    Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Change History (26)
comment:1 Changed 11 years ago by jamoore
- Milestone changed from Unscheduled to 5.0.0-beta2
- Priority changed from minor to critical
comment:2 Changed 11 years ago by mtbcarroll
- Owner set to mtbcarroll
comment:3 Changed 10 years ago by mtbcarroll
- Description modified (diff)
- Status changed from new to accepted
comment:4 Changed 10 years ago by mtbcarroll
(I don't want to just wrap the buffer-using code with exception-catching because it may consume a bunch of memory and successfully complete only to cause an OOM from elsewhere in the codebase.)
comment:5 Changed 10 years ago by jamoore
Mark: was there any resolution on which Java versions suffer from this? Adding calls to gc and finalization seems less than ideal if we could prevent it.
comment:6 Changed 10 years ago by spli
https://www.openmicroscopy.org/community/viewtopic.php?f=5&t=7351&start=20#p13123 suggests it might not be a problem with Java 7. Then again, since it has to do with memory allocation and GC, I wouldn't be surprised if it's intermittent. Can we reproduce this with one of our test files on Java 6?
comment:7 Changed 10 years ago by mtbcarroll
http://bugs.sun.com/view_bug.do?bug_id=4724038 suggests that the general issue is not fixed, though there is a rather incomplete-looking stab at relief at http://bugs.sun.com/view_bug.do?bug_id=6417205 in what looks to be 1.6b86.
Another option would be to simply refactor away from using MappedByteBuffers at all for this.
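As a rough idea of what refactoring away from mapping could look like, the sketch below copies a region into an ordinary heap buffer with positional reads. RegionReader and readRegion are hypothetical names, not the existing RomioPixelBuffer API, and the sketch assumes the region fits in a single int-sized buffer.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not an OMERO class.
final class RegionReader {

    private RegionReader() {
    }

    /**
     * Reads size bytes starting at offset into a heap buffer, without
     * creating a memory mapping. The returned buffer is ready for reading.
     */
    static ByteBuffer readRegion(FileChannel channel, long offset, int size)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(size);
        long position = offset;
        // A single read may return fewer bytes than requested, so loop.
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position);
            if (n < 0) {
                throw new EOFException("region extends past end of file");
            }
            position += n;
        }
        buffer.flip();
        return buffer;
    }
}
```

The trade-off is an extra copy into the Java heap, but the memory is then reclaimed by the normal collector and no native address space stays reserved after the buffer becomes unreachable.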
comment:8 Changed 10 years ago by mtbcarroll
The reports of problems that I've seen have been specific to 32-bit JREs. Perhaps with 64-bit the GC and finalization always occur before address space is depleted.
comment:9 Changed 10 years ago by spli
According to Douglas, the original server error occurred with 64-bit Java 1.6.
comment:10 Changed 10 years ago by mtbcarroll
I will see if I can write a failing test.
(Update: no luck so far!)
comment:11 Changed 10 years ago by mtbcarroll
He wouldn't be running the server on Windows, right?
comment:12 Changed 10 years ago by jamoore
Server log certainly used /home-style paths.
comment:13 Changed 10 years ago by mtbcarroll
Okay. Still no luck reproducing this, with OpenJDK 6 and Oracle Java SDK 1.6.
comment:14 Changed 10 years ago by mtbcarroll
Do we have the actual problem images, or some idea of which of ours may behave similarly? (Especially if the problem can be reliably reproduced.)
comment:15 Changed 10 years ago by mtbcarroll
Note: whatever fixes are made for this ticket should be tested also on Windows, as its native memory allocation and management isn't necessarily as good.
comment:16 Changed 10 years ago by mtbcarroll
It should be noted that something like ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean() may be of use even though it is horrifying and should be avoided. In particular, though it seems presently okay for both OpenJDK and Oracle's SDK, there is no official guarantee that the class will remain available.
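For illustration only, an explicit release along these lines could be sketched with reflection so that it compiles without referencing internal classes directly. BufferUnmapper is a hypothetical name and this is not the code in the pull request. The sketch assumes a HotSpot/OpenJDK runtime: it first tries sun.misc.Unsafe.invokeCleaner (available from Java 9) and otherwise falls back to the cleaner().clean() call mentioned above. The buffer must never be touched after a successful release, since accessing unmapped memory can crash the JVM.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical helper relying on unsupported JDK internals.
final class BufferUnmapper {

    private BufferUnmapper() {
    }

    /** Returns true if the buffer's native memory was explicitly released. */
    static boolean unmap(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return false; // heap buffers have nothing to release
        }
        return unmapViaUnsafe(buffer) || unmapViaCleaner(buffer);
    }

    // Java 9+: sun.misc.Unsafe.invokeCleaner(ByteBuffer).
    private static boolean unmapViaUnsafe(ByteBuffer buffer) {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Object unsafe = field.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false; // method missing (Java 8) or buffer is a slice/duplicate
        }
    }

    // Java 8 and earlier: ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean().
    private static boolean unmapViaCleaner(ByteBuffer buffer) {
        try {
            Method cleanerMethod =
                    Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner");
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner == null) {
                return false; // slices and duplicates carry no cleaner
            }
            cleaner.getClass().getMethod("clean").invoke(cleaner);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false; // internals not accessible on this runtime
        }
    }
}
```

Returning false instead of throwing lets callers fall back to waiting for the garbage collector when the internals are unavailable.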
comment:17 Changed 10 years ago by mtbcarroll
"may be of use": https://github.com/openmicroscopy/openmicroscopy/pull/1865 opened accordingly. It affects the code quoted in this ticket's description, i.e. in calculating the checksum.
comment:18 Changed 10 years ago by jamoore
- Owner mtbcarroll deleted
With Mark away, this will need to get looked into by someone else.
comment:19 Changed 10 years ago by jamoore
One Sun error report also mentions http://stackoverflow.com/questions/3773775/default-for-xxmaxdirectmemorysize, which, if useful (assuming we can ever reproduce this), would give users something to try without needing to recompile.
comment:20 Changed 10 years ago by jamoore
Suggestion from Chris: see #6083, which was the same condition. The solution was to use a 64-bit JVM.
comment:21 Changed 10 years ago by jamoore
- Resolution set to fixed
- Status changed from accepted to closed
I've added a static JVM configuration setting: omero.pixeldata.dispose in https://github.com/openmicroscopy/openmicroscopy/pull/1884
With that, we've probably done all we can do to contain this issue while giving people options, and I'm closing. If we find a reproducible test case, we can consider extending this to test for a 32-bit JVM, check for low memory, etc.
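If a 32-bit check were ever added, one possible shape is sketched below. JvmDataModel is a hypothetical name. The sketch relies on the sun.arch.data.model system property, which HotSpot sets to "32" or "64" but which other JVMs may not define, so a missing or unrecognised value is treated as "not known to be 32-bit".

```java
// Hypothetical helper, not an OMERO class.
final class JvmDataModel {

    private JvmDataModel() {
    }

    /** Interprets a sun.arch.data.model value; null or unknown yields false. */
    static boolean is32Bit(String dataModel) {
        return "32".equals(dataModel);
    }

    /** Checks the running JVM via the HotSpot-specific system property. */
    static boolean is32BitJvm() {
        return is32Bit(System.getProperty("sun.arch.data.model"));
    }
}
```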
comment:22 Changed 10 years ago by jmoore <josh@…>
(In [da313a48b014f5cba64a6075c6dac5aeba9ad00e/ome.git] on branch develop) Add omero.pixeldata.dispose for static config (See #11675)
Since the cleaning should only be necessary under certain conditions
(most likely a 32-bit JVM), the disposing of the ByteBuffers held by
PixelData instances can be en-/disabled with omero.pixeldata.dispose
comment:23 Changed 10 years ago by Josh Moore <josh@…>
(In [60666baced537e3d42efbc6aa9dbabc2f39147b7/ome.git] on branch develop) Merge pull request #1884 from joshmoore/11675-config-dispose
Add omero.pixeldata.dispose for static config (See #11675)
comment:24 Changed 10 years ago by mtbcarroll
Note https://github.com/openmicroscopy/openmicroscopy/pull/1865#issuecomment-30127891 about exception handling in clean().
comment:25 Changed 9 years ago by jmoore <josh@…>
(In [d0d85da25ca5a14bd7c427c795adc43a53d03077/ome.git] on branch develop) Set omero.pixeldata.dispose=true by default (See #11675)
Regularly, the long_running Python tests cause my JVM to
segfault locally when out of memory conditions occur.
Setting dispose to true prevents this from happening.
This proposes setting true as the default. If issues
arise, sysadmins can manually set it back to false.
comment:26 Changed 9 years ago by Josh Moore <josh@…>
(In [da0b2239db83cef3b9aea8ee0bea6027872956df/ome.git] on branch develop) Merge pull request #3371 from joshmoore/dispose-true
Set omero.pixeldata.dispose=true by default (See #11675)
From the point of view of adjusting existing code that uses mapped byte buffers, the cleanest-looking change would be to write another ByteBuffer subclass that offers a close() method which nulls a wrapped byte buffer and then triggers GC and finalization in a separate thread. However, the new subclass would have to live in the java.nio package because none of the ByteBuffer constructors are public: is that acceptable?
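One way to avoid placing a class in java.nio would be composition instead of subclassing. The sketch below is only an illustration of that alternative, with the hypothetical name ClosableBuffer: it wraps a buffer, drops its own reference on close(), and requests GC and finalization from a daemon thread. It cannot help if callers keep their own reference to the buffer returned by get(), and System.gc() remains a hint.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

// Hypothetical wrapper, not an OMERO class.
final class ClosableBuffer implements AutoCloseable {

    private ByteBuffer buffer;

    ClosableBuffer(ByteBuffer buffer) {
        this.buffer = Objects.requireNonNull(buffer, "buffer");
    }

    /** Returns the wrapped buffer, or fails if close() was already called. */
    synchronized ByteBuffer get() {
        if (buffer == null) {
            throw new IllegalStateException("buffer has been closed");
        }
        return buffer;
    }

    /** Drops the reference and asks for a collection in the background. */
    @Override
    public synchronized void close() {
        if (buffer == null) {
            return; // closing twice is harmless
        }
        buffer = null;
        Thread collector = new Thread(() -> {
            System.gc();
            System.runFinalization();
        }, "buffer-collector");
        collector.setDaemon(true); // never keeps the JVM alive
        collector.start();
    }
}
```

Because it implements AutoCloseable, the wrapper can be used in a try-with-resources statement around the code that reads from the mapped region.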