
Task #11709 (closed)

Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

BUG: Tile loading failure not retried

Reported by: pwalczysko
Owned by: wmoore
Priority: critical
Milestone: 5.0.0-rc1
Component: Web
Version: 5.0.0-beta1
Keywords: n.a.
Cc: fs@…, ux@…, mlinkert
Resources: n.a.
Referenced By: n.a.
References: n.a.
Remaining Time: n.a.
Sprint: OMERO 5 Beta 2 (1)

Description

Reported by @bpindelski during Phase I of upgrade testing (line 17 of the gdoc https://docs.google.com/spreadsheet/ccc?key=0AoKiTAl8UOxndGt5bm5mOWl4M1lMc1NkQzVVRW5CZGc&usp=drive_web#gid=13). Testing was performed on the omero4-demo server with the develop database, which was upgraded from the dev_4_4 database (originally from Howe). All images were original dev_4_4 images carried through the upgrade; no new FS imports were in the DB at that point. bpindelski was viewing big images.

bpindelski:
"Broken image" icons for some tiles. Tiles re-appear when zoom level changed. Might be related to caching.

PW:
Cannot repeat on this image, but repeated twice on gatan, on IE 9 Win7 and on Firefox Mac 10.8 (see the 2 screenshots). It is repeatable there, whereas on Howe this very image has no problems. Is this upgrade-specific?

Attachments (4)

broken_tiles.png (413.9 KB) - added by pwalczysko 8 years ago.
Tile loading hanging IE9 Win 7.png (641.6 KB) - added by pwalczysko 8 years ago.
Tile loading hanging Firefox Mac 10.8.png (409.9 KB) - added by pwalczysko 8 years ago.
Tile loading hanging Firefox Mac 10.8 b.png (336.6 KB) - added by pwalczysko 8 years ago.


Change History (22)

Changed 8 years ago by pwalczysko

comment:1 Changed 8 years ago by pwalczysko

Was able to repeat on Win7 IE9 even after clearing my browser cache.

comment:2 Changed 8 years ago by jamoore

  • Version changed from 3.0-Beta1 to 5.0.0-beta1

comment:3 Changed 8 years ago by pwalczysko

@wmoore: I have imported the image to Gretzky. Tested Firefox on Mac - no problems on Gretzky detected.

comment:4 Changed 8 years ago by wmoore

I'm seeing this in the omero4-demo web logs when tiles are missing

2013-11-18 11:17:17,223 WARNI [                           omero.gateway] (proc.26515) debug:3514 LockTimeout on <class 'omeroweb.webclient.webclient_gateway.OmeroWebSafeCallWrapper'> to <eb0f5296-aef1-407d-84eb-00672990b7a9omero.api.RenderingEngine> load((<ServiceOptsDict: {'omero.session.uuid': 'e5531aef-2e2b-47c4-b52c-64e58aafc42d', 'omero.group': '6', 'omero.user': '13', 'omero.client.uuid': '08deb50b-e2ca-408c-bfb8-4c6c53c6bc97'}>,), {})
Traceback (most recent call last):
  File "/opt/hudson/workspace/OMERO-merge-develop-FS/src/dist/lib/python/omero/gateway/__init__.py", line 3532, in __call__
    return self.f(*args, **kwargs)
  File "/opt/hudson/workspace/OMERO-merge-develop-FS/src/dist/lib/python/omero_api_RenderingEngine_ice.py", line 410, in load
    return _M_omero.api.RenderingEngine._op_load.invoke(self, ((), _ctx))
LockTimeout: exception ::omero::LockTimeout
{
    serverStackTrace = ome.conditions.LockTimeout: /repositories/OMERO-merge-develop-FS/Pixels/Dir-007/7532_pyramid is locked by others
    at ome.io.bioformats.BfPyramidPixelBuffer.initializeReader(BfPyramidPixelBuffer.java:189)
    at ome.io.bioformats.BfPyramidPixelBuffer.<init>(BfPyramidPixelBuffer.java:173)
    at ome.io.bioformats.BfPyramidPixelBuffer.<init>(BfPyramidPixelBuffer.java:143)
    at ome.io.nio.PixelsService.createPyramidPixelBuffer(PixelsService.java:766)
    at ome.io.nio.PixelsService._getPixelBuffer(PixelsService.java:487)
...
	at Ice.ConnectionI.message(ConnectionI.java:1163)
	at IceInternal.ThreadPool.run(ThreadPool.java:302)
	at IceInternal.ThreadPool.access$300(ThreadPool.java:12)
	at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:643)
	at java.lang.Thread.run(Thread.java:722)

    serverExceptionClass = ome.conditions.LockTimeout
    message = /repositories/OMERO-merge-develop-FS/Pixels/Dir-007/7532_pyramid is locked by others
    backOff = 15000
    seconds = 0
}
2013-11-18 11:17:17,235 ERROR [                 omeroweb.feedback.views] (proc.26515) handler500:141 handler500: Server error

Rendering Engine init fails during image.setActiveChannels().

comment:5 Changed 8 years ago by wmoore

To reproduce, log in as user-10 to https://omero4-demo.openmicroscopy.org:1443/webclient/img_detail/9732/ and zoom, pan etc. Not browser specific as far as I can see.

comment:6 Changed 8 years ago by wmoore

Blitz image.getChannels() is failing to get a rendering engine initialised. Since it has a decorator stating that ConcurrencyExceptions can be ignored, we don't handle this.
We decided to ignore the exception here because getChannels() is sometimes used only to get the channel names, and we don't need the RE then. Really, we should be trying again after a "short" time (how long is short?).

    @assert_re(ignoreExceptions=(omero.ConcurrencyException))
    def getChannels (self):

comment:7 Changed 8 years ago by wmoore

One solution would be to wrap setActiveChannels() with an @assert_re() (not ignoring ConcurrencyException).

Then assert_re.call() needs to handle ConcurrencyException, or we have to handle it somewhere else?

comment:8 Changed 8 years ago by wmoore

  • Owner changed from web-team@… to wmoore

comment:9 Changed 8 years ago by wmoore

Chris, do you have an idea about how we want to handle this? If we assume that after a ConcurrencyException we can retry very quickly (e.g. in less than a second), then potentially we could do this directly in the Blitz Gateway, under assert_re.call(). However, if the wait needs to be a lot longer, then we probably need to return something that tells the JavaScript to try again some time later, which is much harder. Since we're expecting to get back a rendered tile (jpeg), what can we send back that tells PanoJS to try loading the tile again in 10 - 15 seconds (or longer)?
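
A minimal sketch of the in-gateway retry idea, assuming Python. The `ConcurrencyException` class below is a local stand-in for `omero.ConcurrencyException`, carrying the `backOff` value seen in the log in comment 4 (assumed here to be in milliseconds). The names `retry_on_concurrency`, `max_wait_ms`, `attempts` and `sleep` are hypothetical and not part of the Blitz Gateway API. The wrapper retries only when the suggested back-off is short, and re-raises otherwise so that the caller can deal with a longer wait some other way.

```python
import functools
import time


class ConcurrencyException(Exception):
    """Stand-in for omero.ConcurrencyException (assumption: backOff in ms)."""

    def __init__(self, message, backOff):
        super().__init__(message)
        self.backOff = backOff


def retry_on_concurrency(max_wait_ms=1000, attempts=2, sleep=time.sleep):
    """Retry the wrapped call when the suggested back-off is short enough."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ConcurrencyException as exc:
                    # Give up on the last attempt, or when the server asks
                    # for a longer wait than we are willing to block for.
                    if attempt == attempts or exc.backOff > max_wait_ms:
                        raise
                    sleep(exc.backOff / 1000.0)
        return wrapper
    return decorator
```

With the `backOff = 15000` reported in the log above and a `max_wait_ms` of 1000, this wrapper would re-raise immediately rather than block the web request, which is the "a lot longer" case described in this comment.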

comment:10 Changed 8 years ago by wmoore

Summary of Skype call (Josh, Chris, Carlos & me, Will): Josh will look at reducing the pyramid file lock to a read-only lock, allowing many simultaneous reads/inits. I will look at adding error handling to the tiles (or the tiles container), see http://api.jquery.com/error/. If a tile fails to load, retry the load ONCE. If that fails, see if we can give the user a "refresh" button that reloads tiles (just the failed tiles?) without losing the viewport location.
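
The retry-once policy can be modelled as follows. The real handling would live in the PanoJS JavaScript, so this Python sketch only illustrates the policy; `load_tiles`, `fetch` and `max_retries` are hypothetical names.

```python
def load_tiles(tile_urls, fetch, max_retries=1):
    """Load each tile, retrying a failed load at most max_retries times.

    fetch is a hypothetical callable that returns tile data or raises IOError.
    Returns (loaded, failed): a dict of url -> data, and the list of URLs that
    still failed, which a "refresh" action could pass back in later without
    touching the viewport position.
    """
    loaded, failed = {}, []
    for url in tile_urls:
        for _attempt in range(max_retries + 1):
            try:
                loaded[url] = fetch(url)
                break  # tile loaded, stop retrying
            except IOError:
                pass  # fall through to the next attempt, if any
        else:
            # Every attempt failed: remember the tile for a manual refresh.
            failed.append(url)
    return loaded, failed
```

Keeping the failed URLs separate from the loaded tiles means a refresh only needs to re-request the failures.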

comment:11 Changed 8 years ago by wmoore

It seems that event bubbling of "error" doesn't work (see http://forum.jquery.com/topic/error-event-with-live), so we'll have to add the handler to each tile img.

comment:12 Changed 8 years ago by wmoore

  • Summary changed from BUG: Tile loading stuck after upgrade to BUG: Tile loading failure not retried

Intermittent tile failure is also an issue on dev_4_4, e.g. pan around this public image at 100%: https://nightshade.openmicroscopy.org/webgateway/img_detail/3946831/
PR opened on dev_4_4 for testing: https://github.com/openmicroscopy/openmicroscopy/pull/1839

comment:13 Changed 8 years ago by jamoore

Can this be closed with the open PRs or is there more work so that this needs to be pushed?

comment:14 Changed 8 years ago by wmoore

  • Resolution set to fixed
  • Status changed from new to closed

At one point we discussed the idea of having a "Refresh" button to reload tiles that failed on their second attempt (or never loaded for some other reason), but I'm not sure this is necessary now. Certainly not in this "Phase 1". So we can close this now.

comment:15 Changed 8 years ago by Will Moore <will@…>

(In [e33174f56a9d80b7830b22ec3dfee6b57dc955a0/ome.git] on branch develop) Initial error handling of tile image loading in PanoJS See #11709

comment:16 Changed 8 years ago by Josh Moore <josh@…>

(In [0ca9b6a1a2c60436923f222b4fa5e7775cc851ac/ome.git] on branch develop) Merge pull request #1818 from will-moore/tile_loading_11709

Error handling of tile image loading in PanoJS See #11709

comment:17 Changed 8 years ago by Will Moore <will@…>

(In [2880017202c2d44491e7f6e88a535876bce0c407/ome.git] on branch dev_4_4) Initial error handling of tile image loading in PanoJS See #11709

comment:18 Changed 8 years ago by Josh Moore <josh@…>

(In [31830e3f2735d4da8253f4f8ba74570ab096c99c/ome.git] on branch dev_4_4) Merge pull request #1839 from will-moore/tile_loading_11709_dev_4_4

Error handling of tile image loading in PanoJS See #11709
