
Task #10662 (closed)

Opened 11 years ago

Closed 9 years ago

ome-mage / emdb instability

Reported by: khgillen
Owned by: wmoore
Priority: major
Milestone: Work in Progress
Component: Web
Version: 5.0.8
Keywords: n.a.
Cc: jamoore
Resources: n.a.
Referenced By: n.a.
References: n.a.
Remaining Time: n.a.
Sprint: n.a.

Description

This issue has been tracked as Mantis ticket https://mantis.lifesci.dundee.ac.uk/view.php?id=97773 since the server was provisioned.

Server provisioning and deployment took place in Dec 2012. Since then, we have had intermittent outages that appear to correct themselves, with each downtime usually lasting around 5 to 15 minutes.

Monitoring checks at a 5-minute resolution:

Uptime: 93.50%
Downtime: 1d 23h 30m
Number of downtimes: 358
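
As a rough sanity check on the figures above, the sketch below derives the mean outage length and the implied monitoring window. It assumes the report covers one contiguous window and that uptime and downtime together account for all of it; the ticket does not state either, so treat the results as estimates.

```python
from datetime import timedelta

# Figures copied from the monitoring report in this ticket.
downtime = timedelta(days=1, hours=23, minutes=30)
outages = 358
uptime_fraction = 0.9350

# Mean length of one outage, in minutes.
mean_outage_min = downtime.total_seconds() / 60 / outages

# Assumption: downtime is the remaining (1 - uptime) share of the window.
window = downtime / (1 - uptime_fraction)
window_days = window.total_seconds() / 86400

print("mean outage: %.1f min" % mean_outage_min)
print("implied window: %.1f days" % window_days)
```

A mean of roughly 8 minutes per outage is consistent with the 5 to 15 minute range described above.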

You may want to move this to a different milestone if you're going to work on it.

Kenny

Change History (5)

comment:1 Changed 11 years ago by jamoore

  • Cc jamoore added
  • Owner changed from wmoore, jamoore to wmoore

comment:2 Changed 11 years ago by khgillen

Traceback (most recent call last):
  File "/mage/staging/OMERO-CURRENT/lib/python/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/mage/staging/OMERO-CURRENT/lib/python/omeroweb/webemdb/views.py", line 820, in index
    conn = getConnection(request)
  File "/mage/staging/OMERO-CURRENT/lib/python/omeroweb/webemdb/views.py", line 1126, in getConnection
    logger.debug('emdb connection: %s server %s' % (conn._sessionUuid, blitz.host))
AttributeError: 'NoneType' object has no attribute '_sessionUuid'
Another stacktrace:

<WSGIRequest
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
COOKIES:{},
META:{'DOCUMENT_ROOT': '/var/www/html',

'GATEWAY_INTERFACE': 'CGI/1.1',
'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'HTTP_ACCEPT_ENCODING': 'gzip, deflate',
'HTTP_ACCEPT_LANGUAGE': 'en-us',
'HTTP_CONNECTION': 'keep-alive',
'HTTP_HOST': 'emdb.openmicroscopy.org.uk',
'HTTP_USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.29.13 (KHTML, like Gecko) Version/6.0.4 Safari/536.29.13',
'PATH': '/sbin:/usr/sbin:/bin:/usr/bin',
'PATH_INFO': u'/webemdb/',
'PATH_TRANSLATED': '/mage/staging/OMERO-CURRENT/var/omero.fcgi/webemdb/',
'QUERY_STRING': '',
'REMOTE_ADDR': '10.12.0.185',
'REMOTE_PORT': '59698',
'REQUEST_METHOD': 'GET',
'REQUEST_URI': '/webemdb/',
'SCRIPT_FILENAME': '/mage/staging/OMERO-CURRENT/var/omero.fcgi',
'SCRIPT_NAME': u'',
'SCRIPT_URI': 'http://emdb.openmicroscopy.org.uk/webemdb/',
'SCRIPT_URL': '/webemdb/',
'SERVER_ADDR': '134.36.65.232',
'SERVER_ADMIN': 'root@localhost',
'SERVER_NAME': 'emdb.openmicroscopy.org.uk',
'SERVER_PORT': '80',
'SERVER_PROTOCOL': 'HTTP/1.1',
'SERVER_SIGNATURE': '',
'SERVER_SOFTWARE': 'Apache/2.2.15 (CentOS)',
'wsgi.errors': <flup.server.fcgi_base.TeeOutputStream object at 0x4238990>,
'wsgi.input': <flup.server.fcgi_base.InputStream object at 0x5363650>,
'wsgi.multiprocess': True,
'wsgi.multithread': False,
'wsgi.run_once': False,
'wsgi.url_scheme': 'http',
'wsgi.version': (1, 0)}>
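
The AttributeError in the traceback comes from the debug log line reading `conn._sessionUuid` when the connection is None. A minimal sketch of a guarded version follows; `describe_connection` and `log_connection` are hypothetical helper names, not functions from omeroweb, and the warning message text is made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def describe_connection(conn, host):
    """Build the log message without assuming conn is not None."""
    if conn is None:
        return 'emdb connection failed: server %s' % host
    return 'emdb connection: %s server %s' % (conn._sessionUuid, host)


def log_connection(conn, host):
    """Log the connection state; return True if a connection exists."""
    msg = describe_connection(conn, host)
    if conn is None:
        # A failed connection is worth more than a debug-level message.
        logger.warning(msg)
    else:
        logger.debug(msg)
    return conn is not None
```

With a guard like this, the view could return an error page when `log_connection` returns False instead of raising from inside the logging call.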

comment:3 Changed 11 years ago by wmoore

We seem to be getting logging statements from the code below when connection fails.

Just turned debug on so we get the stack traces from here too:

$ bin/omero config set omero.web.debug true

def _createConnection (server_id, sUuid=None, username=None, passwd=None, host=None, port=None, retry=True, group=None, try_super=False, secure=False, anonymous=False, useragent=None):
    """
    Attempts to create a L{omero.gateway.BlitzGateway} connection.
    Tries to join an existing session for the specified user, using sUuid.
    
    @param server_id:   Way of referencing the server, used in connection dict keys. Int or String
    @param sUuid:       Session ID - used for attempts to join sessions etc without password
    @param username:    User name to log on with
    @param passwd:      Password
    @param host:        Host name
    @param port:        Port number
    @param retry:       Boolean
    @param group:       String? TODO: parameter is ignored. 
    @param try_super:   If True, try to log on as super user, 'system' group
    @param secure:      If True, use an encrypted connection
    @param anonymous:   Boolean
    @param useragent:   Log which python clients use this connection. E.g. 'OMERO.webadmin'
    @return:            The connection
    @rtype:             L{omero.gateway.BlitzGateway}
    """
    try:
        blitzcon = client_wrapper(username, passwd, host=host, port=port, group=None, try_super=try_super, secure=secure, anonymous=anonymous, useragent=useragent)
        blitzcon.connect(sUuid=sUuid)
        blitzcon.server_id = server_id
        blitzcon.user = UserProxy(blitzcon)
        if blitzcon._anonymous and hasattr(blitzcon.c, 'onEventLogs'):
            logger.debug('Connecting weblitz_cache to eventslog')
            def eventlistener (e):
                return webgateway_cache.eventListener(server_id, e)
            blitzcon.c.onEventLogs(eventlistener)
        return blitzcon
    except:
        logger.debug(traceback.format_exc())
        if not retry:
            return None
        logger.error("Critical error during connect, retrying after _purge")
        logger.debug(traceback.format_exc())
        _purge(force=True)
        return _createConnection(server_id, sUuid, username, passwd, retry=False, host=host, port=port, group=None, try_super=try_super, anonymous=anonymous, useragent=useragent)

def _purge (force=False):
    if force or len(connectors) > CONNECTOR_POOL_SIZE:
        keys = connectors.keys()
        for i in range(int(len(connectors)*CONNECTOR_POOL_KEEP)):
            try:
                c = connectors.pop(keys[i])
                c.seppuku(softclose=True)
            except:
                logger.debug(traceback.format_exc())
        logger.info('reached connector_pool_size (%d), size after purge: (%d)' %
                    (CONNECTOR_POOL_SIZE, len(connectors)))
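
One detail in the code above: the retry call in `_createConnection` does not pass `secure`, so a retried connection falls back to `secure=False`. The sketch below shows the same purge-then-retry-once pattern with the arguments forwarded unchanged. `connect_with_retry` and its `connect` and `purge` parameters are hypothetical names used for illustration, not part of OMERO.web.

```python
import logging

logger = logging.getLogger(__name__)


def connect_with_retry(connect, purge, **kwargs):
    """Call connect(**kwargs); on failure, purge once and try again.

    Both attempts receive identical keyword arguments.
    Returns None if the second attempt also fails.
    """
    for attempt in (1, 2):
        try:
            return connect(**kwargs)
        except Exception:
            # exc_info=True attaches the traceback to the log record.
            logger.debug('connect attempt %d failed', attempt, exc_info=True)
            if attempt == 1:
                logger.error('Critical error during connect, retrying after purge')
                purge()
    return None
```

Callers still need to handle a None result, as the traceback in comment:2 shows.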

comment:4 Changed 9 years ago by jburel

  • Version set to 5.0.8

Can we close that ticket?
I will assume we can

comment:5 Changed 9 years ago by wmoore

  • Resolution set to fixed
  • Status changed from new to closed

Yep - all fine now.
