Notice: In order to edit this ticket you need to be either: a Product Owner, The owner or the reporter of the ticket, or, in case of a Task not yet assigned, a team_member"

Task #12314 (closed)

Opened 6 years ago

Closed 5 years ago

Scripts test test_inputs fails locally with 'Process still running'

Reported by: sbesson Owned by:
Priority: minor Milestone: Unscheduled
Component: Scripting Version: 5.0.1
Keywords: n.a. Cc: cblackburn, jamoore, spli
Resources: n.a. Referenced By: n.a.
References: n.a. Remaining Time: n.a.
Sprint: n.a.

Description

As reported initially by Blazej in https://github.com/openmicroscopy/openmicroscopy/pull/2528#issuecomment-43868379 and reconfirmed by myself while testing, the test_inputs.py integration tests can fail systematically using local servers. This error is not reproducible on all local servers and does not happen on the CI node performing the integration tests.

Full stack trace of the pytest error (with extra logging) is provided below:

======================================================================== test session starts ========================================================================
platform darwin -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2 -- /usr/local/opt/python/bin/python2.7
collected 1 items 

test/integration/scriptstest/test_inputs.py:98: TestInputs.testInputs FAILED

============================================================================= FAILURES ==============================================================================
_______________________________________________________________________ TestInputs.testInputs _______________________________________________________________________

self = <test_inputs.TestInputs object at 0x10513b550>

    def testInputs(self):
        import logging
        logging.basicConfig(level=10)
        scripts = self.root.getSession().getScriptService()
        sendfile = SENDFILE % self.omeropydir()
        id = scripts.uploadScript(
            "/tests/inputs_py/%s.py" % self.uuid(), sendfile)
        input = {"a": rint(100)}
        impl = omero.processor.usermode_processor(self.root)
        try:
            process = scripts.runScript(id, input, None)
            cb = omero.scripts.ProcessCallbackI(self.root, process)
            try:
                count = 100
                while cb.block(500):
                    count -= 1
                    assert count != 0
                rc = process.poll()
>               results = process.getResults(0)

test/integration/scriptstest/test_inputs.py:116: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = ProcessCallback/f6f8e0a7-20c0-43d9-b67b-3aaac69c0b35 -t -e 1.0:tcp -h 129.215.237.135 -p 53526, waitSecs = 0, _ctx = None

    def getResults(self, waitSecs, _ctx=None):
>       return _M_omero.grid.ScriptProcess._op_getResults.invoke(self, ((waitSecs, ), _ctx))
    
    '''Returns the results immediately if present. If the process
    is not yet finished, waits "waitSecs" before throwing an
    [omero.ApiUsageException]. If poll has returned a non-null
E   ApiUsageException: exception ::omero::ApiUsageException
E   {
E       serverStackTrace = 
E       serverExceptionClass = 
E       message = Process still running.
E   }

target/omero_Scripts_ice.py:929: ApiUsageException
-------------------------------------------------------------------------- Captured stderr --------------------------------------------------------------------------
DEBUG:omero.processor.ProcessorI:Created
INFO:omero.util.Resources:Starting
DEBUG:omero.util.Resources:Adding object <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Executing
INFO:omero.processor.ProcessorI:Registering processor 67\"C-%T(pB}Q0!0n4;ij/UsermodeProcessor-86c0ea3c-fa5b-4dd5-822e-5ff5c02f3656 -t -e 1.0:tcp -h 127.0.0.1 -p 53489
DEBUG:omero.util.Resources:Checking <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Sleeping 60
INFO:omero.remote: Meth: ProcessorI.willAccept
DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:2 - Valid: True
INFO:omero.remote: Rslt: True
INFO:omero.remote: Meth: ProcessorI.willAccept
DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:3 - Valid: True
INFO:omero.remote: Rslt: True
INFO:omero.remote: Meth: ProcessorI.parseJob
INFO:omero.processor.ProcessorI:parseJob: Session = 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c, JobId = 3
INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7
INFO:omero.processor.ProcessorI:Using process: 
DEBUG:omero.processor.ProcessI:Created
DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
INFO:omero.processor.ProcessI:Created 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c in /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c>
INFO:omero.processor.ProcessorI:Downloaded file: 32
INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Activated
DEBUG:omero.perf:start[1400841236783] time[95] tag[omero.processor.ProcessorI.process]
INFO:omero.remote: Meth: ProcessI.wait
INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Waiting
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Deactivating
INFO:omero.remote: Meth: ProcessI.shutdown
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841237457] time[0] tag[omero.processor.ProcessI.shutdown]
DEBUG:omero.processor.ProcessI:client: session-5501fdb3-d4ae-4269-8fa2-6eacbf6d44a0/6e8ce0fe-1896-485a-a3f5-ec46a61bf16c -t -e 1.0 @ BlitzAdapters
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Changed job status from Running to Error
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : No stdout
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Uploaded 424 bytes of /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir/err to 33
DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Lived 0s. Deactivation took 0s.
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Callback processFinished
INFO:omero.remote: Rslt: 1
DEBUG:omero.perf:start[1400841236879] time[766] tag[omero.processor.ProcessI.wait]
INFO:omero.remote: Rslt: object #0 (::omero::grid::JobParams)
{
    name = test_inputs.py
    version = 
    description = 
    contact = 
    au
INFO:omero.remote: Meth: ProcessorI.processJob
INFO:omero.processor.ProcessorI:processJob: Session = b9980a13-f9d9-4881-a0aa-279527f6e461, JobId = 2
DEBUG:omero.processor.ProcessorI:Checking params for job 2
INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7
INFO:omero.processor.ProcessorI:Using process: 
DEBUG:omero.processor.ProcessI:Created
DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
INFO:omero.processor.ProcessI:Created b9980a13-f9d9-4881-a0aa-279527f6e461 in /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461>
INFO:omero.processor.ProcessorI:Downloaded file: 32
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Activated
DEBUG:omero.perf:start[1400841237718] time[88] tag[omero.processor.ProcessorI.process]
INFO:omero.remote: Rslt: 67\"C-%T(pB}Q0!0n4;ij/Process-9fd869ba-4d6e-4465-9059-3aaa9c7ee573 -t -e 1.0:tcp -h 127.0.0.1 -p 53489
INFO:omero.remote: Meth: ProcessI.registerCallback
DEBUG:omero.processor.ProcessI:Added callback: ProcessCallback/888fdc98-08a8-495f-97cb-f8ec065ce4e8
INFO:omero.remote: Rslt: None
INFO:omero.remote: Meth: ProcessI.registerCallback
DEBUG:omero.processor.ProcessI:Added callback: 67"C-%T(pB}Q0!0n4;ij/7b0f9995-7a84-4526-9bb3-e85b5267f0d6
INFO:omero.remote: Rslt: None
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841237837] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238340] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238342] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238343] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.processor.ProcessorI:Cleaning up
DEBUG:omero.util.Resources:Cleaning <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Cleaning <proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c>
DEBUG:omero.perf:start[1400841238346] time[0] tag[omero.processor.ProcessI.cleanup]
DEBUG:omero.util.Resources:Cleaning <proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461>
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Deactivating
INFO:omero.remote: Meth: ProcessI.shutdown
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Shutdown
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238346] time[1] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 25 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 19 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 13 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 7 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 1 seconds.
INFO:omero.remote: Meth: ProcessI.kill
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : popen.kill(True)
DEBUG:omero.processor.ProcessI:err on pid=10394 iskill=True : [Errno 1] Operation not permitted
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processKilled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238350] time[0] tag[omero.processor.ProcessI.kill]
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238346] time[4] tag[omero.processor.ProcessI.shutdown]
DEBUG:omero.processor.ProcessI:client: session-3760fe3d-0d73-42d8-96f0-b4b7fa46110f/b9980a13-f9d9-4881-a0aa-279527f6e461 -t -e 1.0 @ BlitzAdapters
INFO:omero.util.Resources:Halted
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Changed job status from Running to Cancelled
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stdout
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stderr
DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Lived 0s. Deactivation took 0s.
DEBUG:omero.processor.ProcessI:Can't get session for cleanup
DEBUG:omero.perf:start[1400841238346] time[92] tag[omero.processor.ProcessI.cleanup]
DEBUG:omero.util.Resources:Cleanup done
INFO:omero.processor.ProcessorI:Done
===================================================================== 1 failed in 14.01 seconds =====================================================================
sebastien@sce-bio-c03722:OmeroPy ((71bbbc7...)) $ 

Change History (10)

comment:1 Changed 6 years ago by jamoore

As mentioned via chat, this may be an issue of the check process.poll() return non-null in one context but not in another due to brief concurrency issues. Note: server-side we wrap the Python ProcessPrx with a ScriptProcessPrx. The two may need to be better synchronized or allow for some delay.

comment:2 Changed 6 years ago by jamoore

Another error in the same test, though this time more of a traditional "timeout": http://ci.openmicroscopy.org/view/Failing/job/OMERO-5.0-merge-integration-python/261/testReport/test.integration.scriptstest.test_inputs/TestInputs/testInputs/

I'm assuming it will also be transient.

comment:4 Changed 6 years ago by jamoore

  • Milestone changed from 5.1.0-m4 to 5.1.1

comment:5 Changed 5 years ago by jamoore

  • Cc cblackburn added; bpindelski removed

Thoughts on pushing?

comment:6 Changed 5 years ago by jamoore

  • Milestone changed from 5.1.1 to 5.1.2

Pushing.

comment:7 Changed 5 years ago by jamoore

  • Milestone changed from 5.1.4 to OMERO-5.1.4

Splitting 5.1.4 due to milestone decoupling

comment:8 Changed 5 years ago by jburel

  • Milestone changed from OMERO-5.1.4 to 5.x

comment:9 Changed 5 years ago by jamoore

  • Milestone changed from 5.x to Unscheduled

comment:10 Changed 5 years ago by sbesson

  • Resolution set to duplicate
  • Status changed from new to closed
Note: See TracTickets for help on using tickets. You may also have a look at Agilo extensions to the ticket.

1.3.13-PRO © 2008-2011 Agilo Software all rights reserved (this page was served in: 0.194022 sec.)

We're Hiring!