Task #12314 (closed)
Opened 10 years ago
Closed 8 years ago
Scripts test test_inputs fails locally with 'Process still running'
Reported by: | sbesson | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | Unscheduled |
Component: | Scripting | Version: | 5.0.1 |
Keywords: | n.a. | Cc: | cblackburn, jamoore, spli |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
As reported initially by Blazej in https://github.com/openmicroscopy/openmicroscopy/pull/2528#issuecomment-43868379 and reconfirmed by myself while testing, the test_inputs.py integration tests can fail systematically using local servers. This error is not reproducible on all local servers and does not happen on the CI node performing the integration tests.
Full stack trace of the pytest error (with extra logging) is provided below:
======================================================================== test session starts ======================================================================== platform darwin -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2 -- /usr/local/opt/python/bin/python2.7 collected 1 items test/integration/scriptstest/test_inputs.py:98: TestInputs.testInputs FAILED ============================================================================= FAILURES ============================================================================== _______________________________________________________________________ TestInputs.testInputs _______________________________________________________________________ self = <test_inputs.TestInputs object at 0x10513b550> def testInputs(self): import logging logging.basicConfig(level=10) scripts = self.root.getSession().getScriptService() sendfile = SENDFILE % self.omeropydir() id = scripts.uploadScript( "/tests/inputs_py/%s.py" % self.uuid(), sendfile) input = {"a": rint(100)} impl = omero.processor.usermode_processor(self.root) try: process = scripts.runScript(id, input, None) cb = omero.scripts.ProcessCallbackI(self.root, process) try: count = 100 while cb.block(500): count -= 1 assert count != 0 rc = process.poll() > results = process.getResults(0) test/integration/scriptstest/test_inputs.py:116: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = ProcessCallback/f6f8e0a7-20c0-43d9-b67b-3aaac69c0b35 -t -e 1.0:tcp -h 129.215.237.135 -p 53526, waitSecs = 0, _ctx = None def getResults(self, waitSecs, _ctx=None): > return _M_omero.grid.ScriptProcess._op_getResults.invoke(self, ((waitSecs, ), _ctx)) '''Returns the results immediately if present. If the process is not yet finished, waits "waitSecs" before throwing an [omero.ApiUsageException]. If poll has returned a non-null E ApiUsageException: exception ::omero::ApiUsageException E { E serverStackTrace = E serverExceptionClass = E message = Process still running. E } target/omero_Scripts_ice.py:929: ApiUsageException -------------------------------------------------------------------------- Captured stderr -------------------------------------------------------------------------- DEBUG:omero.processor.ProcessorI:Created INFO:omero.util.Resources:Starting DEBUG:omero.util.Resources:Adding object <omero.processor.UseSessionHolder object at 0x104c46f90> DEBUG:omero.util.Resources:Executing INFO:omero.processor.ProcessorI:Registering processor 67\"C-%T(pB}Q0!0n4;ij/UsermodeProcessor-86c0ea3c-fa5b-4dd5-822e-5ff5c02f3656 -t -e 1.0:tcp -h 127.0.0.1 -p 53489 DEBUG:omero.util.Resources:Checking <omero.processor.UseSessionHolder object at 0x104c46f90> DEBUG:omero.util.Resources:Sleeping 60 INFO:omero.remote: Meth: ProcessorI.willAccept DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:2 - Valid: True INFO:omero.remote: Rslt: True INFO:omero.remote: Meth: ProcessorI.willAccept DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:3 - Valid: True INFO:omero.remote: Rslt: True INFO:omero.remote: Meth: ProcessorI.parseJob INFO:omero.processor.ProcessorI:parseJob: Session = 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c, JobId = 3 INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7 INFO:omero.processor.ProcessorI:Using process: DEBUG:omero.processor.ProcessI:Created DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir INFO:omero.processor.ProcessI:Created 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c in /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> INFO:omero.processor.ProcessorI:Downloaded file: 32 INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Activated DEBUG:omero.perf:start[1400841236783] time[95] tag[omero.processor.ProcessorI.process] INFO:omero.remote: Meth: ProcessI.wait INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Waiting INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Deactivating INFO:omero.remote: Meth: ProcessI.shutdown INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841237457] time[0] tag[omero.processor.ProcessI.shutdown] DEBUG:omero.processor.ProcessI:client: session-5501fdb3-d4ae-4269-8fa2-6eacbf6d44a0/6e8ce0fe-1896-485a-a3f5-ec46a61bf16c -t -e 1.0 @ BlitzAdapters INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Changed job status from Running to Error INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : No stdout INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Uploaded 424 bytes of /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir/err to 33 DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Lived 0s. Deactivation took 0s. INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Callback processFinished INFO:omero.remote: Rslt: 1 DEBUG:omero.perf:start[1400841236879] time[766] tag[omero.processor.ProcessI.wait] INFO:omero.remote: Rslt: object #0 (::omero::grid::JobParams) { name = test_inputs.py version = description = contact = au INFO:omero.remote: Meth: ProcessorI.processJob INFO:omero.processor.ProcessorI:processJob: Session = b9980a13-f9d9-4881-a0aa-279527f6e461, JobId = 2 DEBUG:omero.processor.ProcessorI:Checking params for job 2 INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7 INFO:omero.processor.ProcessorI:Using process: DEBUG:omero.processor.ProcessI:Created DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir INFO:omero.processor.ProcessI:Created b9980a13-f9d9-4881-a0aa-279527f6e461 in /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> INFO:omero.processor.ProcessorI:Downloaded file: 32 INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Activated DEBUG:omero.perf:start[1400841237718] time[88] tag[omero.processor.ProcessorI.process] INFO:omero.remote: Rslt: 67\"C-%T(pB}Q0!0n4;ij/Process-9fd869ba-4d6e-4465-9059-3aaa9c7ee573 -t -e 1.0:tcp -h 127.0.0.1 -p 53489 INFO:omero.remote: Meth: ProcessI.registerCallback DEBUG:omero.processor.ProcessI:Added callback: ProcessCallback/888fdc98-08a8-495f-97cb-f8ec065ce4e8 INFO:omero.remote: Rslt: None INFO:omero.remote: Meth: ProcessI.registerCallback DEBUG:omero.processor.ProcessI:Added callback: 67"C-%T(pB}Q0!0n4;ij/7b0f9995-7a84-4526-9bb3-e85b5267f0d6 INFO:omero.remote: Rslt: None INFO:omero.remote: Meth: ProcessI.poll INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841237837] time[0] tag[omero.processor.ProcessI.poll] INFO:omero.remote: Meth: ProcessI.poll INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841238340] time[0] tag[omero.processor.ProcessI.poll] INFO:omero.remote: Meth: ProcessI.poll INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841238342] time[0] tag[omero.processor.ProcessI.poll] INFO:omero.remote: Meth: ProcessI.poll INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841238343] time[0] tag[omero.processor.ProcessI.poll] INFO:omero.processor.ProcessorI:Cleaning up DEBUG:omero.util.Resources:Cleaning <omero.processor.UseSessionHolder object at 0x104c46f90> DEBUG:omero.util.Resources:Cleaning <proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> DEBUG:omero.perf:start[1400841238346] time[0] tag[omero.processor.ProcessI.cleanup] DEBUG:omero.util.Resources:Cleaning <proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Deactivating INFO:omero.remote: Meth: ProcessI.shutdown INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Shutdown INFO:omero.remote: Meth: ProcessI.cancel INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM) INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238346] time[1] tag[omero.processor.ProcessI.cancel] WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 25 seconds. INFO:omero.remote: Meth: ProcessI.cancel INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM) INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel] WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 19 seconds. INFO:omero.remote: Meth: ProcessI.cancel INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM) INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel] WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 13 seconds. INFO:omero.remote: Meth: ProcessI.cancel INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM) INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel] WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 7 seconds. INFO:omero.remote: Meth: ProcessI.cancel INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM) INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel] WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 1 seconds. INFO:omero.remote: Meth: ProcessI.kill INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : popen.kill(True) DEBUG:omero.processor.ProcessI:err on pid=10394 iskill=True : [Errno 1] Operation not permitted INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processKilled INFO:omero.remote: Rslt: False DEBUG:omero.perf:start[1400841238350] time[0] tag[omero.processor.ProcessI.kill] INFO:omero.remote: Rslt: None DEBUG:omero.perf:start[1400841238346] time[4] tag[omero.processor.ProcessI.shutdown] DEBUG:omero.processor.ProcessI:client: session-3760fe3d-0d73-42d8-96f0-b4b7fa46110f/b9980a13-f9d9-4881-a0aa-279527f6e461 -t -e 1.0 @ BlitzAdapters INFO:omero.util.Resources:Halted INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Changed job status from Running to Cancelled INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stdout INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stderr DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Lived 0s. Deactivation took 0s. DEBUG:omero.processor.ProcessI:Can't get session for cleanup DEBUG:omero.perf:start[1400841238346] time[92] tag[omero.processor.ProcessI.cleanup] DEBUG:omero.util.Resources:Cleanup done INFO:omero.processor.ProcessorI:Done ===================================================================== 1 failed in 14.01 seconds ===================================================================== sebastien@sce-bio-c03722:OmeroPy ((71bbbc7...)) $
Change History (10)
comment:1 Changed 10 years ago by jamoore
comment:2 Changed 10 years ago by jamoore
Another error in the same test, though this time more of a traditional "timeout": http://ci.openmicroscopy.org/view/Failing/job/OMERO-5.0-merge-integration-python/261/testReport/test.integration.scriptstest.test_inputs/TestInputs/testInputs/
I'm assuming it will also be transient.
comment:3 Changed 10 years ago by mtbcarroll
https://github.com/openmicroscopy/openmicroscopy/pull/2744 adds an xfail to it.
comment:4 Changed 9 years ago by jamoore
- Milestone changed from 5.1.0-m4 to 5.1.1
comment:5 Changed 9 years ago by jamoore
- Cc cblackburn added; bpindelski removed
Thoughts on pushing?
comment:7 Changed 9 years ago by jamoore
- Milestone changed from 5.1.4 to OMERO-5.1.4
Splitting 5.1.4 due to milestone decoupling
comment:8 Changed 9 years ago by jburel
- Milestone changed from OMERO-5.1.4 to 5.x
comment:9 Changed 8 years ago by jamoore
- Milestone changed from 5.x to Unscheduled
comment:10 Changed 8 years ago by sbesson
- Resolution set to duplicate
- Status changed from new to closed
As mentioned via chat, this may be an issue of the check process.poll() return non-null in one context but not in another due to brief concurrency issues. Note: server-side we wrap the Python ProcessPrx with a ScriptProcessPrx. The two may need to be better synchronized or allow for some delay.