Task #12314 (closed)
Opened 5 years ago
Closed 3 years ago
Scripts test test_inputs fails locally with 'Process still running'
| Reported by: | sbesson | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | Unscheduled |
| Component: | Scripting | Version: | 5.0.1 |
| Keywords: | n.a. | Cc: | cblackburn, jamoore, spli |
| Resources: | n.a. | Referenced By: | n.a. |
| References: | n.a. | Remaining Time: | n.a. |
| Sprint: | n.a. |
Description
As reported initially by Blazej in https://github.com/openmicroscopy/openmicroscopy/pull/2528#issuecomment-43868379 and reconfirmed by myself while testing, the test_inputs.py integration tests can fail systematically using local servers. This error is not reproducible on all local servers and does not happen on the CI node performing the integration tests.
Full stack trace of the pytest error (with extra logging) is provided below:
======================================================================== test session starts ========================================================================
platform darwin -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2 -- /usr/local/opt/python/bin/python2.7
collected 1 items
test/integration/scriptstest/test_inputs.py:98: TestInputs.testInputs FAILED
============================================================================= FAILURES ==============================================================================
_______________________________________________________________________ TestInputs.testInputs _______________________________________________________________________
self = <test_inputs.TestInputs object at 0x10513b550>
def testInputs(self):
import logging
logging.basicConfig(level=10)
scripts = self.root.getSession().getScriptService()
sendfile = SENDFILE % self.omeropydir()
id = scripts.uploadScript(
"/tests/inputs_py/%s.py" % self.uuid(), sendfile)
input = {"a": rint(100)}
impl = omero.processor.usermode_processor(self.root)
try:
process = scripts.runScript(id, input, None)
cb = omero.scripts.ProcessCallbackI(self.root, process)
try:
count = 100
while cb.block(500):
count -= 1
assert count != 0
rc = process.poll()
> results = process.getResults(0)
test/integration/scriptstest/test_inputs.py:116:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = ProcessCallback/f6f8e0a7-20c0-43d9-b67b-3aaac69c0b35 -t -e 1.0:tcp -h 129.215.237.135 -p 53526, waitSecs = 0, _ctx = None
def getResults(self, waitSecs, _ctx=None):
> return _M_omero.grid.ScriptProcess._op_getResults.invoke(self, ((waitSecs, ), _ctx))
'''Returns the results immediately if present. If the process
is not yet finished, waits "waitSecs" before throwing an
[omero.ApiUsageException]. If poll has returned a non-null
E ApiUsageException: exception ::omero::ApiUsageException
E {
E serverStackTrace =
E serverExceptionClass =
E message = Process still running.
E }
target/omero_Scripts_ice.py:929: ApiUsageException
-------------------------------------------------------------------------- Captured stderr --------------------------------------------------------------------------
DEBUG:omero.processor.ProcessorI:Created
INFO:omero.util.Resources:Starting
DEBUG:omero.util.Resources:Adding object <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Executing
INFO:omero.processor.ProcessorI:Registering processor 67\"C-%T(pB}Q0!0n4;ij/UsermodeProcessor-86c0ea3c-fa5b-4dd5-822e-5ff5c02f3656 -t -e 1.0:tcp -h 127.0.0.1 -p 53489
DEBUG:omero.util.Resources:Checking <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Sleeping 60
INFO:omero.remote: Meth: ProcessorI.willAccept
DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:2 - Valid: True
INFO:omero.remote: Rslt: True
INFO:omero.remote: Meth: ProcessorI.willAccept
DEBUG:omero.processor.ProcessorI:Accepts called on: user:0 group:0 scriptjob:3 - Valid: True
INFO:omero.remote: Rslt: True
INFO:omero.remote: Meth: ProcessorI.parseJob
INFO:omero.processor.ProcessorI:parseJob: Session = 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c, JobId = 3
INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7
INFO:omero.processor.ProcessorI:Using process:
DEBUG:omero.processor.ProcessI:Created
DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
INFO:omero.processor.ProcessI:Created 6e8ce0fe-1896-485a-a3f5-ec46a61bf16c in /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c>
INFO:omero.processor.ProcessorI:Downloaded file: 32
INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Activated
DEBUG:omero.perf:start[1400841236783] time[95] tag[omero.processor.ProcessorI.process]
INFO:omero.remote: Meth: ProcessI.wait
INFO:omero.processor.ProcessI:<proc:10393,rc=-,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Waiting
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Deactivating
INFO:omero.remote: Meth: ProcessI.shutdown
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841237457] time[0] tag[omero.processor.ProcessI.shutdown]
DEBUG:omero.processor.ProcessI:client: session-5501fdb3-d4ae-4269-8fa2-6eacbf6d44a0/6e8ce0fe-1896-485a-a3f5-ec46a61bf16c -t -e 1.0 @ BlitzAdapters
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Changed job status from Running to Error
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : No stdout
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Uploaded 424 bytes of /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir/err to 33
DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processuLszem.dir
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Lived 0s. Deactivation took 0s.
INFO:omero.processor.ProcessI:<proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c> : Callback processFinished
INFO:omero.remote: Rslt: 1
DEBUG:omero.perf:start[1400841236879] time[766] tag[omero.processor.ProcessI.wait]
INFO:omero.remote: Rslt: object #0 (::omero::grid::JobParams)
{
name = test_inputs.py
version =
description =
contact =
au
INFO:omero.remote: Meth: ProcessorI.processJob
INFO:omero.processor.ProcessorI:processJob: Session = b9980a13-f9d9-4881-a0aa-279527f6e461, JobId = 2
DEBUG:omero.processor.ProcessorI:Checking params for job 2
INFO:omero.processor.ProcessorI:Using launcher: /usr/local/opt/python/bin/python2.7
INFO:omero.processor.ProcessorI:Using process:
DEBUG:omero.processor.ProcessI:Created
DEBUG:omero.util.TempFileManager:Added folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
INFO:omero.processor.ProcessI:Created b9980a13-f9d9-4881-a0aa-279527f6e461 in /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
DEBUG:omero.util.Resources:Adding object <proc:None,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461>
INFO:omero.processor.ProcessorI:Downloaded file: 32
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Activated
DEBUG:omero.perf:start[1400841237718] time[88] tag[omero.processor.ProcessorI.process]
INFO:omero.remote: Rslt: 67\"C-%T(pB}Q0!0n4;ij/Process-9fd869ba-4d6e-4465-9059-3aaa9c7ee573 -t -e 1.0:tcp -h 127.0.0.1 -p 53489
INFO:omero.remote: Meth: ProcessI.registerCallback
DEBUG:omero.processor.ProcessI:Added callback: ProcessCallback/888fdc98-08a8-495f-97cb-f8ec065ce4e8
INFO:omero.remote: Rslt: None
INFO:omero.remote: Meth: ProcessI.registerCallback
DEBUG:omero.processor.ProcessI:Added callback: 67"C-%T(pB}Q0!0n4;ij/7b0f9995-7a84-4526-9bb3-e85b5267f0d6
INFO:omero.remote: Rslt: None
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841237837] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238340] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238342] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.remote: Meth: ProcessI.poll
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Polling
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238343] time[0] tag[omero.processor.ProcessI.poll]
INFO:omero.processor.ProcessorI:Cleaning up
DEBUG:omero.util.Resources:Cleaning <omero.processor.UseSessionHolder object at 0x104c46f90>
DEBUG:omero.util.Resources:Cleaning <proc:10393,rc=1,uuid=6e8ce0fe-1896-485a-a3f5-ec46a61bf16c>
DEBUG:omero.perf:start[1400841238346] time[0] tag[omero.processor.ProcessI.cleanup]
DEBUG:omero.util.Resources:Cleaning <proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461>
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Deactivating
INFO:omero.remote: Meth: ProcessI.shutdown
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Shutdown
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238346] time[1] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 25 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 19 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238348] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 13 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 7 seconds.
INFO:omero.remote: Meth: ProcessI.cancel
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : os.kill(TERM)
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processCancelled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238349] time[0] tag[omero.processor.ProcessI.cancel]
WARNING:omero.processor.ProcessI:Shutdown: 10394 (b9980a13-f9d9-4881-a0aa-279527f6e461). Killing in 1 seconds.
INFO:omero.remote: Meth: ProcessI.kill
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : popen.kill(True)
DEBUG:omero.processor.ProcessI:err on pid=10394 iskill=True : [Errno 1] Operation not permitted
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Callback processKilled
INFO:omero.remote: Rslt: False
DEBUG:omero.perf:start[1400841238350] time[0] tag[omero.processor.ProcessI.kill]
INFO:omero.remote: Rslt: None
DEBUG:omero.perf:start[1400841238346] time[4] tag[omero.processor.ProcessI.shutdown]
DEBUG:omero.processor.ProcessI:client: session-3760fe3d-0d73-42d8-96f0-b4b7fa46110f/b9980a13-f9d9-4881-a0aa-279527f6e461 -t -e 1.0 @ BlitzAdapters
INFO:omero.util.Resources:Halted
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Changed job status from Running to Cancelled
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stdout
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : No stderr
DEBUG:omero.util.TempFileManager:Removed folder /Users/sebastien/omero/tmp/omero_sebastien/10387/processCWSuKp.dir
INFO:omero.processor.ProcessI:<proc:10394,rc=-,uuid=b9980a13-f9d9-4881-a0aa-279527f6e461> : Lived 0s. Deactivation took 0s.
DEBUG:omero.processor.ProcessI:Can't get session for cleanup
DEBUG:omero.perf:start[1400841238346] time[92] tag[omero.processor.ProcessI.cleanup]
DEBUG:omero.util.Resources:Cleanup done
INFO:omero.processor.ProcessorI:Done
===================================================================== 1 failed in 14.01 seconds =====================================================================
sebastien@sce-bio-c03722:OmeroPy ((71bbbc7...)) $
Change History (10)
comment:1 Changed 5 years ago by jamoore
comment:2 Changed 5 years ago by jamoore
Another error in the same test, though this time more of a traditional "timeout": http://ci.openmicroscopy.org/view/Failing/job/OMERO-5.0-merge-integration-python/261/testReport/test.integration.scriptstest.test_inputs/TestInputs/testInputs/
I'm assuming it will also be transient.
comment:3 Changed 5 years ago by mtbcarroll
https://github.com/openmicroscopy/openmicroscopy/pull/2744 adds an xfail to it.
comment:4 Changed 5 years ago by jamoore
- Milestone changed from 5.1.0-m4 to 5.1.1
comment:5 Changed 4 years ago by jamoore
- Cc cblackburn added; bpindelski removed
Thoughts on pushing?
comment:7 Changed 4 years ago by jamoore
- Milestone changed from 5.1.4 to OMERO-5.1.4
Splitting 5.1.4 due to milestone decoupling
comment:8 Changed 4 years ago by jburel
- Milestone changed from OMERO-5.1.4 to 5.x
comment:9 Changed 3 years ago by jamoore
- Milestone changed from 5.x to Unscheduled
comment:10 Changed 3 years ago by sbesson
- Resolution set to duplicate
- Status changed from new to closed
As mentioned via chat, this may be an issue of the check process.poll() return non-null in one context but not in another due to brief concurrency issues. Note: server-side we wrap the Python ProcessPrx with a ScriptProcessPrx. The two may need to be better synchronized or allow for some delay.