Task #10890 (closed)
Pre-process datasets in DoAll
Reported by: | jamoore | Owned by: | mtbcarroll |
---|---|---|---|
Priority: | critical | Milestone: | 5.0.0-beta1 |
Component: | Services | Version: | n.a. |
Keywords: | fs | Cc: | fs@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | 0.0d |
Sprint: | FS Demo 4.3 |
Description
As a follow-on to #10847, Datasets should also be pre-processed. This is markedly more complex than just pre-processing the Images, since there are then two container hierarchies to take into account.
A possible solution might be to implement #10859 and have a query which takes an arbitrary listing of images, datasets, and projects and filters out all elements for which MIFs are "complete" (or "good").
The same query then could also be used client-side to remove the need for server-side pre-processing if it is determined to be too difficult to detect what the user's intent was.
See: https://github.com/openmicroscopy/openmicroscopy/pull/1142 for the image pre-processing implementation.
Change History (18)
comment:1 Changed 11 years ago by mtbcarroll
- Status changed from new to accepted
comment:2 Changed 11 years ago by mtbcarroll
comment:3 Changed 11 years ago by jamoore
Probably we'll want to prefer collapsing to fileset for all cases, i.e., it becomes the most basic hierarchy we have, and we translate all the other ones (when dealing with MIFs) to it.
For SPW, almost certainly, but the special case there is that plates will (almost) always be a single MIF. But there is still the possibility of interactions with Datasets, etc.
comment:4 Changed 11 years ago by mtbcarroll
a query which takes an arbitrary listing of images, datasets, and projects and filters out all elements for which MIFs are "complete"
Perhaps separately from this ticket, this is certainly a query which could be written. Would a useful return value be the IDs of the filesets "split" by the given entities, as keys in a multimap whose values are the image IDs of the "missing" images that would need to be added to complete each fileset?
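The return value suggested here could look like the following minimal sketch. It is not the OMERO query itself: it assumes two hypothetical in-memory mappings, `fileset_of_image` (image ID to fileset ID) and `images_of_fileset` (fileset ID to set of image IDs), standing in for the database, and it assumes any datasets or projects have already been expanded to image IDs.

```python
from collections import defaultdict


def find_split_filesets(selected_images, fileset_of_image, images_of_fileset):
    """Map each partly covered ("split") fileset ID to its missing image IDs.

    All three arguments are hypothetical stand-ins for database lookups.
    """
    selected = set(selected_images)
    missing = defaultdict(set)
    for image_id in selected:
        fileset_id = fileset_of_image.get(image_id)
        if fileset_id is None:
            continue  # image does not belong to any fileset
        absent = images_of_fileset[fileset_id] - selected
        if absent:
            missing[fileset_id] |= absent
    return dict(missing)


# Hypothetical data: fileset 501 holds images 1 and 2, fileset 502 holds image 3.
fileset_of_image = {1: 501, 2: 501, 3: 502}
images_of_fileset = {501: {1, 2}, 502: {3}}

# Selecting images 1 and 3 splits fileset 501, because image 2 is left out.
split = find_split_filesets([1, 3], fileset_of_image, images_of_fileset)
```

An empty result would mean that every fileset touched by the selection is complete, so nothing needs to be added.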
comment:5 Changed 11 years ago by wmoore
NB: see #10782 for more details of the MIF query.
comment:6 Changed 11 years ago by mtbcarroll
Okay, so the initial plan on this ticket is: where the listed projects, datasets, images are such that all the images of a fileset are covered, make sure that fileset precedes the first request that implicitly mentions anything from it, and then don't include the requests for the images that the fileset entails. (All within the context of a specific operation -- chgrp, delete, whatever.)
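A rough sketch of that plan, under stated assumptions: requests are modelled as hypothetical `(type, id)` tuples rather than real `ChgrpI` or delete requests, the three mappings are in-memory stand-ins for database lookups, the incoming list holds only project, dataset, and image targets, and all requests belong to the same operation. It illustrates the ordering rule only, not the blitz preprocessor's actual code.

```python
def preprocess(requests, images_of_target, fileset_of_image, images_of_fileset):
    """Insert a fileset request before the first request that touches it,
    for every fileset whose images are all covered by the request list.
    """
    # Every image mentioned, directly or through a container, by any request.
    covered = set()
    for target in requests:
        covered |= images_of_target[target]

    # Filesets whose images are all covered.
    complete = {fs for fs, images in images_of_fileset.items()
                if images and images <= covered}

    result = []
    emitted = set()
    for target in requests:
        kind, _ = target
        touched = sorted({fileset_of_image[i] for i in images_of_target[target]
                          if fileset_of_image.get(i) in complete})
        for fs in touched:
            if fs not in emitted:
                emitted.add(fs)
                result.append(("Fileset", fs))
        # An image request is redundant once its complete fileset is requested.
        if kind == "Image" and touched:
            continue
        result.append(target)
    return result


# Hypothetical data: fileset 501 spans two datasets, 502 is a single image,
# and 503 also contains image 5, which no request mentions.
images_of_target = {("Dataset", 101): {1}, ("Dataset", 102): {2},
                    ("Image", 3): {3}, ("Image", 4): {4}}
fileset_of_image = {1: 501, 2: 501, 3: 502, 4: 503}
images_of_fileset = {501: {1, 2}, 502: {3}, 503: {4, 5}}

requests = [("Dataset", 101), ("Dataset", 102), ("Image", 3), ("Image", 4)]
ordered = preprocess(requests, images_of_target, fileset_of_image, images_of_fileset)
# ordered == [("Fileset", 501), ("Dataset", 101), ("Dataset", 102),
#             ("Fileset", 502), ("Image", 4)]
```

Container requests are kept after the fileset request because they still carry the container itself; only the image requests that the fileset entails are dropped. The request for image 4 stays untouched because its fileset is not fully covered.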
comment:7 Changed 11 years ago by jburel
- Sprint changed from FS demo 4.2 to FS Demo 4.3
Moved from sprint FS demo 4.2
comment:8 Changed 11 years ago by mtbcarroll
When this is fixed both integration.chgrp.TestChgrp.testChgrpAllImagesFilesetOK and integration.chgrp.TestChgrp.testChgrpAllDatasetsFilesetOK should be passing.
comment:9 Changed 11 years ago by mtbcarroll
So, testChgrpAllDatasetsFilesetOK creates a two-image fileset and puts one image in one dataset and the other in the other. My preprocessing code detects this and adds the fileset request before the two dataset requests in the list. For instance, in one run, we go from ChgrpI(/Dataset:101), ChgrpI(/Dataset:102) to ChgrpI(/Fileset:501), ChgrpI(/Dataset:101), ChgrpI(/Dataset:102), which accords with what's in the database. But, then there is a later failure with,
Failed to process Dataset/DatasetImageLink: 101 due to GraphException: No top-level item found: update DatasetImageLink set details.group.id = :grp where id = :id (id=101, grp=505)
Should the preprocessor be doing something different?
comment:10 Changed 11 years ago by jamoore
Running on gretzky2 just now, I get:
testChgrpAllDatasetsFilesetOK (integration.chgrp.TestChgrp) ... FAIL
======================================================================
FAIL: testChgrpAllDatasetsFilesetOK (integration.chgrp.TestChgrp)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/chgrp.py", line 308, in testChgrpAllDatasetsFilesetOK
    self.doAllSubmit([chgrp1,chgrp2], client)
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/library.py", line 649, in doAllSubmit
    rsp = self.doSubmit(da, client, test_should_pass=test_should_pass, omero_group=omero_group)
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/library.py", line 637, in doSubmit
    self.fail("Found ERR when test_should_pass==true: %s (%s) params=%s" % (rsp.category, rsp.name, rsp.parameters))
AssertionError: Found ERR when test_should_pass==true: ::omero::cmd::Chgrp (STEP ERR) params={'stacktrace': 'ome.services.graphs.GraphConstraintException(message=Fileset:95 improperly linked by 1 objects\n\tat ome.services.graphs.GraphStep.graphValidation(GraphStep.java:407)\n\tat ome.services.chgrp.ChgrpStep.action(ChgrpStep.java:98)\n\tat ome.services.graphs.GraphState.execute(GraphState.java:351)\n\tat omero.cmd.graphs.ChgrpI.step(ChgrpI.java:157)\n\tat omero.cmd.basic.DoAllI$X.step(DoAllI.java:107)\n\tat omero.cmd.basic.DoAllI.step(DoAllI.java:319)\n\tat omero.cmd.HandleI.steps(HandleI.java:435)\n\tat omero.cmd.HandleI$1.doWork(HandleI.java:365)\n\tat omero.cmd.HandleI$1.doWork(HandleI.java:363)\n\tat sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)\n\tat java.lang.reflect.Method.invoke(Method.java:592)\n\tat org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)\n\tat ome.services.util.Executor$Impl$Interceptor.invoke(Executor.java:518)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.security.basic.EventHandler.invoke(EventHandler.java:154)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.orm.hibernate3.HibernateInterceptor.invoke(HibernateInterceptor.java:111)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:108)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.tools.hibernate.ProxyCleanupFilter$Interceptor.invoke(ProxyCleanupFilter.java:241)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.services.util.ServiceHandler.invoke(ServiceHandler.java:116)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)\n\tat $Proxy66.doWork(Unknown Source)\n\tat ome.services.util.Executor$Impl.execute(Executor.java:416)\n\tat omero.cmd.HandleI.run(HandleI.java:359)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)\n\tat ome.services.util.Executor$Impl$1.call(Executor.java:447)\n\tat java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:123)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)\n\tat java.lang.Thread.run(Thread.java:595)\n', 'message': '', 'GraphConstraintException': 'true', 'step': '31', 'id': '66'}
----------------------------------------------------------------------
Ran 1 test in 48.492s
FAILED (failures=1)
Result: 1
Entering /opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy...
Does the result depend on how one executes it or against which server? Is it sporadic?
My gut feeling when seeing the error in comment 9 is that somehow the wrong ID is being passed in, but I certainly don't know enough yet to really tell.
comment:11 Changed 11 years ago by mtbcarroll
Okay, from a new build locally, testChgrpAllDatasetsFilesetOK means that the preprocessor gets the requests ChgrpI(/Dataset:1), ChgrpI(/Dataset:2) which it translates to ChgrpI(/Fileset:1), ChgrpI(/Dataset:1), ChgrpI(/Dataset:2).
omero=> select child from datasetimagelink where parent in (1,2);
 child
-------
     1
     2
(2 rows)

omero=> select fileset from image where id in (1,2);
 fileset
---------
       1
       1
(2 rows)

omero=> select id from image where fileset = 1;
 id
----
  2
  1
(2 rows)

omero=>
and each time I get,
Failed to process Dataset/DatasetImageLink: 1 due to GraphException: No top-level item found: update DatasetImageLink set details.group.id = :grp where id = :id (id=1, grp=5)
I could push my work-in-progress branch to GitHub if you like, or even open a pull request?
comment:12 Changed 11 years ago by jburel <j.burel@…>
(In [7c210e90f9ef42f18c943e4fe9cbc6c2b75b4d4e/ome.git] on branch develop) Remove "Show thumbnails" option (see #10890)
comment:13 Changed 11 years ago by mtbcarroll
pushed working branch to https://github.com/mtbc/openmicroscopy/commits/trac-10890
comment:14 Changed 11 years ago by mtbcarroll
With the current merge build, in integration.chgrp.TestChgrp merging in this branch makes testChgrpAllDatasetsFilesetOK pass (yay!) but testChgrpImage then takes too long. So, I'll investigate that problem.
comment:15 Changed 11 years ago by mtbcarroll
Okay, fixed my testChgrpImage issue; will have to wait for https://github.com/openmicroscopy/openmicroscopy/pull/1214 to be merged before opening a pull request.
comment:16 Changed 11 years ago by mtbcarroll
- Resolution set to fixed
- Status changed from accepted to closed
comment:17 Changed 11 years ago by Mark Carroll <m.t.b.carroll@…>
- Remaining Time set to 0
(In [6bcbf539238667425721c79eb61f8147abf568ec/ome.git] on branch develop) fix #10890: extend blitz preprocessor to containers
comment:18 Changed 11 years ago by Josh Moore <josh@…>
(In [6b71b41cb40a1f3195f7d412713a5fe1b6db82d3/ome.git] on branch develop) Merge pull request #1235 from mtbc/trac-10890
fix #10890: extend blitz preprocessor to containers
So, we replace images with datasets if all the dataset's images are there, and datasets with projects if all the project's datasets are there? What about, say, if we start with a set of images that corresponds to both a dataset and a fileset? Is there any reason to prefer "collapsing" up one hierarchy rather than the other?
(Is there a parallel job here regarding screens, plates, wells?)