Task #10890 (closed)

Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

Pre-process datasets in DoAll

Reported by: jamoore Owned by: mtbcarroll
Priority: critical Milestone: 5.0.0-beta1
Component: Services Version: n.a.
Keywords: fs Cc: fs@…
Resources: n.a. Referenced By: n.a.
References: n.a. Remaining Time: 0.0d
Sprint: FS Demo 4.3

Description

As a follow-on to #10847, Datasets should also be pre-processed. This is markedly more complex than just pre-processing the Images, since there are then two container hierarchies to take into account.

A possible solution might be to implement #10859 and have a query which takes an arbitrary listing of images, datasets, and projects and filters out all elements for which MIFs are "complete" (or "good").

The same query then could also be used client-side to remove the need for server-side pre-processing if it is determined to be too difficult to detect what the user's intent was.

See: https://github.com/openmicroscopy/openmicroscopy/pull/1142 for the image pre-processing implementation.

Change History (18)

comment:1 Changed 11 years ago by mtbcarroll

  • Status changed from new to accepted

comment:2 Changed 11 years ago by mtbcarroll

So, we replace images with datasets if all the dataset's images are there, and datasets with projects if all the project's datasets are there? What about, say, if we start with a set of images that corresponds to both a dataset and a fileset? Is there any reason to prefer "collapsing" up one hierarchy rather than the other?

(Is there a parallel job here regarding screens, plates, wells?)

Last edited 11 years ago by mtbcarroll

comment:3 Changed 11 years ago by jamoore

Probably we'll want to prefer collapsing to fileset in all cases, i.e., it becomes the most basic hierarchy we have, and we translate all the other ones (when dealing with MIFs) to it.

For SPW, almost certainly, but the special case there is that plates will (almost) always be a single MIF. But there is still the possibility of interactions with Datasets, etc.

comment:4 Changed 11 years ago by mtbcarroll

a query which takes an arbitrary listing of images, datasets, and projects and filters out all elements for which MIFs are "complete"

Perhaps separately from this ticket, this is certainly a query which could be written. Would a useful return value be the IDs of the filesets "split" by the given entities, as keys in a multimap whose values are the image IDs of the "missing" images that would need to be added to complete each fileset?
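A minimal sketch of the return value proposed here, assuming the image-to-fileset relationship is already available as a plain dictionary. The function name find_split_filesets and the IDs are hypothetical and only illustrate the shape of the multimap; this is not the query from #10782.

```python
from collections import defaultdict


def find_split_filesets(selected_images, image_to_fileset):
    """Map each partially selected fileset ID to the IDs of its missing images."""
    # Invert the image -> fileset mapping to learn every fileset's full membership.
    members = defaultdict(set)
    for image_id, fileset_id in image_to_fileset.items():
        members[fileset_id].add(image_id)

    selected = set(selected_images)
    split = {}
    for fileset_id, image_ids in members.items():
        chosen = image_ids & selected
        missing = image_ids - selected
        # A fileset is "split" only if some, but not all, of its images are selected.
        if chosen and missing:
            split[fileset_id] = missing
    return split


# Hypothetical data: fileset 501 is fully selected, 502 is split, 503 is untouched.
image_to_fileset = {1: 501, 2: 501, 3: 502, 4: 502, 5: 503}
print(find_split_filesets([1, 2, 3], image_to_fileset))
```

An empty result would mean that every fileset touched by the selection is complete, which is the condition a client could check before submitting anything.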

comment:5 Changed 11 years ago by wmoore

NB: see #10782 for more details of the MIF query.

comment:6 Changed 11 years ago by mtbcarroll

Okay, so the initial plan on this ticket is: where the listed projects, datasets, images are such that all the images of a fileset are covered, make sure that fileset precedes the first request that implicitly mentions anything from it, and then don't include the requests for the images that the fileset entails. (All within the context of a specific operation -- chgrp, delete, whatever.)
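The plan above can be sketched roughly as follows. This is an illustration under stated assumptions, not the blitz preprocessor itself: requests are modelled as hypothetical (type, id) tuples rather than real ChgrpI or delete objects, and the container and fileset links are passed in as plain dictionaries instead of being read from the database.

```python
from collections import defaultdict


def preprocess(requests, dataset_images, project_datasets, image_fileset):
    """Put each fully covered fileset ahead of the first request that touches it."""

    def images_of(request):
        # Resolve a request to the image IDs it mentions, directly or via containers.
        kind, ident = request
        if kind == "Image":
            return {ident}
        if kind == "Dataset":
            return set(dataset_images.get(ident, ()))
        if kind == "Project":
            images = set()
            for dataset_id in project_datasets.get(ident, ()):
                images |= set(dataset_images.get(dataset_id, ()))
            return images
        return set()

    # Full membership of every fileset.
    members = defaultdict(set)
    for image_id, fileset_id in image_fileset.items():
        members[fileset_id].add(image_id)

    # Every image covered by the request list as a whole.
    covered = set()
    for request in requests:
        covered |= images_of(request)

    # Filesets whose images are all covered.
    complete = {fs for fs, imgs in members.items() if imgs <= covered}

    result, emitted = [], set()
    for request in requests:
        touched = {image_fileset[i] for i in images_of(request) if i in image_fileset}
        for fileset_id in sorted((touched & complete) - emitted):
            result.append(("Fileset", fileset_id))
            emitted.add(fileset_id)
        # An image request is redundant once its whole fileset is requested.
        if request[0] == "Image" and image_fileset.get(request[1]) in complete:
            continue
        result.append(request)
    return result


# Hypothetical IDs: a two-image fileset whose images sit in two different datasets.
links = dict(dataset_images={101: [1], 102: [2]},
             project_datasets={},
             image_fileset={1: 501, 2: 501})
print(preprocess([("Dataset", 101), ("Dataset", 102)], **links))
print(preprocess([("Image", 1), ("Image", 2)], **links))
```

In this sketch container requests are kept after the fileset request, since the containers themselves still need processing; only image requests are dropped. Incomplete filesets are left untouched so that the existing graph validation can still reject them.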

comment:7 Changed 11 years ago by jburel

  • Sprint changed from FS demo 4.2 to FS Demo 4.3

Moved from sprint FS demo 4.2

comment:8 Changed 11 years ago by mtbcarroll

When this is fixed both integration.chgrp.TestChgrp.testChgrpAllImagesFilesetOK and integration.chgrp.TestChgrp.testChgrpAllDatasetsFilesetOK should be passing.

comment:9 Changed 11 years ago by mtbcarroll

So, testChgrpAllDatasetsFilesetOK creates a two-image fileset and puts one image in one dataset and the other in the other. My preprocessing code detects this and adds the fileset request before the two dataset requests in the list. For instance, in one run, we go from ChgrpI(/Dataset:101), ChgrpI(/Dataset:102) to ChgrpI(/Fileset:501), ChgrpI(/Dataset:101), ChgrpI(/Dataset:102), which accords with what's in the database. But, then there is a later failure with,

Failed to process Dataset/DatasetImageLink: 101 due to GraphException: No top-level item found: update DatasetImageLink  set details.group.id = :grp where id = :id  (id=101, grp=505)

Should the preprocessor be doing something different?

comment:10 Changed 11 years ago by jamoore

Running on gretzky2 just now, I get:

testChgrpAllDatasetsFilesetOK (integration.chgrp.TestChgrp) ... FAIL
======================================================================
FAIL: testChgrpAllDatasetsFilesetOK (integration.chgrp.TestChgrp)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/chgrp.py", line 308, in testChgrpAllDatasetsFilesetOK
    self.doAllSubmit([chgrp1,chgrp2], client)
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/library.py", line 649, in doAllSubmit
    rsp = self.doSubmit(da, client, test_should_pass=test_should_pass, omero_group=omero_group)
  File "/opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy/test/integration/library.py", line 637, in doSubmit
    self.fail("Found ERR when test_should_pass==true: %s (%s) params=%s" % (rsp.category, rsp.name, rsp.parameters))
AssertionError: Found ERR when test_should_pass==true: ::omero::cmd::Chgrp (STEP ERR) params={'stacktrace': 'ome.services.graphs.GraphConstraintException(message=Fileset:95 improperly linked by 1 objects\n\tat ome.services.graphs.GraphStep.graphValidation(GraphStep.java:407)\n\tat ome.services.chgrp.ChgrpStep.action(ChgrpStep.java:98)\n\tat ome.services.graphs.GraphState.execute(GraphState.java:351)\n\tat omero.cmd.graphs.ChgrpI.step(ChgrpI.java:157)\n\tat omero.cmd.basic.DoAllI$X.step(DoAllI.java:107)\n\tat omero.cmd.basic.DoAllI.step(DoAllI.java:319)\n\tat omero.cmd.HandleI.steps(HandleI.java:435)\n\tat omero.cmd.HandleI$1.doWork(HandleI.java:365)\n\tat omero.cmd.HandleI$1.doWork(HandleI.java:363)\n\tat sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)\n\tat java.lang.reflect.Method.invoke(Method.java:592)\n\tat org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)\n\tat ome.services.util.Executor$Impl$Interceptor.invoke(Executor.java:518)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.security.basic.EventHandler.invoke(EventHandler.java:154)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.orm.hibernate3.HibernateInterceptor.invoke(HibernateInterceptor.java:111)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:108)\n\tat 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.tools.hibernate.ProxyCleanupFilter$Interceptor.invoke(ProxyCleanupFilter.java:241)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat ome.services.util.ServiceHandler.invoke(ServiceHandler.java:116)\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)\n\tat org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)\n\tat $Proxy66.doWork(Unknown Source)\n\tat ome.services.util.Executor$Impl.execute(Executor.java:416)\n\tat omero.cmd.HandleI.run(HandleI.java:359)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)\n\tat ome.services.util.Executor$Impl$1.call(Executor.java:447)\n\tat java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:123)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)\n\tat java.lang.Thread.run(Thread.java:595)\n', 'message': '', 'GraphConstraintException': 'true', 'step': '31', 'id': '66'}
----------------------------------------------------------------------
Ran 1 test in 48.492s
FAILED (failures=1)
Result: 1
Entering /opt/hudson/workspace/OMERO-merge-develop2/src/components/tools/OmeroPy...

Does the result depend on how one executes it or against which server? Is it sporadic?

My gut feeling when seeing the error in comment 9 is that somehow the wrong ID is being passed in, but I certainly don't know enough yet to really tell.

comment:11 Changed 11 years ago by mtbcarroll

Okay, from a new build locally, testChgrpAllDatasetsFilesetOK means that the preprocessor gets the requests ChgrpI(/Dataset:1), ChgrpI(/Dataset:2) which it translates to ChgrpI(/Fileset:1), ChgrpI(/Dataset:1), ChgrpI(/Dataset:2).

omero=> select child from datasetimagelink where parent in (1,2);
 child 
-------
     1
     2
(2 rows)

omero=> select fileset from image where id in (1,2);
 fileset 
---------
       1
       1
(2 rows)

omero=> select id from image where fileset = 1;
 id 
----
  2
  1
(2 rows)

omero=> 

and each time I get,

Failed to process Dataset/DatasetImageLink: 1 due to GraphException: No top-level item found: update DatasetImageLink  set details.group.id = :grp where id = :id  (id=1, grp=5)

I could push my work-in-progress branch to GitHub if you like, or even open a pull request?

comment:12 Changed 11 years ago by jburel <j.burel@…>

(In [7c210e90f9ef42f18c943e4fe9cbc6c2b75b4d4e/ome.git] on branch develop) Remove "Show thumbnails" option (see #10890)

comment:14 Changed 11 years ago by mtbcarroll

With the current merge build, merging in this branch makes integration.chgrp.TestChgrp.testChgrpAllDatasetsFilesetOK pass (yay!), but testChgrpImage then takes too long, so I'll investigate that problem.

comment:15 Changed 11 years ago by mtbcarroll

Okay, fixed my testChgrpImage issue; will have to wait for https://github.com/openmicroscopy/openmicroscopy/pull/1214 to be merged before opening a pull request.

comment:16 Changed 11 years ago by mtbcarroll

  • Resolution set to fixed
  • Status changed from accepted to closed

comment:17 Changed 11 years ago by Mark Carroll <m.t.b.carroll@…>

  • Remaining Time set to 0

(In [6bcbf539238667425721c79eb61f8147abf568ec/ome.git] on branch develop) fix #10890: extend blitz preprocessor to containers

comment:18 Changed 11 years ago by Josh Moore <josh@…>

(In [6b71b41cb40a1f3195f7d412713a5fe1b6db82d3/ome.git] on branch develop) Merge pull request #1235 from mtbc/trac-10890

fix #10890: extend blitz preprocessor to containers
