Task #11084 (closed)
Improve deletion of original files
Reported by: | cblackburn | Owned by: | mtbcarroll |
---|---|---|---|
Priority: | major | Milestone: | 5.1.0-m4 |
Component: | Services | Version: | n.a. |
Keywords: | n.a. | Cc: | fs@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | 0.0d |
Sprint: | n.a. |
Description
Currently original files are deleted one per transaction. This adds considerably to the delete time for plates. Changing so that file deletes are done as a batch should improve delete performance.
Change History (25)
comment:1 Changed 11 years ago by cblackburn
- Cc fs@… added
- Component changed from General to Services
- Priority changed from minor to major
comment:2 Changed 11 years ago by jburel
- Sprint changed from FS Demo 4.3 to FS demo 4.4
comment:3 Changed 11 years ago by mtbcarroll
- Owner changed from cblackburn to mtbcarroll
comment:4 Changed 11 years ago by jamoore
- Sprint changed from FS demo 4.4 to FS demo 4.x
Pushing to post Paris.
comment:5 Changed 11 years ago by mtbcarroll
- Sprint FS demo 4.x deleted
Josh: this looks relevant to code that you have worked on lately, any pointers?
comment:6 Changed 11 years ago by jamoore
Mark: if you're referring to the speed up of the import paths by reduction in transactions, then no real pointer other than to profile and make sure that's the actual cause. Most likely this requires some refactoring in order to get everything into a single call of Executor.execute which is what I did for import.
comment:7 Changed 10 years ago by mtbcarroll
How to determine that original files are deleted one per transaction, for plates or other cases of concern? It seems odd that this would be the case. Is the problem that a savepoint is created for each within the one transaction?
comment:8 Changed 10 years ago by jamoore
Certainly not a savepoint issue like graphs. More a RepositoryDao API issue.
comment:9 Changed 10 years ago by mtbcarroll
- Milestone changed from 5.x to 5.1.0
- Version set to 5.0.5
Promoted to 5.1.0 to match #10475.
(I do indeed notice with the graphs reimplementation that even after the graph traversal is all done, the actual deletion of the files is surprisingly slow.)
comment:10 Changed 10 years ago by mtbcarroll
- Version 5.0.5 deleted
No, I didn't set the Version, bah.
comment:11 Changed 10 years ago by mtbcarroll
I notice that SqlAction allows the _fs_deletelog table to be searched by any column other than params, yet the table is wholly unindexed.
comment:12 Changed 10 years ago by jamoore
Good catch. I'd propose I add these to my DB upgrade for m3.
comment:13 Changed 10 years ago by mtbcarroll
It could also be useful to allow DeleteLogMessage and DeleteLog to carry a set of IDs instead of just one.
comment:14 Changed 10 years ago by mtbcarroll
A look at the logs finds that during Delete2I deletion of a plate there is plenty of,
2014-11-24 14:07:14,885 INFO [ org.perf4j.TimingLogger] (2-thread-4) start[1416838034882] time[2] tag[omero.call.success.ome.services.blitz.repo.RepositoryDaoImpl$16.doWork] 2014-11-24 14:07:14,885 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Rslt: () 2014-11-24 14:07:14,885 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Executor.doWork -- ome.services.blitz.repo.RepositoryDaoImpl.findRepoDeleteLogs(DeleteLog<file=996repo=c9c14ba0-592a-4777-a0cf-6cf9e4215c58>) 2014-11-24 14:07:14,885 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Args: [null, InternalSF@1517296254] 2014-11-24 14:07:14,886 INFO [ ome.security.basic.EventHandler] (2-thread-4) Auth: user=0,group=0,event=null(Internal),sess=e8a4c338-c244-417f-b473-0b1c472c24b2 2014-11-24 14:07:14,888 INFO [ org.perf4j.TimingLogger] (2-thread-4) start[1416838034885] time[2] tag[omero.call.success.ome.services.blitz.repo.RepositoryDaoImpl$16.doWork] 2014-11-24 14:07:14,888 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Rslt: () 2014-11-24 14:07:14,888 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Executor.doWork -- ome.services.blitz.repo.RepositoryDaoImpl.findRepoDeleteLogs(DeleteLog<file=996repo=4fc7caa6-5ea5-4486-b86b-de704a6d106f>) 2014-11-24 14:07:14,888 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Args: [null, InternalSF@1517296254] 2014-11-24 14:07:14,889 INFO [ ome.security.basic.EventHandler] (2-thread-4) Auth: user=0,group=0,event=null(Internal),sess=e8a4c338-c244-417f-b473-0b1c472c24b2 2014-11-24 14:07:14,891 INFO [ org.perf4j.TimingLogger] (2-thread-4) start[1416838034888] time[2] tag[omero.call.success.ome.services.blitz.repo.RepositoryDaoImpl$16.doWork] 2014-11-24 14:07:14,891 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Rslt: (DeleteLog<event=19457file=996owner=239group=321path=user-2_239/2014-11/24/12-36-20.930/siRNA_PRIM1_03102008/001-365700055641/results/name=Well0001_mode1_z000_t000_detail_2008-09-18-10h48m54s.txtrepo=4fc7caa6-5ea5-4486-b86b-de704a6d106f>) 2014-11-24 14:07:14,891 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Executor.doWork -- ome.services.blitz.repo.RepositoryDaoImpl.deleteRepoDeleteLogs(DeleteLog<file=996repo=4fc7caa6-5ea5-4486-b86b-de704a6d106f>) 2014-11-24 14:07:14,891 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Args: [null, InternalSF@1517296254] 2014-11-24 14:07:14,903 INFO [ ome.security.basic.EventHandler] (2-thread-4) Auth: user=0,group=0,event=19585(Internal),sess=e8a4c338-c244-417f-b473-0b1c472c24b2 2014-11-24 14:07:14,907 INFO [ org.perf4j.TimingLogger] (2-thread-4) start[1416838034891] time[15] tag[omero.call.success.ome.services.blitz.repo.RepositoryDaoImpl$17.doWork] 2014-11-24 14:07:14,907 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Rslt: 1 2014-11-24 14:07:14,907 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Executor.doWork -- ome.services.blitz.repo.RepositoryDaoImpl.findRepoDeleteLogs(DeleteLog<file=996repo=ScriptRepo>) 2014-11-24 14:07:14,907 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Args: [null, InternalSF@1517296254] 2014-11-24 14:07:14,909 INFO [ ome.security.basic.EventHandler] (2-thread-4) Auth: user=0,group=0,event=null(Internal),sess=e8a4c338-c244-417f-b473-0b1c472c24b2 2014-11-24 14:07:14,911 INFO [ org.perf4j.TimingLogger] (2-thread-4) start[1416838034907] time[3] tag[omero.call.success.ome.services.blitz.repo.RepositoryDaoImpl$16.doWork] 2014-11-24 14:07:14,911 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Rslt: () 2014-11-24 14:07:14,911 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Executor.doWork -- ome.services.blitz.repo.RepositoryDaoImpl.findRepoDeleteLogs(DeleteLog<file=999repo=c9c14ba0-592a-4777-a0cf-6cf9e4215c58>) 2014-11-24 14:07:14,911 INFO [ ome.services.util.ServiceHandler] (2-thread-4) Args: [null, InternalSF@1517296254] 2014-11-24 14:07:14,913 INFO [ ome.security.basic.EventHandler] (2-thread-4) Auth: user=0,group=0,event=null(Internal),sess=e8a4c338-c244-417f-b473-0b1c472c24b2
comment:15 Changed 10 years ago by jamoore
- Milestone changed from 5.1.0 to 5.1.0-m3
- Owner changed from mtbcarroll to jamoore
I'll start looking at the db index side of this tomorrow.
comment:16 Changed 10 years ago by jamoore
The indexes don't seem to have a huge effect: https://github.com/openmicroscopy/openmicroscopy/pull/3216
After adding them, I started working on the number of transactions but haven't yet gotten to a point where I feel this is a MUST-INCLUDE.
comment:17 Changed 10 years ago by mtbcarroll
Thank you for the update; I guess maybe the table never gets appreciably large. I'll certainly be interested to learn if the transactions do make more of a difference, or if you gain clues as to the slowness' principal causes.
comment:18 Changed 10 years ago by mtbcarroll
I added an index to originalfile.path in case that speeds up _fs_dir_delete() usefully.
comment:19 Changed 10 years ago by jamoore
- Owner changed from jamoore to mtbcarroll
comment:20 Changed 10 years ago by mtbcarroll
Josh: Where's your branch and what more needs done?
comment:21 Changed 10 years ago by jamoore
I'll assume you're again not getting trello notifications :) -- https://github.com/joshmoore/openmicroscopy/tree/fs-delete-logs
comment:22 Changed 10 years ago by mtbcarroll
- Status changed from new to accepted
comment:23 Changed 10 years ago by mtbcarroll
- Resolution set to fixed
- Status changed from accepted to closed
comment:24 Changed 10 years ago by jmoore <josh@…>
- Remaining Time set to 0
(In [3682570ced474c64bd6597faad46101e5f9c7ebc/ome.git] on branch develop) Reduce transactions for file deletion (Fix #11084)
Looping through large numbers of files took significant
time due to the number of transactions involved. This
now wraps all the DeleteLogMessages? into a single
DeleteLogsMessage? which can be handled in bulk.
comment:25 Changed 10 years ago by Josh Moore <josh@…>
(In [568a8f76fd1e902841f15224b65b89ce88f9b062/ome.git] on branch develop) Merge pull request #3399 from mtbc/fs-delete-logs
fix #11084: combine original file deletions into fewer transactions
Moved from sprint FS Demo 4.3