Requirement #2128 (new)
Opened 14 years ago
Last modified 11 years ago
FS Data de-duplication
Reported by: | jamoore | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | Unscheduled |
Component: | OmeroFs | Keywords: | paris2011 |
Cc: | cxallan, jburel, cblackburn, drussell-x | Business Value: | n.a. |
Total Story Points: | n.a. | Roif: | n.a. |
Mandatory Story Points: | n.a. |
Description (last modified by jmoore)
Goal
A major barrier to adoption by many sites is the fact that OMERO (4.2 and earlier, as well as much of 4.3) stores binary pixel data in an optimized internal format. This duplication of the data (the original file and then the byte array under /OMERO/Pixels), along with possible duplicated backups of that data, can overwhelm available storage. The goal of this work is to phase out the use of the internal data format in favor of accessing binary data directly from the files via BioFormats. Data previously converted into the internal format will remain in that format if the related original files were not archived.
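The rule in the last sentence of the goal can be sketched as below. This is only an illustration under stated assumptions: `ImageRecord`, its fields, and `pixel_source` are hypothetical names invented for this sketch and are not part of the OMERO model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageRecord:
    """Hypothetical record; the field names are illustrative, not OMERO's model."""
    original_file_path: Optional[str]  # None if the original file was not archived
    has_internal_pixels: bool          # True if a byte array exists under /OMERO/Pixels


def pixel_source(image: ImageRecord) -> str:
    """Pick where binary pixel data would be read from (assumed policy)."""
    if image.original_file_path is not None:
        # Archived original available: read it directly, no internal copy needed.
        return "original-file"
    if image.has_internal_pixels:
        # Legacy data whose original was never archived stays in the internal format.
        return "internal-format"
    raise ValueError("no pixel data available for this image")
```

Under this assumed policy, only images that fall into the second branch keep occupying space under /OMERO/Pixels.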
Phase 0
Phase 0 is a carry-over from the big images work (#1950), in which some base work for FS ("FS-lite") was done. Initially, only single-file PFFs were supported. By adding support for multi-file PFFs, we can test the overall performance of our FS strategy and let users start archiving rather than parsing their data.
- #909 OriginalFiles need an organizational structure
- #5640 FS lite for multi-file PFFs
- Client modifications...
Phase 1
The initial phase will focus on giving users a view of their files as they sit on the filesystem. A prototype FS viewer is already in place in Insight (URL/SCREENSHOT HERE), and as of 4.3, the server is already directly accessing some files, like very wide & tall SVS files. (In some tickets this is referred to as "fs-lite".)
Stories include:
- #909 OriginalFiles need an organizational structure
- #978 CMS like view of files
- #2728 Accessing binary data
- #3162 Handle read-only directories mounted in FS
- #2307 Standardize repository paths on unix path-style
- #5305 Handle integration with legacy repository
- #2595 FS cleanup tasks
- #870 Original file service improvements
- #5158 FS/PixelData follow-on
- #6220 JPEG 2000 follow-on
Phase 2
The second phase, which may accompany the initial release, will focus on stability & accuracy, as well as any features which users request during phase 1 testing.
Stories include:
- #3509 Drivespace improvement
- #1740 Support for multiple FS File Server
- #988 File corruptions/immutability guarantees under OmeroFs
- #4033 Have one user import data for another.
Phase 3
There are several extended features that have been requested but which are not critical to the central goal of "data de-duplication". As time permits (and according to user-based priority), these will be added.
Stories include:
- #2850 User management of repositories
- #914 Support archiving to tape via OMERO.fs
- #836 Quotas for user
Phase N : Notification
An advanced feature of FS is integration with native FS notifications. These notifications are the basis for the DropBox feature, which has been the sole FS feature to date. If notifications are integrated into FS proper, then the DropBox feature may be phased out.
Stories include:
- #4032 Allow Project/Datasets for DropBox?
- #1445 Remote DropBox
- #1230 RepositoryInfo rework - Interface may be removed
- #1433 FS should provide a method getSupportEventTypes()
- #1571 FS should obtain parameters without duplication
- #5364 DropBox cleanup
Related mail threads
Change History (11)
comment:1 Changed 13 years ago by jmoore
- Priority changed from minor to critical
comment:2 Changed 13 years ago by cblackburn
- Cc cxallan jburel cblackburn added
comment:3 Changed 13 years ago by jmoore
- Description modified (diff)
comment:4 Changed 13 years ago by jmoore
- Component set to OmeroFs
- Description modified (diff)
- Summary changed from FS Improvements to FS Data de-duplication
comment:5 Changed 13 years ago by jmoore
- Description modified (diff)
comment:6 Changed 13 years ago by jmoore
- Description modified (diff)
comment:7 Changed 12 years ago by jmoore
- Description modified (diff)
- Keywords paris2011 added
comment:8 Changed 12 years ago by cxallan
- Description modified (diff)
comment:9 Changed 12 years ago by jmoore
- Description modified (diff)
comment:10 Changed 11 years ago by drussell-x
- Cc drussell-x added
comment:11 Changed 11 years ago by jmoore
- Description modified (diff)
Adding phase 0 with #5640, fs-lite work, as discussed in Paris during the users' meeting. Need tickets with client modifications for #5640.