Task #11728 (new)
Opened 6 years ago
Last modified 3 years ago
Add hash calculation to HDF tables
| Reported by: | spli | Owned by: | spli |
|---|---|---|---|
| Priority: | minor | Milestone: | Metadata |
| Component: | Services | Version: | 5.0.0-beta1 |
| Keywords: | n.a. | Cc: | analysis@…, mtbcarroll, cxallan |
| Resources: | n.a. | Referenced By: | n.a. |
| References: | n.a. | Remaining Time: | n.a. |
| Sprint: | n.a. | | |
Description
Checksums aren't calculated for the HDF5 OriginalFile backing an OMERO.table; see #11697. Should they be, bearing in mind the potential slowdown when dealing with large tables?
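As an illustration of where the slowdown comes from (this is a standalone sketch, not OMERO code; the function name and chunk size are my own choices): hashing a multi-gigabyte HDF5 file means streaming every byte through the digest, so cost grows linearly with table size even when the read is done in memory-friendly chunks.

```python
import hashlib

def file_hash(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a (potentially large) file in fixed-size chunks so the
    whole table never has to be held in memory. Runtime is still
    proportional to file size, which is the concern for big tables."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```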
Change History (9)
comment:1 Changed 6 years ago by jamoore
comment:2 Changed 6 years ago by jburel
- Milestone changed from 5.0.0-beta2 to 5.0.0-beta3
comment:3 Changed 5 years ago by mtbcarroll
A file-size hash is trivial and fast, and better than none at all. I can see that a "proper" hash may not be desirable for large files that the user changes frequently. (I think the raw file store's save() method automatically re-hashes based on the algorithm set for the original file.)
One idea would be for changes to large files to null the hash, and for #10765 to calculate a new one later, without delaying the client.
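To make the trade-off concrete (a minimal sketch, assuming a size-based scheme along the lines of OMERO's File-Size-64 checksum algorithm; the function name and encoding are illustrative): a size "hash" costs one stat call regardless of file size, at the price of colliding for any two files of equal length.

```python
import os
import struct

def file_size_hash(path):
    """Size-based pseudo-hash: encode the file length as a 64-bit
    big-endian hex string. O(1) to compute (a single stat), but any
    two files of the same length share the same value."""
    size = os.path.getsize(path)
    return struct.pack(">q", size).hex()
```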
comment:4 Changed 5 years ago by jamoore
Note that with https://github.com/openmicroscopy/openmicroscopy/pull/3335/files#diff-9761312afcf670830f84842fc3c69a89R196 there's a better chance that the hashing will take place, assuming some hasher has been set.
comment:5 Changed 4 years ago by jamoore
- Milestone changed from 5.1.4 to OMERO-5.1.4
Splitting 5.1.4 due to milestone decoupling
comment:6 Changed 4 years ago by jburel
- Milestone changed from OMERO-5.1.4 to OMERO-5.2.0
comment:7 Changed 4 years ago by jburel
- Milestone changed from OMERO-5.2.1 to OMERO-5.2.2
Milestone OMERO-5.2.1 deleted
comment:8 Changed 4 years ago by jburel
- Milestone changed from OMERO-5.2.2 to OMERO-5.2.1
Milestone OMERO-5.2.2 deleted
comment:9 Changed 3 years ago by jburel
- Milestone changed from OMERO-5.2.2 to Metadata
My vote is that everything should be hashed, but perhaps this is a candidate for one of the faster methods?
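For a sense of what a "faster method" could look like (an illustrative sketch, not a statement of what OMERO uses here; the function name and chunk size are mine): non-cryptographic checksums such as CRC-32 stream through the file the same way as SHA-1 but with much cheaper per-byte work, trading collision resistance for speed.

```python
import zlib

def fast_checksum(path, chunk_size=1 << 20):
    """Streamed CRC-32 over the file. Non-cryptographic, so it only
    guards against accidental corruption, but it is considerably
    cheaper per byte than SHA-1 or MD5."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08x")
```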