Task #10304 (new)
Opened 11 years ago
Last modified 8 years ago
Bug: Original metadata is unquoted and not safe to parse
Reported by: | rleigh | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | Metadata |
Component: | OmeroPy | Version: | 4.4.9 |
Keywords: | n.a. | Cc: | jburel, florian.leitner@…, java@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
Affects 4.4+
Original metadata can potentially contain both = symbols (in both key and value) and can also contain newlines. Currently, when retrieving the original metadata, for example with:
<XMLAnnotation ID="Annotation:65"> <Value> <OriginalMetadata xmlns="openmicroscopy.org/OriginalMetadata"> <Key>test User-Comment[0]</Key> <Value>Test test test test Entry entry entry Test test test test</Value> </OriginalMetadata> </Value> </XMLAnnotation>
this would result in
test User-Comment[0]=Test test test test Entry entry entry Test test test test
With the current server and gateway code, there is no quoting, so the second two lines will not be associated with the first.
It would be better if the data provided by the server was properly quoted so that the data is unabiguous and parsing can be robust. This could include:
- surrounding key and value with quotes (")
- allowing quotes inside keys and values with escaping (\")
- allowing escaping of escaping (
)
The server can encode keys and values using this simple scheme. And the client/gateway code can decode it before presenting it to the user. Keys and values are between unquoted quotes. And the escaping can be stripped from the separated keys and values.
Regards,
Roger
Change History (16)
comment:1 Changed 11 years ago by rleigh
comment:2 Changed 10 years ago by jamoore
- Cc jburel florian.leitner@… java@… added; jburel florian.leitner@… removed
- Milestone changed from Unscheduled to 5.0.0-beta2
- Version set to 4.4.9
Mark: could you look at this in slot it into other remaining "original metadata" work. This may not longer be an issue with OMERO5 in which case we'll need to decide whether or not we will backport any fixes to OMERO4.
comment:3 Changed 10 years ago by mtbcarroll
Another difficult example is the value for Recording #1 Notes from zeiss-lsm/sample files.mdb/sample files.mdb [XY-ch].
Does the OMERO 5 server still issue these text-format key-value lists (beyond the original_metadata.txt generated in a pre-upgrade import)?
comment:5 Changed 10 years ago by mlinkert
Easiest is probably to add to this:
to remove and/or escape '=', '\n', '\r', ', and?'. Then everything received by the server should be valid and parseable. Are there any other characters that you all have encountered that have caused a problem?
comment:6 Changed 10 years ago by mlinkert
Roger or Mark, did either of you have a test case that I can use to make sure I'm on the right track here (e.g. import a specific file and click a specific button in Insight)?
comment:7 Changed 10 years ago by rleigh
Hmm, looking back the problem was originally identified when adding support for LIF User-Comment fields, so it's quite possible that the LIF file(s) containing User-Comment may produce "unsafe" original metadata. But I can't recall offhand which LIF files contained the test comments. However, it may well be sufficient just to add the example annotation above to an OME-TIFF and import that?
comment:8 Changed 10 years ago by mtbcarroll
Melissa, data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb has some alarming stuff for the value of Recording #1 Notes key of sample files.mdb [XY-ch]. Also, I believe that data_repo/test_images_good/dv/mt1_R3D_D3D.dv has a key named Min, Max, Mean (w=617.0 nm). (So, both those keys have a "special" character.)
comment:9 Changed 10 years ago by mlinkert
- Owner mlinkert deleted
https://github.com/openmicroscopy/bioformats/pull/821 opened to fix this going forward. Any data already imported will still have this problem, though - that will likely need to be sorted out in OMERO.
comment:10 Changed 10 years ago by jamoore
- Priority changed from major to critical
Leaving the PR 821 part of this ticket for beta2 and pushing the rest of the work to beta3.
comment:11 Changed 10 years ago by jamoore
- Milestone changed from 5.0.0-beta2 to 5.0.0-beta3
(forgot to change milestone)
comment:12 Changed 9 years ago by jamoore
- Milestone changed from 5.1.0 to 5.1.1
comment:13 Changed 9 years ago by jburel
- Milestone changed from 5.1.1 to 5.1.3
Moving to 5.1.3 when we work on import and annotation support
comment:14 Changed 9 years ago by jamoore
- Milestone changed from 5.1.4 to OMERO-5.1.4
Splitting 5.1.4 due to milestone decoupling
comment:15 Changed 9 years ago by sbesson
- Milestone changed from OMERO-5.1.4 to 5.x
As per discussion today, pushing to 5.x
comment:16 Changed 8 years ago by jamoore
- Milestone changed from 5.x to Metadata
The escaping of escaping which trac mangled was just two backslashes.