Task #11878 (accepted)
Opened 10 years ago
Last modified 8 years ago
XSD Modulo Schema
Reported by: | ajpatterson | Owned by: | rleigh |
---|---|---|---|
Priority: | major | Milestone: | Unscheduled |
Component: | Specification | Version: | n.a. |
Keywords: | schema | Cc: | jburel, rleigh, jamoore |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
Write a Modulo.xsd file that can be used to validate modulo annotations in OME-XML data.
Attachments (1)
Change History (15)
comment:1 Changed 10 years ago by ajpatterson
- Status changed from new to accepted
comment:2 Changed 10 years ago by ajpatterson
- Cc jburel rleigh jamoore added
comment:3 Changed 10 years ago by rleigh
comment:4 Changed 10 years ago by rleigh
It also occurs to me that adding modulo X and Y would be useful:
dimensionOrder="XxYy" dimensionSizes="4,512,2,512" planeDimensions="xy"
describes an image of 4×2 tiles of size 512×512 pixels. I.e. the major X and Y dimensions are the tile numbers and minor are the pixel counts within each tile. If the dimensions of individual image planes could be specified, then setting the plane dimensions to "xy" rather than the "XY" default we currently have would give us support for (contiguous) tiling in X and Y. Given a per-plane physical position it can even describe overlapping tiles.
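To make the tiling idea concrete, here is a minimal sketch of how a consumer might resolve a global pixel coordinate into a tile number and an offset within that tile. The helper names (`parse_dims`, `locate`) are hypothetical and not part of any schema or library, and the sketch assumes contiguous, non-overlapping tiles.

```python
def parse_dims(order, sizes):
    """Pair each dimension letter with its extent (comma- or space-separated)."""
    extents = [int(s) for s in sizes.replace(",", " ").split()]
    if len(order) != len(extents):
        raise ValueError("dimensionOrder and dimensionSizes differ in length")
    return dict(zip(order, extents))


def locate(dims, gx, gy):
    """Map a global pixel coordinate to ((tile_x, tile_y), (x, y)).

    Assumes uppercase X/Y count tiles and lowercase x/y count pixels per tile.
    """
    width = dims["X"] * dims["x"]
    height = dims["Y"] * dims["y"]
    if not (0 <= gx < width and 0 <= gy < height):
        raise IndexError("pixel outside the tiled image")
    tile_x, x = divmod(gx, dims["x"])
    tile_y, y = divmod(gy, dims["y"])
    return (tile_x, tile_y), (x, y)


dims = parse_dims("XxYy", "4,512,2,512")
full_size = (dims["X"] * dims["x"], dims["Y"] * dims["y"])  # (2048, 1024)
tile, offset = locate(dims, 600, 700)  # tile (1, 1), offset (88, 188)
```

Overlapping tiles would need the per-plane physical position as well, which this sketch leaves out.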
comment:5 Changed 10 years ago by ajpatterson
Minor point: XML lists are space-separated, not comma-separated.
comment:6 Changed 10 years ago by rleigh
OK, thanks! The separator isn't too important, just the concept.
Other parts which could be generalised:
PhysicalSizeA, TimeIncrement
These are the "real" sizes of various dimensions. Could be generalised to
PhysicalSize="18.0,18.0,2.4,514" PhysicalUnits="µm,µm,µm,ms"
Though since it encompasses all dimensions, "Physical" may need rewording.
PositionA
Could likewise be generalised to
Positions="23.43,221.4,2"
again with the caveat that it might need a better name.
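A rough sketch of how such aggregated attributes could be read back into per-dimension values. The dimension letters passed in ("XYZT" and "XYZ") are an assumption for illustration, since the examples above do not say which dimensions the list entries belong to, and the function names are hypothetical.

```python
def split_list(value):
    """Accept either comma- or space-separated list values."""
    return value.replace(",", " ").split()


def physical_sizes(dimensions, sizes, units):
    """Return {dimension: (size, unit)} from two parallel list attributes."""
    size_items = split_list(sizes)
    unit_items = split_list(units)
    if not (len(dimensions) == len(size_items) == len(unit_items)):
        raise ValueError("dimension, size and unit lists differ in length")
    return {d: (float(s), u) for d, s, u in zip(dimensions, size_items, unit_items)}


def positions(dimensions, values):
    """Return {dimension: position} from a single list attribute."""
    items = split_list(values)
    if len(dimensions) != len(items):
        raise ValueError("dimension and position lists differ in length")
    return {d: float(v) for d, v in zip(dimensions, items)}


sizes = physical_sizes("XYZT", "18.0,18.0,2.4,514", "µm,µm,µm,ms")
where = positions("XYZ", "23.43,221.4,2")
```

With this shape, the time increment is simply the "T" entry of the same mapping rather than a separate attribute.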
I think that covers all the essential bits of the model which are restricted to 5D.
comment:7 Changed 10 years ago by rleigh
I have attached a patch showing the type of schema change I'd like to make. Note that it does not include the stage label, but we might want to include that since it could carry information about stage angle.
comment:8 Changed 10 years ago by rleigh
Further thought:
If we were to add an additional "subchannel" dimension (e.g. "s"), this could also replace the "Interleaved" attribute and allow direct addressing of RGB, RGBA, BGR, BGRA etc. in chunky and planar formats without any special-case logic, since we can fully describe the pixel layout using the subchannel size. This would formalise the existing use of logical subchannels for RGB data.
dimensionOrder="XYZTCs" // chunky
dimensionExtents="512 512 14 5 1 4" 512x512x14zx5t RGBA chunky
subchannelOrder="RGBA"

dimensionOrder="XYZTsC" // planar
dimensionExtents="512 512 14 5 3 1" 512x512x14zx5t BGR planar
subchannelOrder="BGR"
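One consistency rule falls out of these examples: the extent of the "s" dimension should equal the number of letters in subchannelOrder. A minimal sketch of that check follows; `check_subchannels` is a hypothetical helper, and treating a missing "s" dimension as extent 1 is an assumption.

```python
def check_subchannels(dimension_order, dimension_extents, subchannel_order):
    """Return {dimension: extent} after checking the subchannel count."""
    extents = [int(e) for e in dimension_extents.split()]
    if len(extents) != len(dimension_order):
        raise ValueError("dimensionOrder and dimensionExtents differ in length")
    sizes = dict(zip(dimension_order, extents))
    # Assumption: an absent "s" dimension behaves like a single subchannel.
    if sizes.get("s", 1) != len(subchannel_order):
        raise ValueError("subchannelOrder does not match the extent of 's'")
    return sizes


chunky = check_subchannels("XYZTCs", "512 512 14 5 1 4", "RGBA")
planar = check_subchannels("XYZTsC", "512 512 14 5 3 1", "BGR")
```

The sketch deliberately says nothing about memory strides; it only confirms that the three attributes agree with each other.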
comment:9 Changed 10 years ago by jamoore
This is getting us into NDIM, no? Where's the line between making an XSD for what we have, and where we're taking this? Eventually, these changes would be in the main XSD and we wouldn't have a separate modulo schema, right?
comment:10 Changed 10 years ago by rleigh
Yes, this is getting towards NDIM. By adding these extra dimensions, it's no longer really "modulo"; these are real dimensions completely independent of Z/T/C, though they would still need an annotation to fully describe themselves. So I guess we might want to hold off on doing that.
However, we could make these changes independently of modulo, retaining the XYZTC dimensionOrder restriction, but otherwise generalising the model in the manner described in the attached diff. This would make the model sufficiently abstract to support NDIM, while keeping it restricted to 5D until we were ready to relax the constraint and allow >5D. It would remove the hardcoded 5D assumptions in the model, readying it for full nD support at a future date. Note the minor impact upon units support. We would probably want to have a standard annotation (not named Modulo, but retaining the same basic features) which describes an arbitrary dimension.
comment:11 Changed 9 years ago by ajpatterson
- Owner changed from ajpatterson to rleigh
comment:12 Changed 8 years ago by jamoore
Referencing ticket #7855 has changed sprint.
comment:13 Changed 8 years ago by jamoore
Referencing ticket #7855 has changed sprint.
comment:14 Changed 8 years ago by jamoore
- Milestone changed from 5.x to Unscheduled
One important consideration for multi-dimensional data is the dimension ordering. We currently leave the modulo ordering unspecified (well, it's restricted to the parent dimension), but file formats may have different physical layouts for these extra dimensions, which will have significant performance implications if you process the data in a suboptimal order.
I would like to suggest that the existing "XYZCT" order needs extending to support "XYZzCcTt" where the lowercased variants are the modulo dimensions. Changing dimensionOrder from an enum to a string would permit this.
There is also a case for including the sizes for *all* dimensions, including the modulo dimensions, in the core model. Currently we have "SizeA" attributes for the fixed five dimensions. Maybe we really want a mapping of dimension=>size to make this flexible. This could be done by having dimensionSizes as a comma-separated list of unsigned integers to keep things in a single attribute. This would also mean that the "modulo" dimensions can be real dimensions and ZTC can retain their true values rather than being multiplied by the modulo size.
dimensionOrder="XYZCTtzc"
dimensionSizes="512,512,12,4,43,16,1,1"
Note that the two unused modulo dimensions are listed last and set to 1. They could also be omitted entirely and defaulted to 1. The same could be done for any dimension, which would make the simplest case:
dimensionOrder="XY"
dimensionSizes="128,128"
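As a sketch of how a reader might consume these two attributes: omitted dimensions default to 1, and the sizes the current model would report (Z, C and T multiplied by their modulo extents) can be derived when needed. The function names and the choice of "XYZCTzct" as the known dimension set are assumptions for illustration.

```python
# Assumed pairing of each modulo dimension with its parent dimension.
MODULO_PARENTS = {"z": "Z", "c": "C", "t": "T"}


def dimension_sizes(order, sizes, known="XYZCTzct"):
    """Return {dimension: extent}; dimensions not listed default to 1."""
    extents = [int(s) for s in sizes.replace(",", " ").split()]
    if len(extents) != len(order):
        raise ValueError("dimensionOrder and dimensionSizes differ in length")
    result = {d: 1 for d in known}
    result.update(zip(order, extents))
    return result


def folded_sizes(dims):
    """Sizes as the current model stores them: parent extent times modulo extent."""
    return {parent: dims[parent] * dims[child]
            for child, parent in MODULO_PARENTS.items()}


full = dimension_sizes("XYZCTtzc", "512,512,12,4,43,16,1,1")
simple = dimension_sizes("XY", "128,128")
legacy = folded_sizes(full)  # {"Z": 12, "C": 4, "T": 688}
```

Here T keeps its true value of 43 in `full`, while `legacy["T"]` is 43 × 16 = 688, the value that would be stored today.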
The above model change would permit efficient representation of modulo dimensionality in the core model. But the dimensionOrder/Sizes attributes also generalise the fixed set of dimension attributes used at present, which would permit future use of higher dimensions with no model changes required:
dimensionOrder="XYZCS", where "S" is a site
dimensionOrder="XYUV", where U and V are tileX and tileY numbers
dimensionOrder="XYCWS", where W and S are well and site numbers
This would require corresponding changes to the "FirstA" and "SizeA" attributes to also aggregate them in a single attribute which can handle arbitrary dimension ordering. These would need to exclude the XY dimensions. Bio-Formats and other consumers would also need to be able to cope with this.
Currently X and Y are assumed to be the first two dimensions, and it's also implicitly assumed that each plane is only X and Y. If we wanted to be more flexible here, we could allow the model to specify which dimensions are represented in the planar data e.g. XY, XZ, or even Xt for line scan TCSPC data for example.
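A small sketch of what a configurable plane specification would mean for a consumer: the plane shape comes from the named dimensions, and the plane count is the product of all the others. `plane_layout` is a hypothetical helper, and the `plane_dimensions` argument stands in for the planeDimensions attribute suggested earlier in this ticket.

```python
from math import prod


def plane_layout(order, sizes, plane_dimensions="XY"):
    """Return (plane_shape, plane_count) for the given plane dimensions."""
    extents = [int(s) for s in sizes.replace(",", " ").split()]
    if len(extents) != len(order):
        raise ValueError("dimensionOrder and dimensionSizes differ in length")
    dims = dict(zip(order, extents))
    missing = [d for d in plane_dimensions if d not in dims]
    if missing:
        raise ValueError(f"plane dimensions not in dimensionOrder: {missing}")
    shape = tuple(dims[d] for d in plane_dimensions)
    # Every dimension that is not part of the plane contributes to the count.
    count = prod(n for d, n in dims.items() if d not in plane_dimensions)
    return shape, count


xy = plane_layout("XYZCTtzc", "512,512,12,4,43,16,1,1")
xz = plane_layout("XYZCTtzc", "512,512,12,4,43,16,1,1", plane_dimensions="XZ")
```

With the default "XY" this gives 512×512 planes and 12 × 4 × 43 × 16 = 33024 of them; switching to "XZ" changes both the shape and the count without touching the stored sizes.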
Summary: Replacing the fixed set of dimension-specific attributes with single dimension-independent lists will make supporting modulo much more flexible and transparent both in the model and in the implementations. It would also lay the groundwork for true nD support at a later stage.