
Changes between Version 1 and Version 3 of Ticket #6320


Timestamp:
08/02/11 10:06:34 (13 years ago)
Author:
jmoore
Comment:

Merging description from #6330

  • Ticket #6320

    • Property Summary changed from "HIC: Export dataset to project" to "HIC: Dataset silo layer"
  • Ticket #6320 – Description

    v1 v3  
     1So far we have focused on the project silo model, the data for which were manually anonymised, extracted and transferred from the MSSQL servers on the NHS network at HIC to the HIC/OMERO server on the UNI network. 
     2 
     3It will be useful for other stories if we develop the dataset silo model as a parent layer of the project silo within this pilot. 
     4 
     5The dataset silo is a complete anonymised mirror of the NHS datasets held on the MSSQL servers at HIC. The project silos are then prepared on a project-by-project basis from the dataset silo, all within the HIC/OMERO architecture. This will include anonymisation, data cleaning and modelling steps. 
     6 
     7Josh has developed a diagram ([https://www.openmicroscopy.org/site/community/minutes/minigroup/files/omero-hic-2011-jul-22.pdf/view pdf]) outlining the proposed structure and the relationship with existing tickets. 
     8 
     9Each of the clinical data files we've provided from the GoDARTS project represents a dataset. For instance, the separate files for SMR, RX (prescribing), BIOCHEM, etc. are all separate datasets. I think the only way we can make this work within the pilot project governance is to use the GoDARTS data as the real data and create a mass of fake data based on the schemas. These would then form the dataset silo, from which we can rebuild the project silo and test that users only get to work with the project data they are supposed to. 
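
As a rough illustration of "fake data based on the schemas", the sketch below generates synthetic rows from a schema description. Everything in it is an assumption made for the example: the `SCHEMA` dictionary, its column names and the three column kinds are hypothetical and are not taken from the real GoDARTS files.

```python
import random
import string

# Hypothetical schema for one dataset. The column names and kinds are
# illustrative only; they do not come from the real GoDARTS files.
SCHEMA = {
    "patient_id": "id",
    "test_code": "code",
    "result": "float",
}


def fake_row(schema, rng):
    """Build one synthetic row whose columns follow the given schema."""
    row = {}
    for column, kind in schema.items():
        if kind == "id":
            # Random 10-digit number, unrelated to any real identifier.
            row[column] = rng.randrange(10**9, 10**10)
        elif kind == "code":
            row[column] = "".join(
                rng.choice(string.ascii_uppercase) for _ in range(4)
            )
        elif kind == "float":
            row[column] = round(rng.uniform(0.0, 100.0), 2)
        else:
            raise ValueError(f"unknown column kind: {kind}")
    return row


def fake_dataset(schema, n_rows, seed=0):
    """Return n_rows synthetic rows; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [fake_row(schema, rng) for _ in range(n_rows)]
```

Calling `fake_dataset(SCHEMA, 1000)` would give a repeatable block of fake rows that could be mixed in alongside the real files when populating a test dataset silo.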
     10 
     11There are various reasons why this is useful: simplified or automated data flows, governance and audit, and end-user aggregate queries (e.g. study estimates). Of particular interest are governance and the following two use cases: 
     12 
     131. The audit trail should allow dataset inspection, e.g. which projects have used SMR. In theory a data owner (the custodians, patients or Caldicott guardians) may want to see how safe our model is or who is using 'their' data. 
     14 
     152. The embedding of risk prediction models, privacy impact assessments and disclosure controls. These mechanisms are being discussed in the SHIP blueprint as ways to facilitate risk-based and proportionate data governance. 
     16 
     17The following points are from the Jan 2011 draft of the blueprint. Once the final version is available I'll update this story, although this may need pulling out into a separate story. 
     18 
     19- Assessing privacy risks is an integral component of a data controller’s responsibilities and should form a central part of their privacy policy. This process should include the identification of confidentiality, security and privacy risks of any data handling including linkages, storage and access considerations. The Information Commissioner's Office have developed a privacy impact assessment handbook [http://bit.ly/A2cga] containing guidance for carrying out risk assessments. 
     20 
     21- Appropriate disclosure control should be applied to all outputs; this should be carried out under the authority and oversight of the designated privacy officer. The Information Services Division (ISD) of NHS Scotland have developed various documents on data protection and confidentiality [http://bit.ly/nPOMZY], and of particular importance is their protocol for disclosure control. 
     22 
     23=== Implementation === 
    124A dataset silo in OMERO should be exportable to a re-anonymized project silo. If the work for #4652 (server-side API) is completed, then this work can be implemented as subdirectories in the silo fs repository. Along with the links between input tables and output tables, metadata about any operations (see #6321) that were performed on the data should also be recorded. 
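
The export described above could look roughly like the following sketch. It is not the OMERO API or the #4652 design: the directory layout (`datasets/` and `projects/<name>/`), the `provenance.json` log, the CSV file format and all function names are assumptions made for illustration. It copies one dataset table into a project subdirectory, replaces the identifier column with a project-specific pseudonym, and records the input/output link plus the operation performed, which is enough to answer the audit question "which projects have used SMR?".

```python
import csv
import hashlib
import json
from pathlib import Path


def reanonymise(value, project_salt):
    """Derive a project-specific pseudonym from an identifier (illustrative only)."""
    digest = hashlib.sha256(f"{project_salt}:{value}".encode("utf-8")).hexdigest()
    return digest[:16]


def export_dataset(silo_root, dataset, project, id_column, project_salt):
    """Copy datasets/<dataset>.csv into projects/<project>/ with new pseudonyms."""
    silo_root = Path(silo_root)
    source = silo_root / "datasets" / f"{dataset}.csv"
    target_dir = silo_root / "projects" / project
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{dataset}.csv"

    with source.open(newline="") as src, target.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row[id_column] = reanonymise(row[id_column], project_salt)
            writer.writerow(row)

    # Record the input/output link and the operation in the project's log.
    log = target_dir / "provenance.json"
    entries = json.loads(log.read_text()) if log.exists() else []
    entries.append({
        "input": source.relative_to(silo_root).as_posix(),
        "output": target.relative_to(silo_root).as_posix(),
        "operations": [f"reanonymise:{id_column}"],
    })
    log.write_text(json.dumps(entries, indent=2))
    return target


def projects_using(silo_root, dataset):
    """Audit query: names of the projects whose log lists the dataset as an input."""
    wanted = f"datasets/{dataset}.csv"
    found = set()
    for log in (Path(silo_root) / "projects").glob("*/provenance.json"):
        if any(entry["input"] == wanted for entry in json.loads(log.read_text())):
            found.add(log.parent.name)
    return sorted(found)
```

A salted hash is used here only to keep the example short; a real disclosure-control step would need to follow the governance protocols referenced in the description, and the salt would have to be kept secret per project.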
     25 
