Project discussion: Would a systematic approach to data risk classification be helpful?
Chair: Will Crocombe (RISG Consulting)
Prompts
Risk - how much and what sort? Personal/sensitive, commercial, political, IP…
Why classify risk and how might it help?
What type and level of controls might be practicable and proportionate?
Notes
Could there be a common language around risk?
Classify based on the ease of identifiability, plus ‘payload’ - what would we know about a person once they are identified?
Proportionate controls - Tiered. Gatekeepers and access points (where). E.g.
0 - public
1 - anonymised
2 - strong pseudo
3 - weak pseudo
4 - public
Dropping down tiers, things become easier. Turing paper on this - Sheffield used this as the basis of their system for assessing risk.
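A rough sketch of how a tier could be derived from identifiability plus payload, as discussed above. The scoring rule, the `Payload` levels and the function name are all assumptions made for illustration; they are not taken from the Turing paper or from any institution's actual system.

```python
from enum import IntEnum


class Identifiability(IntEnum):
    """How easily a record could be linked back to a person (labels follow the notes)."""
    PUBLIC = 0
    ANONYMISED = 1
    STRONG_PSEUDO = 2
    WEAK_PSEUDO = 3


class Payload(IntEnum):
    """Hypothetical scale: how much would be learned about someone if identified."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


MAX_TIER = 4


def classify_tier(identifiability: Identifiability, payload: Payload) -> int:
    """Assumed rule: start at the identifiability level and move up one
    tier when the payload is HIGH, never exceeding the top tier."""
    bump = 1 if payload is Payload.HIGH else 0
    return min(int(identifiability) + bump, MAX_TIER)


if __name__ == "__main__":
    print(classify_tier(Identifiability.ANONYMISED, Payload.LOW))
    print(classify_tier(Identifiability.WEAK_PSEUDO, Payload.HIGH))
```

A federation would need to agree the actual rule and the risk appetite behind it; the point here is only that once agreed, the classification can be written down and applied consistently.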
Importance of agreed risk classification with federation, and agreement on risk appetite
King’s is doing this work, using a classification similar to the Turing paper
Dundee operate on a blanket tier
My question was going to be around risk classification: based on my understanding of Goldacre, pseudonymisation should not be relied on. I agree researchers should only be presented with the data required for their project, but the risk of de-anonymisation, particularly when combining datasets, means this should be treated cautiously at best.
Automation - reduces risk of error