UK TRE Annual Conference - September 2024#
Accelerating secure digital science through community collaboration#
- Date:
- Time: 09:30 - 17:00
- Registration:
- Location: In person at RSECon24 and online
Background#
Trusted research environments and datasets are fundamental to digital science, but work is often delayed by the time and effort required to set up a secure project, get access to the right approved data, and sign legal agreements.
There is enormous value locked up because of this: new research ideas are blocked by the hurdles of getting started and accessing data, and typical fixed term project funding means significant research time is lost.
We wanted to discuss what is stopping us from fixing these three critical issues: getting data federated to approved projects; instant provisioning of infrastructure for collaborative projects; and streamlining legal discussions with standard accepted T&Cs and operating models.
The UK TRE community has already set up several working groups to look at some of these problems, and had planned to submit a UKRI Digital Research Technical Professional Skills NetworkPlus grant to create a long-term footing for the UK TRE community, and to support the community in joining or creating working groups to solve the critical issues which are hampering secure digital research.
Agenda#
10:15 - 10:30 | Welcome and introductions
10:30 - 11:00 | Community Management updates and announcements: Bid for funding
11:00 - 12:15 | Federation: Accelerating digital science Panel
12:15 - 12:55 | Accelerating digital science: Focused discussion breakouts
13:00 - 14:00 | Lunch and networking
14:00 - 15:00 | Breakout session 2: Community breakouts
15:00 - 15:15 | Coffee break
15:15 - 16:15 | Lightning talks
16:15 - 16:45 | DARE UK Phase 2 presentation
16:45 - 17:00 | Wrap up
Presentations#
- Video recording:
- Slides from Keynotes:
- Slides from lightning talks:
Summary of the day#
Community management updates and announcements: bid for funding#
UK TRE community co-chairs David Sarmiento and Simon Li set the context and background of the UK TRE community and its growth over the past couple of years. Some of the key points covered were:
Emphasis on the grassroots nature of the UK Trusted Research Environments (TRE) community, which relies on volunteers and is open to more co-chairs and informal help.
TRE community expansion: originally coming out of the research software engineering community, membership has expanded to include data management, information governance, and funding stakeholders. The community has grown significantly, from an initial 30 members to over 300, with over 100 organizations involved. A number of Working Groups have also been established.
There is strong emphasis on flexibility and adapting to community needs and feedback.
Governance and Community Participation: Emphasis on “lazy consensus”, where community members are encouraged to take the initiative unless there is opposition. Governance charters and working groups are in place, and transparency is a core principle. The group has formalized governance processes to prepare for growth but retains a focus on community-driven initiatives.
Key community priorities identified in the introductory section#
Federation: probably the most critical topic identified for the day - given we want a distributed network, how can we enable analysis on data held across different TREs?
Data Quality: Desire for better quality and more standardized data.
AI Integration: Growing interest in leveraging AI in TRE environments.
Public and Patient Involvement: Important for UK TREs, with recognition that the UK generally performs well in this area. The role of public participation in TRE initiatives was noted as essential but missing from current discussions, and there was a call to engage more fully with patient and public involvement and engagement (PPIE) in future projects.
Funding Challenges: Discussion on how to fund working groups and ongoing community work, as most contributions are currently voluntary. There was a call for help in preparing a bid for a significant funding opportunity (up to £2 million over four years), which could support governance, digital spaces, and working groups.
Call to Action: Attendees were encouraged to join various working groups, contribute to ongoing governance processes, and help with the upcoming funding bid.
Keynote 1: James Fleming, CIO, Francis Crick Institute#
Importance of Data Federation:
Emphasis on the need to federate data across research and healthcare systems to drive a shift from reactive to proactive, health management-driven care. Federation is essential for advancing precision medicine, holistic diagnostics, and personalized treatments.
Current Challenges:
The data landscape is highly fragmented, with many thousands of data sources and hundreds of Trusted Research Environments (TREs), leading to overwhelming complexity.
Issues related to policy, strategy, and funding complicate attempts to streamline data sharing across healthcare and research sectors.
Simplification and Integration:
Vision to dramatically simplify the infrastructure, aiming to reduce the number of systems and create common platforms that enable seamless data sharing and governance across the continuum of research, clinical trials, and healthcare.
Research is increasingly multimodal and multiscale, requiring the integration of various data types (e.g., genomics, imaging, patient data) from different sources.
Federated Data Fabric:
Proposal for a common data fabric that balances control by the data provider and flexibility for researchers. This would allow organizations to manage compute/storage costs while enabling data aggregation and use for research.
Citizen-Centric Approach:
An important theme was the need to involve the citizen in controlling their data. The proposal suggests creating a “data account” that allows patients to see who has access to their data and manage consent for studies dynamically.
Call for Collaboration:
The initiative seeks collaboration from other organizations to build and support this federated data model, emphasizing shared governance and data use for both public health and research benefits.
Keynote 2: Darren Bell, Director of Technical Services, UK Data Service#
Federation as a solution: As in the first keynote, Bell highlighted the fragmentation of data repositories, which leads to underutilization of available data. Federation involves creating shared rules and standards to unify various repositories and infrastructures across domains and countries, enabling better interoperability.
Importance of data standards: There was a particular emphasis on the (sometimes overlooked) importance of data standards. Interoperability depends on the adoption of common standards for data formats, metadata, and workflows. Without standardization, especially across organizations, federated analysis isn’t possible.
Challenges with current practices: Many repositories are currently small, isolated, and follow their own data classification systems. This lack of consistency creates inefficiencies for researchers trying to access and link data across platforms. There was a call for consolidation and stronger governance to enforce common standards.
Researcher-centric focus: A major focus was on improving user experience for researchers, and not simply building infrastructure for its own sake. He envisioned a future where researchers could easily find, access, and use data through an intuitive, federated infrastructure, much like booking a flight online. However, this requires better metadata, particularly for data curation and privacy management.
Incentives and enforcement: There was a call for clear incentives to encourage organizations to adopt federated standards, along with disincentives for non-compliance. Standardization won’t happen purely through consensus but will need to be enforced at a policy level, potentially by bodies like UKRI.
Long-term vision: Bell proposed a vision where all sensitive data could eventually be managed through trusted research environments (TREs), but acknowledged this is likely a decade or more away. In the interim, he stressed the need for better automated tools for data classification, risk modeling, and privacy engineering.
Overall, federation was presented as a necessary evolution in research infrastructure, one that should be driven by researcher needs.
Panel discussion#
A panel made up of Peter McCallum (Chief Technical Officer, Elixir), Emily Jefferson (CTO, HDR UK and Interim Director, DARE UK), Darren Bell (Director of Technical Services, UK Data Service) and James Fleming (CIO, Francis Crick Institute) came together for a discussion on federation, standards, governance, and citizen involvement in controlling the use of their data.
Federation#
Federation was described as collaboration with shared rules and mutual benefits, but also obligations. There was consensus that a clear, unified definition is lacking, which poses a barrier to effective data federation.
Federation requires both technological and social solutions, standards, and strategy considerations.
Avoiding Single Points of Failure: Centralizing all data into one TRE is not feasible or desirable due to innovation challenges and data custodians’ reluctance. Instead, a federated approach with multiple specialized TREs across domains and regions is generally preferred.
Challenges of Standards#
There are many fragmented standards across different sectors (e.g., healthcare, biomedical data), leading to inconsistent data entry and quality. Harmonizing these standards is critical for successful federation.
A balance between flexibility for individual organizations and shared constraints for mutual benefit is needed.
European Perspective#
The European model, such as the European Open Science Cloud, emphasizes setting common standards without full governmental control, highlighting an alternative approach to federation through treaties or de facto standards.
Data Controllers and Governance#
A tension exists between data controllers’ individual requirements and the need for unified federation. In the future, models that engage individuals (citizen agency) will likely play a more significant role in data governance, moving beyond current legislation-based control.
Balancing transparency about data risks with the needs of research is crucial. Tools for clearer empirical risk assessments would help reassure data owners, facilitating more open data sharing.
Incentives#
Research councils should play an active role in encouraging federation by funding and setting enforceable standards. There is some reluctance to take on this role, but it is crucial for driving progress.
Citizen Involvement#
There is a growing need to involve patients and the public in decision-making about their data use. Current consent models (e.g., all-or-nothing) are blunt, and future systems should offer more granular control to individuals, enabling ongoing conversations about data use. There was some disagreement on the panel about the feasibility of granular citizen control given the number of projects that could potentially be using data about a given individual.
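To make the contrast between all-or-nothing and more granular consent concrete, the following is a minimal sketch only. The `ConsentRecord` class, its field names, and the purpose categories are hypothetical and were not presented by the panel. Grouping choices by broad purpose category rather than by individual project is an assumption made here to keep the number of decisions small; it is one possible response to the feasibility concern, not an agreed design.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ConsentRecord:
    """Hypothetical consent preferences for one person (illustrative only)."""

    # All-or-nothing answer, used when no more specific choice exists.
    default_allow: bool = False
    # Optional overrides keyed by a broad purpose category (assumed names).
    by_purpose: Dict[str, bool] = field(default_factory=dict)

    def permits(self, purpose: str) -> bool:
        """Return the specific choice for this purpose, else the default."""
        return self.by_purpose.get(purpose, self.default_allow)


# All-or-nothing: a single switch covers every use of the data.
blunt = ConsentRecord(default_allow=True)

# Granular: opted in overall, but opted out of one category.
granular = ConsentRecord(default_allow=True, by_purpose={"commercial": False})

print(blunt.permits("commercial"))     # True
print(granular.permits("commercial"))  # False
print(granular.permits("cancer"))      # True
```

Because `by_purpose` can be updated at any time, a structure like this could also represent the "ongoing conversation" idea, with each change being a new choice rather than a one-off decision.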
Overall feeling#
The overall sense in the room was that the UK TRE community has matured and discussions on federation have moved on since the Swansea meeting in 2023 to a position where organisations are really starting to talk to one another to make this happen.
Breakout session 1: Accelerating digital science: Focused discussion breakouts#
Breakout session 2: Working Groups#
Each Working Group ran a breakout discussion for community input into how they are contributing to our common challenges:
SATRE: An accreditation and specification for generic SDE/TRE facilities is needed. Working out how these adapt to cover the different parts of a digital science platform remains a central need, and would help the federation of datasets to projects.
SDE/TRE terminology: Developing the community language for the digital science platform will make sure the parts join up and projects can be provisioned.
Extending Control: How the technology works with and enables information governance, which in turn helps the federation of datasets to projects.
Cybersecurity risks: Managing these risks is a base capability for any digital science platform, and how they change with new approaches and tools (e.g. AI/ML) is an ever-emerging factor.
Citizen Agency: Digital science is all about collaboration with people. It will fail unless people remain connected to the platform and support science that is consent-based as well as population-based.
Glossary: Focused on developing a shared lexicon for TREs to support interoperability and federation.
Lightning talks#
DARE UK Phase 2 presentation (Fergus McDonald, Deputy Director DARE UK; Emily Jefferson, Interim Director DARE UK)#
DARE UK Phase 2 is an £18.2m investment from UKRI over 2.5 years (reduced from the figure of £20.6m presented - reflecting adjustments in spending allocation for this financial year) that aims to revolutionize the use of cross-domain sensitive data in research by enhancing TRE capabilities, while maintaining public trust and benefiting researchers.
Core Activities#
Transformational Programs: Phase 2 focuses on advancing the capabilities developed in Phase 1, working with early adopters to test, configure, and implement these solutions in real-world settings. The transformational programs focus on:
Automation and AI capabilities in TREs: Developing new capabilities for secure AI model training and semi-automated disclosure control to reduce manual processing and accelerate research in TREs
Reference TRE standards and implementations
Federated analytics (both remote query and ‘single-pane-of-glass’ data view)
Next-generation proof-of-concepts: Similar to Phase 1 Sprint Exemplar projects, this workstream will provide funding to build prototypes of next-generation TRE capabilities
Early testing of a national network of TREs: Technology evaluation and feasibility test implementations for creating a connected national network of TREs
Community Engagement and standards: Phase 2 includes funding for community building, supporting information sharing, consensus building, and collaboration across domains.