| Funder | Economic and Social Research Council |
|---|---|
| Recipient Organization | University of Essex |
| Country | United Kingdom |
| Start Date | Apr 07, 2024 |
| End Date | Apr 06, 2025 |
| Duration | 364 days |
| Number of Grantees | 4 |
| Roles | Co-Investigator; Principal Investigator |
| Data Source | UKRI Gateway to Research |
| Grant ID | ES/Z502467/1 |
The growing discourse around synthetic data underscores its potential not only to address data challenges in a fast-changing landscape but also to foster innovation and accelerate advances in data analytics and artificial intelligence. From optimising data sharing and utility (James et al., 2021), to sustaining and promoting reproducibility (Burgard et al., 2017), to mitigating disclosure risk (Nikolenko, 2021), synthetic data has emerged as a solution to various complexities of the data ecosystem.
The project proposes a mixed-methods approach and seeks to explore the operational, economic, and efficiency aspects of using low-fidelity synthetic data from the perspectives of data owners and Trusted Research Environments (TREs).
The core challenge lies in understanding the tangible and intangible costs associated with creating and sharing low-fidelity synthetic data, alongside measuring its utility and acceptance among data producers, data owners and TREs. The broader aim of the project is to foster a nuanced understanding that could catalyse a shift towards a more efficient and publicly acceptable model of synthetic data dissemination.
This project is centred around three primary goals:
1. to evaluate the comprehensive costs incurred by data owners and TREs in the creation and ongoing maintenance of low-fidelity synthetic data, including the initial production of synthetic data and subsequent costs;
2. to assess the various models of synthetic data sharing, evaluating the implications and efficiencies for data owners and TREs, covering all aspects from pre-ingest to curation procedures, metadata sharing, and data discoverability; and
3. to measure the efficiency improvements for data owners and TREs when synthetic data is available, analysing impacts on resources, secure environment usage load, and the uptake dynamics between synthetic and real datasets by researchers.
Commencing in March 2024, the project will begin with stakeholder engagement, forming an expert panel and aligning collaborative efforts with parallel projects. Following a robust literature review, the project will collect data through a targeted survey with data creators, case studies with data owners and providers of synthetic data, and a focus group with TRE representatives.
The insights collected from these activities will be analysed and synthesised into a comprehensive report setting out the findings and, where applicable, practical recommendations for scaling up the production and dissemination of low-fidelity synthetic data.
The potential applications and benefits of the proposed work are diverse. The project aims to provide a solid foundation for data owners and TREs to make informed decisions regarding synthetic data production and sharing. Furthermore, the findings could significantly influence future policy concerning data privacy, thereby having a broader impact on the research community and public perception.
By fostering a deeper understanding and establishing a dialogue among key stakeholders, this project strives to bridge the existing knowledge gap and move the domain of synthetic data towards more informed and efficient usage. Through meticulous data collection and analysis, the project aims to unravel the intricacies of low-fidelity synthetic data and pave the way for an efficient, cost-effective, and publicly acceptable framework of synthetic data production and dissemination.
University of Essex; The University of Manchester