| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | University of Rochester |
| Country | United States |
| Start Date | Oct 01, 2021 |
| End Date | Sep 30, 2025 |
| Duration | 1,460 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2107050 |
Motivated by societal trends that value institutional openness and transparency, open data is being produced and shared at a speed that surpasses our ability to process it. Many governmental and private institutions are adopting Open Data Principles, which state that shared data should be complete, accurate, and timely. These properties make this data of great value to data scientists, journalists, and the public.
When open data is used effectively, data scientists can explore and analyze open resources, which in turn allows them to investigate public policy, create new scientific knowledge, and discover new (hidden) value useful for social, scientific, or economic initiatives. Though the open data movement has succeeded in its ambition of making data accessible, it has not succeeded in making this valuable data easy to use. The overarching goal of this project is to address this shortcoming.
In this project, we present a vision for Open Data Curation - data curation that is open, transparent, and explainable. Open Data Curation uses an on-demand integration paradigm that spans data discovery, data cleaning and linking, and data integration. Our vision is to enable users to query heterogeneous data stored in a data repository with minimal up-front effort.
Users can reference concepts and attributes in their queries that do not exist in the data. An on-demand integration system (ODIS) responds to such requests by automatically determining what data could be transformed and integrated to provide data for a requested concept. In terms of societal impact, the project will provide the algorithmic innovations to make effective, intuitive on-demand integration over open data lakes a reality.
Our solutions will use real open data and will be robust to the sometimes quirky, and always diverse, characteristics of open data. We believe a profound shift in how people think about data integration and curation is needed to fuel the data science revolution, which is being held back by incoherent data curation - a task that is still considered one of the most time-consuming, annoying, and error-prone in data science.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.