| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | University of Illinois At Urbana-Champaign |
| Country | United States |
| Start Date | Aug 01, 2021 |
| End Date | Jul 31, 2024 |
| Duration | 1,095 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2103778 |
Modern laboratories provide unprecedented sensitivity to the many different galactic messengers that stream through our planet by the minute: cosmic rays, light from distant galaxies, elusive neutrinos, and possibly dark matter. Combining this information with models and data from simulations provides insight into how our universe began and continues to evolve: the scales at which objects first collapsed, the development of stars and galaxies, and the dynamics within our own galaxy.
However, this data is often inaccessible: scientists within an experiment or community struggle with the complex, custom-built programs they use to access the data. And switching to a standard format is usually not an option: these data formats are designed for requirements that often do not include cross-experiment synthesis or linking.
Junior scientists, let alone the public, can struggle to generate new insights from the data because it is difficult to access, understand, and analyze. The cross-cutting inquiry that could arise from clever reuse and combination of data from different experiments and simulations is rarely conducted.
This project makes data accessible both within and across collaborations, providing the infrastructure to search for signals in detectors across the globe. Extending existing efforts to improve data access makes this project possible: yt is software that provides uniform access to simulation data; Kaitai is a data-description language that enables easy access to any data format; Rucio and other tools provide a standard interface that allows data downloads; and ServiceX can identify, subset and process data with little effort from the end user.
Scientists have built experiments that offer an incredible wealth of information about our world. This project works to make that information accessible to everyone.
Technical Description
The Personal Data-Delivery infrastructure (PONDD) addresses the data challenges of existing dark matter and astrophysics experiments while requiring no changes to existing data formats. This non-invasive, no-changes-necessary support for any file format provides opportunities to expand beyond our two identified use cases, dark matter searches and astrophysics simulations, into many other data-driven science domains that rely on custom file formats.
This work provides infrastructure that seamlessly delivers data from multiple sources in a well-supported format (such as Parquet). To successfully deliver cross-experiment data to end users, we bring together ongoing projects from High Energy Physics and the broader NSF community. While this project will involve development of software products (yt and Kaitai), it will also include synthesis of existing investments in cyberinfrastructure and efforts to improve their long-term sustainability.
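As a rough illustration of the idea of reading custom formats in place and delivering them in one common, column-oriented shape, the sketch below describes two binary record layouts declaratively and decodes both into a single table. It is only a sketch under stated assumptions: the layouts `DETECTOR_A` and `DETECTOR_B`, the field names, and the helper functions are hypothetical, and Python's standard `struct` module stands in for a data-description tool such as Kaitai. It does not use the project's actual software or APIs.

```python
import struct

# Hypothetical declarative layouts: (field name, struct format code) per record.
# These formats are invented for illustration; they are not real experiment formats.
DETECTOR_A = [("event_id", "I"), ("energy_kev", "f")]  # uint32, float32
DETECTOR_B = [("energy_kev", "d"), ("event_id", "Q")]  # float64, uint64


def read_records(raw, layout):
    """Decode fixed-size little-endian records according to a declarative layout."""
    record = struct.Struct("<" + "".join(code for _, code in layout))
    names = [name for name, _ in layout]
    return [dict(zip(names, values)) for values in record.iter_unpack(raw)]


def to_columns(rows, fields=("event_id", "energy_kev")):
    """Pivot row dicts into a column-oriented table, the shape columnar formats expect."""
    return {field: [row[field] for row in rows] for field in fields}


# Two in-memory "files" in their native layouts; the source bytes are never rewritten.
raw_a = struct.pack("<If", 1, 2.5) + struct.pack("<If", 2, 4.0)
raw_b = struct.pack("<dQ", 8.25, 3)

combined = to_columns(read_records(raw_a, DETECTOR_A) + read_records(raw_b, DETECTOR_B))
print(combined)
```

The column dictionary is the kind of structure a Parquet writer could accept. The writing step is omitted here to keep the sketch free of third-party dependencies.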
This project is supported by the Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering and the Division of Physics in the Directorate for Mathematical and Physical Sciences.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.