| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | Tulane University |
| Country | United States |
| Start Date | Oct 01, 2024 |
| End Date | Nov 30, 2025 |
| Duration | 425 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2512858 |
In today’s software-centric world, ultra-large-scale software repositories such as GitHub, each hosting hundreds of thousands of projects, are the new Library of Alexandria. They contain an enormous corpus of software and information about software.
Scientists and engineers alike are interested in analyzing this wealth of information, both out of curiosity and to test important research hypotheses. However, the current barrier to entry is prohibitive, and only a few groups with well-established infrastructure and deep expertise can attempt such ultra-large-scale analysis. The necessary expertise includes programmatically accessing version control systems, data storage and retrieval, data mining, and parallelization.
The need for expertise in these four different areas significantly increases the cost of scientific research that attempts to answer research questions involving ultra-large-scale software repositories. As a result, experiments are often not replicable, and the reusability of experimental infrastructure is low. Furthermore, data associated with and produced by such experiments is often lost or becomes inaccessible and obsolete, because there is no systematic curation.
Last but not least, building analysis infrastructure to process ultra-large-scale data efficiently can be very hard.
This project will continue to enhance the CISE research infrastructure called Boa to aid and assist with such research. The next version, Boa 2.0, will continue to be globally disseminated. The project will further develop the programming language, also called Boa, which hides the details of programmatically accessing version control systems, data storage and retrieval, data mining, and parallelization from scientists and engineers, allowing them to focus on the program logic.
The project will also enhance the data mining infrastructure for Boa, along with a BIGDATA repository containing millions of open source projects, to help with experiments that analyze ultra-large-scale software repositories. The project will integrate Boa 2.0 with the Center for Open Science's Open Science Framework (OSF) to improve reproducibility and with the national computing resource XSEDE to improve scalability.
The broader impacts of Boa 2.0 stem from its potential to enable developers, designers and researchers to build intuitive, multi-modal, user-centric, scientific applications that can aid and enable scientific research on individual, social, legal, policy, and technical aspects of open source software development. This advance will primarily be achieved by significantly lowering the barrier to entry and thus enabling a larger and more ambitious line of data-intensive scientific discovery in this area.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.