| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | University of Southern California |
| Country | United States |
| Start Date | Sep 15, 2021 |
| End Date | Aug 31, 2025 |
| Duration | 1,446 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2128661 |
We are witnessing an age of data explosion. Large amounts of data are continuously generated by people, sensors and computers and then stored in computer systems. Many modern applications require these systems to answer questions about the data quickly, accurately and with low storage costs.
This is a challenging task because the three goals (speed, storage overhead and accuracy) trade off against one another; increasing speed, for example, typically comes at the expense of accuracy. The traditional approach has been to develop customized computer algorithms, each designed to answer a specific type of question (called a query type) by striking a compromise between these goals.
Such a process has proven to be time-consuming and must be repeated for each query type. The ultimate goal is to design a system that can automatically search among possible algorithms to find the ones with the best time/space/accuracy trade-offs for each query type. This project takes a first step in this direction by utilizing Artificial Intelligence (AI) to replace all customized algorithms for different query types with a single neural network system (similar to the network of neurons in the human brain) that can answer any query type after seeing enough examples of questions and answers.
Such an approach will significantly reduce (or eliminate) human intervention in designing and choosing algorithms for different query types, while improving upon the time/space/accuracy trade-offs for answering them. Given the data-driven nature of most of today's applications, this will benefit a broad range of application domains, such as smart cities, transportation, energy, environment and health.
Finally, this project is also expected to appeal to diverse students (both undergraduate and graduate), many of whom are increasingly drawn to AI, and will recruit them to contribute to the field of data management.
The main thesis of this project is that queries can be represented by functions that can be approximated. Thus, a generic neural network framework, dubbed NeuroDB, will be developed that can learn a model to approximate the query function. The learned model can then be used to find approximate answers to any type of query.
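One way to make this thesis concrete is sketched below. The notation is an assumption introduced here for illustration and does not come from the award abstract: a dataset D, a query q drawn from some query distribution, an exact query function f_D, and a learned model with parameters theta. The squared-error objective is likewise only one plausible choice of loss.

```latex
% Exact answer to a query q over dataset D, viewed as a function of q:
%   f_D : \mathcal{Q} \to \mathbb{R}
% Learned approximation with parameters \theta:
%   \hat{f}_\theta : \mathcal{Q} \to \mathbb{R}
% One possible training objective (squared error is an assumption):
\[
  \theta^{*} \;=\; \arg\min_{\theta}\;
  \mathbb{E}_{q \sim \mathcal{Q}}
  \Bigl[ \bigl( \hat{f}_\theta(q) - f_D(q) \bigr)^{2} \Bigr]
\]
```

Under this view, answering a query reduces to evaluating the learned model at q, without consulting the data D at query time.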
The initial focus of NeuroDB will be on distance-to-nearest-neighbor queries and range aggregate queries, two important building blocks of many real-world applications. Preliminary results show that for these two query types, NeuroDB is orders of magnitude faster than state-of-the-art hand-crafted algorithms and uses only a fraction of the data size for storage, while providing similar accuracy (the model concisely learns query and data distributions).
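As a rough illustration of learning a query function from examples, the following sketch fits a tiny one-hidden-layer ReLU network to distance-to-nearest-neighbor answers over a one-dimensional toy dataset. It is not NeuroDB's architecture or training procedure; the dataset, network size, learning rate, and all function names (`nearest_distance`, `init_params`, `predict`, `mse`, `train`) are hypothetical choices made only for this example.

```python
import random

def nearest_distance(q, data):
    """Exact query answer: distance from q to its nearest data point."""
    return min(abs(q - p) for p in data)

def init_params(hidden_units, seed=0):
    """Random initial weights (w1, b1, w2, b2) for a 1 -> hidden -> 1 network."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden_units)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden_units)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden_units)]
    return (w1, b1, w2, 0.0)

def predict(params, q):
    """Return the network's answer for query q and the hidden activations."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, w * q + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2, hidden

def mse(params, queries, answers):
    """Mean squared error between predicted and exact answers."""
    total = sum((predict(params, q)[0] - y) ** 2 for q, y in zip(queries, answers))
    return total / len(queries)

def train(params, queries, answers, epochs=1000, lr=0.01):
    """Full-batch gradient descent on the mean squared error."""
    w1, b1, w2, b2 = params
    k, n = len(w1), len(queries)
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * k, [0.0] * k, [0.0] * k, 0.0
        for q, y in zip(queries, answers):
            y_hat, hidden = predict((w1, b1, w2, b2), q)
            d = 2.0 * (y_hat - y) / n          # dLoss/dy_hat for this example
            g_b2 += d
            for j in range(k):
                g_w2[j] += d * hidden[j]
                if hidden[j] > 0.0:            # ReLU passes gradient only when active
                    g_w1[j] += d * w2[j] * q
                    g_b1[j] += d * w2[j]
        w1 = [w - lr * g for w, g in zip(w1, g_w1)]
        b1 = [b - lr * g for b, g in zip(b1, g_b1)]
        w2 = [w - lr * g for w, g in zip(w2, g_w2)]
        b2 -= lr * g_b2
    return (w1, b1, w2, b2)

# Toy setup (all values are illustrative assumptions).
data = [0.1, 0.35, 0.6, 0.9]
queries = [i / 63 for i in range(64)]
answers = [nearest_distance(q, data) for q in queries]

initial = init_params(hidden_units=8, seed=0)
model = train(initial, queries, answers)

print("error before training:", mse(initial, queries, answers))
print("error after training: ", mse(model, queries, answers))
print("approximate answer for q=0.5:", predict(model, 0.5)[0])
```

Once trained, the model answers a query with a fixed number of arithmetic operations and stores only its weights rather than the data. A sketch this small will not reach the accuracy the abstract describes; it only shows the shape of the approach.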
NeuroDB will then be extended to a generic framework that can answer a broader set of query types, thus replacing different database operations with neural networks. The challenges in doing so are twofold. First, designing a neural network even for answering a specific query type is challenging.
It requires a deeper understanding, both theoretical and empirical, of the approximation power of neural networks and of how that power changes with the network architecture and training process for the specific query type. Second, the observations about specific query types need to be generalized to design a framework that can answer other query types, and a learning methodology is needed that can train accurate models for answering them.
NeuroDB will be developed as an open-source software system to be used and extended by the research community.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.