| Field | Value |
|---|---|
| Funder | National Science Foundation (US) |
| Recipient Organization | New York University |
| Country | United States |
| Start Date | Apr 01, 2021 |
| End Date | Mar 31, 2027 |
| Duration | 2,190 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2045590 |
Advances in sensing and storage technology have increased the ability to collect and share huge amounts of data. From satellite imagery, to genetic data, to web content, richer datasets offer the promise of improved data-driven discovery and decision making across science, engineering, and industry. Realizing this promise, however, requires enormous computational effort.
The goal of this project is to democratize the data revolution by developing new algorithms to efficiently process the world's largest datasets, without the need for the world's largest supercomputers. To do so, the investigator and his team will study a powerful algorithmic technique known as "matrix sketching". The key idea is to quickly compress a large dataset (represented as a matrix of numbers) down to its most essential information by eliminating redundancy and noise.
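As a rough illustration of the compression idea described above, the following minimal sketch applies one common form of matrix sketching, a Gaussian random projection, to a least-squares problem ("sketch-and-solve"). The sizes, the noise level, and the function name `sketch_and_solve` are assumptions chosen for illustration; they are not taken from the project.

```python
import numpy as np

def sketch_and_solve(A, b, sketch_rows, rng):
    """Approximately solve min_x ||Ax - b|| using a compressed problem.

    S is a dense Gaussian sketching matrix (an illustrative choice);
    it maps the n rows of A down to `sketch_rows` rows.
    """
    n = A.shape[0]
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Solve the much smaller sketched problem min_x ||S A x - S b||.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

rng = np.random.default_rng(0)
n, d = 5000, 10                          # hypothetical problem size
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_sketch = sketch_and_solve(A, b, sketch_rows=200, rng=rng)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Compare the residual of the sketched solution with the optimal residual.
res_sketch = np.linalg.norm(A @ x_sketch - b)
res_exact = np.linalg.norm(A @ x_exact - b)
print("residual ratio (sketched / exact):", res_sketch / res_exact)
```

Here the solver works with a 200-row matrix instead of the original 5,000 rows. A dense Gaussian sketch is used only because it is short to write; structured or sparse sketches are typically preferred when the multiplication `S @ A` itself must be fast.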
The compressed data can then be efficiently digested by downstream algorithms for machine learning and statistical inference. This project will advance the state of the art in matrix sketching by taking an interdisciplinary approach, combining tools from theoretical computer science with methods from computational and applied mathematics. The project also involves a major educational component, aimed at improving U.S. mathematics education through closer ties with applications in STEM fields.
The project will support an international high-school applied-mathematics competition, the development of curricular material and workshops for high-school educators, and course development to better prepare university students for careers in algorithms, machine learning, and data science.
To advance research in matrix sketching, the project is centered around three main objectives, each involving problems of practical importance, as well as motivating theoretical questions that will more broadly impact algorithms research. The first objective is to develop sketching techniques that move beyond low-rank matrix compression, which only captures information about the largest-magnitude components of a matrix’s spectrum.
Motivated by emerging applications in network science, deep learning, and computational physics, the research team is developing techniques that instead capture coarse information about the entire spectrum of a matrix. The second objective is to develop methods that allow for higher accuracy by combining existing sketching algorithms with powerful tools for iterative refinement.
The goal is to design algorithms with runtimes that depend logarithmically, instead of polynomially, on problem accuracy. The final objective is to extend the impact of sketching beyond applications where data is over-abundant, by addressing important problems where sufficient, high-quality data remains a luxury. The theoretical tools of matrix sketching and data subsampling are being used to design smarter data-collection strategies for the “small-data” regime, advancing the state of the art in active learning and experimental design.
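The second objective above, pairing a sketch with iterative refinement so that cost grows only logarithmically with the target accuracy, can be illustrated with a standard "sketch-and-precondition" scheme for least squares. This is a minimal sketch under stated assumptions, not the project's algorithm: the Gaussian sketch, the problem sizes, the column scaling, and the function name `sketched_iterative_refinement` are all hypothetical choices made for illustration.

```python
import numpy as np

def sketched_iterative_refinement(A, b, sketch_rows, num_iters, rng):
    """Refine a sketch-and-solve estimate of argmin_x ||Ax - b||.

    The R factor of the sketched matrix S A serves as a preconditioner:
    R^T R approximates A^T A, so each step removes a constant fraction
    of the remaining error when the sketch is accurate enough.
    """
    n, d = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    Q, R = np.linalg.qr(S @ A)                  # reduced QR of the sketch
    x = np.linalg.solve(R, Q.T @ (S @ b))       # sketch-and-solve start
    history = [x.copy()]
    for _ in range(num_iters):
        g = A.T @ (b - A @ x)                   # gradient direction on full data
        # Apply (R^T R)^{-1} as a stand-in for (A^T A)^{-1}.
        x = x + np.linalg.solve(R, np.linalg.solve(R.T, g))
        history.append(x.copy())
    return x, history

rng = np.random.default_rng(1)
n, d = 5000, 10                                  # hypothetical problem size
# Scale the columns so the problem is moderately ill-conditioned.
A = rng.standard_normal((n, d)) * np.logspace(0, 2, d)
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)    # direct solve, for comparison
x_hat, history = sketched_iterative_refinement(
    A, b, sketch_rows=1000, num_iters=40, rng=rng
)

errors = [np.linalg.norm(h - x_ref) / np.linalg.norm(x_ref) for h in history]
print("relative error at start:", errors[0])
print("relative error at end:  ", errors[-1])
```

When the sketch preserves the column space of `A` well, the error shrinks by a roughly constant factor per iteration, so the number of iterations needed scales with the logarithm of the inverse accuracy. With too few sketch rows the unit step used here can fail to converge, which is why the example uses a sketch size much larger than `d`.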
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.