| Field | Value |
|---|---|
| Funder | National Science Foundation (US) |
| Recipient Organization | Michigan State University |
| Country | United States |
| Start Date | Sep 01, 2021 |
| End Date | Aug 31, 2026 |
| Duration | 1,825 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2106472 |
Many data processing tasks, such as image, video, and music compression and classification, involve finding compact representations of data files. Compactly representing such files is generally a good idea for many practical reasons. Compressed music and image files (MP3, JPG, etc.) are much faster and cheaper to communicate and store than the originals.
In classification applications, a small number of informative file features are often selected from larger categories of data in order to help boost accuracy and efficiency (this is akin to only focusing on the voice in a song when the aim is to identify the singer). In more extreme situations, data signals may be so large or change so rapidly that they cannot be stored or analyzed at all without first being quickly compressed.
Many interesting problems of this type exist in research areas related to algorithms for internet data analysis aimed at, for example, quickly detecting particular types of large-scale cyber-attacks. As part of this project, the investigators will develop and implement new faster compression and data analysis techniques for complex data, which can then be used to facilitate faster data processing in a myriad of large-scale data processing applications.
The project will also have educational benefits aimed at increasing the representation of students from under-represented and under-served groups in STEM research fields. This will be accomplished by the investigators hosting and mentoring research projects for undergraduate students from diverse backgrounds who will apply the compression and data analysis techniques developed as part of this research to specific application data, for example, to analyze and better understand Lyme disease data.
This research includes a rich new class of practical Johnson-Lindenstrauss (JL) maps for vector data that can not only be applied to vectors faster than Fast Fourier Transform time serially but are also trivially parallelizable. The embeddings will be randomized, and their analysis will be supported by the development of novel concentration inequalities based on generic chaining and supremum of chaos approaches for structured tensor data embeddings.
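To make the idea of a JL map concrete, the following is a minimal sketch of the classical dense Gaussian construction, not the faster structured maps this project proposes. The function names, dimensions, and seeds are illustrative assumptions. It embeds 20 random points from 500 dimensions into 200 and reports the worst relative change in any pairwise distance.

```python
import math
import random


def gaussian_jl_map(n, m, seed=0):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries.

    This is the textbook dense Gaussian JL map (an assumption for
    illustration), which costs O(m * n) per vector to apply.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def apply_map(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dist(x, y):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def max_distortion(points, A):
    """Largest relative error in pairwise distances after embedding."""
    embedded = [apply_map(A, p) for p in points]
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            e = dist(embedded[i], embedded[j])
            worst = max(worst, abs(e / d - 1.0))
    return worst


# Hypothetical test data: 20 Gaussian points in 500 dimensions.
data_rng = random.Random(1)
points = [[data_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]

A = gaussian_jl_map(n=500, m=200, seed=2)
distortion = max_distortion(points, A)
print(f"max relative distortion: {distortion:.3f}")
```

Each row of the matrix can be applied independently of the others, which is the sense in which such maps parallelize easily; the structured maps described above aim to keep that property while lowering the per-vector cost.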
These techniques will then allow, for example, the construction of new fast and memory-efficient embeddings with the Tensor Restricted Isometry Property, which are of value in the analysis of large tensor data. In addition, the research will develop new nonlinear bi-Lipschitz extensions of linear modewise JL maps for tensor data capable of preserving distances between all low-rank tensors in a given database and all other lower-rank tensors, even outside of the database.
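A linear modewise JL map applies a separate small map along each mode of a tensor instead of one large map to its vectorization. The sketch below illustrates this for a 2-mode tensor (a matrix) using dense Gaussian maps; the helper names, sizes, and seed are assumptions for illustration, and the nonlinear bi-Lipschitz extensions mentioned above are not modeled here.

```python
import math
import random


def gaussian_map(m, n, rng):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def matmul(P, Q):
    """Plain nested-list matrix product P Q."""
    Q_cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in Q_cols] for row in P]


def transpose(M):
    return [list(col) for col in zip(*M)]


def modewise_embed(X, A1, A2):
    """Apply A1 along mode 1 and A2 along mode 2, i.e. A1 X A2^T."""
    return matmul(matmul(A1, X), transpose(A2))


def outer(u, v):
    """Rank-1 2-mode tensor u (outer product) v."""
    return [[ui * vj for vj in v] for ui in u]


def frob(M):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(x * x for row in M for x in row))


rng = random.Random(0)
n1, n2, m1, m2 = 80, 80, 40, 40  # hypothetical sizes
A1 = gaussian_map(m1, n1, rng)
A2 = gaussian_map(m2, n2, rng)

# A rank-1 tensor built from two random vectors.
u = [rng.gauss(0.0, 1.0) for _ in range(n1)]
v = [rng.gauss(0.0, 1.0) for _ in range(n2)]
X = outer(u, v)

Y = modewise_embed(X, A1, A2)
ratio = frob(Y) / frob(X)
print(f"embedded shape: {len(Y)} x {len(Y[0])}, norm ratio: {ratio:.3f}")
```

The two modewise maps together store `m1*n1 + m2*n2` entries, whereas a single dense map acting on the vectorized tensor would need `(m1*m2) * (n1*n2)`; this gap is the source of the memory savings and grows with the number of modes.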
These new nonlinear embedding techniques will allow improved theoretical guarantees for space-constrained learning and classification with polynomial kernels. Finally, these embeddings will also be applied to address data-intensive problems in quantum many-body theory and nuclear physics.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.