Loading…
Loading grant details…
| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | Johns Hopkins University |
| Country | United States |
| Start Date | Oct 01, 2021 |
| End Date | Sep 30, 2025 |
| Duration | 1,460 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2127575 |
Strings are fundamental objects in many important computer science applications, such as text/speech processing, bioinformatics, data storage and access, and communication systems. However, processing strings presents various challenges in computation and communication. For example, many of today's applications involving strings are
implemented on large data sets, and the known algorithms are often too costly in both time and space. Similarly, when strings are transmitted between systems, errors like insertions and deletions occur frequently, which can lead to problems from a simple misunderstanding to a severe system failure. These challenges are usually addressed by high cost
solutions, where one either requires a large amount of resources for the algorithms, or expensive hardware that keeps systems strictly synchronized. The overarching goal of this project is to understand the fundamental question of how to more efficiently compute different string measures, and transmit strings reliably in the presence of
insertions and deletions. This will lead to a deeper theoretical understanding of the nature of various string measures and errors, as well as potentially much more efficient solutions in practice. Examples of benefits include algorithms that use much less resources for handling large data sets, and more reliable communication systems in adversarial
environments. The project also has plans for mentoring PhD students, integration of the research topics into courses taught to students from a variety of different backgrounds, and support of under-represented groups of students in computer science. The project contains two sets of specific goals. The first set of goals
seeks to understand how to design efficient error-correcting codes for insertions and deletions. These include codes and document-exchange protocols over the binary alphabet with asymptotically optimal parameters; codes that form a linear subspace or have a low density parity check matrix, which allow fast encoding and decoding; list
decodable codes that can correct a larger fraction of errors with outputting a small list of possibly correct messages; and locally decodable codes that can correctly recover any target message bit with only a few queries of the codeword. The second set of goals investigates the space complexity of various string measures, such as edit distance,
longest common subsequence, and longest increasing subsequence. Specifically, the goal is to both design algorithms that require much smaller space to compute or approximate these string measures, and to prove better space lower bounds in various models. These two sets of goals are also naturally connected, since the codes in the first part
can often be used to give space lower bounds in the second part. The study of these topics will be based on techniques from several different areas such as probability theory, information theory, combinatorics, algorithm design, communication complexity, and pseudorandomness, and will further foster the interactions among these areas towards
breakthroughs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Johns Hopkins University
Complete our application form to express your interest and we'll guide you through the process.
Apply for This Grant