Completed STANDARD GRANT National Science Foundation (US)

NSF-BSF: CIF: Small: From storage codes to recoverable systems

$5M USD

Funder National Science Foundation (US)
Recipient Organization University of Maryland, College Park
Country United States
Start Date Jul 01, 2021
End Date Jun 30, 2024
Duration 1,095 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2110113
Grant Description

Modern data centers store large volumes of information in distributed form, placing parts of the same data file on different servers in a system where server failures occur on a regular basis. Recovery of data located on servers that become unavailable because of transient or permanent failures depends critically on the methods of data encoding, and developing such methods is one of the main directions in the design of large-scale storage systems.

This project aims to construct methods of data encoding that enable low-cost data recovery by taking into account the nature of the connections between servers, such as spatial proximity or the availability of communication links. This represents a shift from the broadly studied problems of data reconstruction, which disregard how the cost of moving data between servers varies with the topology of the system, and it opens the possibility of engaging new mathematical methods for the design of efficient data encoding and reconstruction.

In the first part, this project advances high-density storage systems based on recently discovered applications of tools from computer science and applied mathematics to code design. In the second part, the project addresses large-scale storage systems, aiming to establish new statistical properties of data encoding methods as well as limits on the volume of data that can be stored in the system while maintaining the recovery functionality.

This project aims to develop new methods for data coding problems in large-scale distributed storage. Since different parts of the codeword comprising a chunk of data are stored on different servers, efficient recovery hinges on the ability to reconstruct unavailable data from servers that are in close proximity to the failed storage node.

In technical terms, the system is described by a graph where coordinates of the codeword are stored on different vertices, and the value of each vertex is a function of the values of its neighbors in the graph. In the first part, this project aims at constructing efficient encoding methods based on recently established connections between codes for storage, index codes, and low-density quantum codes.
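The graph model described above can be illustrated with a small brute-force check. This is only a minimal sketch under stated assumptions: the 4-cycle graph, the binary toy code, and the names `is_recoverable`, `rate`, `CYCLE4`, and `MATCHING_CODE` are hypothetical choices made for illustration and do not come from the project itself.

```python
from itertools import product
from math import log2


def is_recoverable(code, neighbors):
    """Return True if, across all codewords, the value at every vertex
    is determined by the values at its neighbors in the graph."""
    for v, nbrs in neighbors.items():
        seen = {}  # neighbor values -> value observed at v
        for word in code:
            key = tuple(word[u] for u in nbrs)
            # Two codewords that agree on the neighbors must agree at v.
            if seen.setdefault(key, word[v]) != word[v]:
                return False
    return True


def rate(code, n, q=2):
    """Rate of a code of length n over an alphabet of size q."""
    return log2(len(code)) / (n * log2(q))


# Assumed example graph: the 4-cycle 0-1-2-3-0, given as adjacency lists.
CYCLE4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Assumed toy code: x0 = x1 and x2 = x3, i.e. symbols are repeated
# along the two edges (0,1) and (2,3) of a perfect matching.
MATCHING_CODE = [(a, a, b, b) for a, b in product((0, 1), repeat=2)]

if __name__ == "__main__":
    print(is_recoverable(MATCHING_CODE, CYCLE4))
    print(rate(MATCHING_CODE, 4))
```

In this toy example every failed vertex can be restored from a single adjacent vertex, whereas the unconstrained set of all binary words of length 4 fails the same check, since flipping one coordinate leaves its neighbors unchanged.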

The main goals are to construct codes of the highest possible rate and to develop iterative procedures for the recovery of multiple nodes for various classes of graphs, relying on their algebraic properties and on methods from percolation theory on finite graphs. The second part of the project is concerned with coding systems that support data recovery on infinite graphs, such as the graph of integers or the two-dimensional integer lattice. It uses methods from constrained systems, symbolic dynamics, and entropy theory to establish the maximum possible density of data stored in large-scale systems.
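The entropy of a one-dimensional constrained system, one of the notions mentioned above, can be estimated numerically from the growth rate of the number of admissible words. The sketch below uses a standard textbook example, the binary sequences with no two consecutive ones, rather than any system studied in the project; the names `count_words`, `capacity_estimate`, and `GOLDEN_MEAN` are hypothetical.

```python
import math


def count_words(adj, length):
    """Count admissible words of the given length, where adj[i][j] = 1
    means that symbol j may follow symbol i."""
    n = len(adj)
    counts = [1] * n  # one word of length 1 ending in each symbol
    for _ in range(length - 1):
        counts = [sum(counts[i] * adj[i][j] for i in range(n))
                  for j in range(n)]
    return sum(counts)


def capacity_estimate(adj, length=200):
    """Estimate the entropy (bits per symbol) as the log of the ratio of
    consecutive word counts, which approaches the growth rate."""
    return math.log2(count_words(adj, length + 1) / count_words(adj, length))


# Assumed example constraint: after a 1 only a 0 may follow.
GOLDEN_MEAN = [[1, 1],
               [1, 0]]

if __name__ == "__main__":
    print(capacity_estimate(GOLDEN_MEAN))
    print(math.log2((1 + math.sqrt(5)) / 2))  # known closed form for this example
```

For this particular constraint the word counts are Fibonacci numbers, so the estimate approaches the base-2 logarithm of the golden ratio. The density limits targeted by the project concern recoverable systems on graphs and may require different tools.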

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of Maryland, College Park
