Completed STANDARD GRANT National Science Foundation (US)

CNS Core: Small: Offline Inference for Ultra-Efficient Memory Management

$5M USD

Funder	National Science Foundation (US)
Recipient Organization	University of California-Los Angeles
Country	United States
Start Date	Oct 01, 2021
End Date	Sep 30, 2024
Duration	1,095 days
Number of Grantees	1
Roles	Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2128653`

Grant Description

Machine learning (ML) has made its way into systems of various kinds, helping them make informed decisions at critical points. A typical approach to ML-for-systems is to perform inference online by querying a model with runtime data. Online inference incurs non-trivial overheads, imposing a tight restriction on model size and complexity.

In fact, systems that involve ML in their decision making often use very simple models (e.g., linear models) with inferior accuracy. This project develops a transformative approach to ML-for-systems - instead of doing online inference, this project advocates to train models that can predict runtime properties directly from program source code. As such, inference can be done offline and their results can be encoded and efficiently looked up during execution.

Given that inference no longer contributes to run time, the proposed approach removes the above-discussed restrictions, enabling systems to employ state-of-the-art model architectures. This project further applies offline inference to memory management tasks that are critical to cloud applications.

Modern society relies on services provided by large-scale systems. Improving the throughput and efficiency of such systems improves the service-level efficiency and scalability that human can experience in their lives. Replacing complicated and heuristics-driven decision making in today’s memory management systems with learning has a potential to dramatically reduce the cost of allocation and deallocation, which is a significant component in an application’s execution.

Traditionally, inference is performed online, restricting what models to use and how high the accuracy can reach. This project develops techniques that make ML a more appealing approach by removing these restrictions. The techniques proposed span runtime system and ML.

This interdisciplinary nature produces research that has impact in both areas. The project also makes efforts in education and diversity by incorporating research into courses and recruiting researchers from underrepresented groups.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of California-Los Angeles

Interested in applying for this grant?

Complete our application form to express your interest and we'll guide you through the process.

Apply for This Grant

CNS Core: Small: Offline Inference for Ultra-Efficient Memory Management

Grant Description

All Grantees

Interested in applying for this grant?

Quick Summary

Related Grants