Active STANDARD GRANT National Science Foundation (US)

CSR: Small: Agile, Fine-Grained, General Function Serving for Dynamic Real-Time Deployment

$6M USD

Funder National Science Foundation (US)
Recipient Organization Georgia Tech Research Corporation
Country United States
Start Date Oct 01, 2024
End Date Sep 30, 2027
Duration 1,094 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2420977
Grant Description

This project aims to develop software systems for robust and efficient support of interactive Artificial Intelligence-enabled services. An increasing number of applications, such as computer vision, natural language processing, and recommendation systems, rely on pre-trained Machine Learning models for their operation. Cyber-physical systems (e.g., autonomous navigation) also operate in a tight control loop of sensing the environment, perception, and performing corresponding actions. Collectively, these applications require Machine Learning models that are both accurate and able to respond within strict deadlines.

Such responsiveness is mission-critical, as a late response can bear undesirable consequences, no matter how accurate. And yet, these applications are deployed in increasingly heterogeneous and unpredictable environments. This project’s novelties lie in building software frameworks that are able to navigate this accuracy/responsiveness tradeoff space automatically with high efficiency in real time.

Its broader significance and importance lie in enabling robust AI-powered applications to withstand unpredictable deployment conditions in domains spanning healthcare, cyber-physical systems, online recommendation systems, and AI-assisted search and rescue operations.

To address these challenges, this project develops agile mechanisms and policies for serving a family of models instead of a fixed model for a single prediction task. Its ability to activate different models near-instantaneously in-situ enables operating across heterogeneous devices without retraining the neural network for the served model. This saves on training cost, which is amortized over multiple heterogeneous devices and per-device dynamic deployment conditions (e.g., latency budget variation, processor frequency scaling, and battery level). These unpredictable sources of variability are supported by the novel combination of this project’s mechanism and policy that schedules a stream of ML prediction tasks.
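As a rough illustration only (this is not the project's software, and every name and number below is hypothetical), the idea of serving a family of models under a varying latency budget can be sketched in Python. The sketch assumes each variant has a profiled latency at full processor frequency, that frequency scaling is modeled as a simple slowdown multiplier, and that the policy is a plain function kept separate from the serving loop.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ModelVariant:
    """One member of a hypothetical model family (all values are illustrative)."""
    name: str
    accuracy: float         # assumed expected accuracy, in [0, 1]
    base_latency_ms: float  # assumed profiled latency at full processor frequency


# A policy maps (family, latency budget, slowdown factor) to a chosen variant,
# or None when no variant can meet the deadline.
Policy = Callable[[Sequence[ModelVariant], float, float], Optional[ModelVariant]]


def most_accurate_within_budget(
    family: Sequence[ModelVariant], budget_ms: float, slowdown: float
) -> Optional[ModelVariant]:
    """Example policy: the most accurate variant whose scaled latency fits the budget."""
    feasible = [m for m in family if m.base_latency_ms * slowdown <= budget_ms]
    return max(feasible, key=lambda m: m.accuracy, default=None)


def serve(
    family: Sequence[ModelVariant],
    queries: Sequence[Tuple[float, float]],
    policy: Policy,
) -> List[Optional[str]]:
    """Mechanism: apply the policy to each (budget_ms, slowdown) query in arrival order."""
    decisions: List[Optional[str]] = []
    for budget_ms, slowdown in queries:
        chosen = policy(family, budget_ms, slowdown)
        decisions.append(chosen.name if chosen else None)
    return decisions


# Hypothetical family and query stream; none of these figures come from the grant.
FAMILY = [
    ModelVariant("small", 0.70, 5.0),
    ModelVariant("medium", 0.78, 12.0),
    ModelVariant("large", 0.84, 30.0),
]
QUERIES = [(40.0, 1.0), (40.0, 2.0), (10.0, 1.0), (4.0, 1.0)]
DECISIONS = serve(FAMILY, QUERIES, most_accurate_within_budget)
print(DECISIONS)
```

Because the policy is passed in as a function, a different selection rule (for example, one that also weighs battery level) could be swapped in without changing the serving loop.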

The project generalizes inference serving, decoupling ML serving mechanisms from the policies that control query scheduling decisions. This project aims to produce system software artifacts that will be open sourced for the general public. Research outcomes from this project will be incorporated in several courses at both graduate and undergraduate levels, collectively attracting more than 165 students annually.

This is in addition to the prompt dissemination of research outcomes to the public through conference proceedings, seminar talks, and invited keynotes.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Georgia Tech Research Corporation
