Active CONTINUING GRANT National Science Foundation (US)

CAREER: Toward a Comprehensive Generalization Theory for Deep Learning

$5.5M USD

Funder	National Science Foundation (US)
Recipient Organization	Stanford University
Country	United States
Start Date	Mar 01, 2021
End Date	Feb 28, 2026
Duration	1,825 days
Number of Grantees	1
Roles	Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2045685`

Grant Description

The advancement of deep learning, the technique of training artificial neural networks to make predictions, has led to recent breakthroughs in many areas of artificial intelligence, such as computer vision, natural language understanding, and robotics. A major challenge in deep learning is ensuring accurate predictions on unseen scenarios. This project plans to tackle this challenge via theoretical analysis and its empirical evaluation.

The project aims to contribute to the fundamental understanding of deep learning and to inform its practical advancement, improving reliability, efficiency, and risk management in data-hungry and risk-sensitive applications. An education plan is integrated into the project: the investigator will develop new courses, mentor students, organize workshops, and work with high-school teachers to develop high-school AI courses.

The project aims to build a comprehensive generalization theory for deep neural networks, covering the technical question of optimizers' implicit regularization effects as well as the broader topics of out-of-domain generalization and the estimation of generalization errors. The project has three major components.

The first thrust is to characterize the optimizers' implicit regularization effect for complex models. Leveraging the theoretical insights, the investigator will make implicit regularization more explicit, stronger, and customizable to datasets to improve generalization.

The second thrust is to theoretically study out-of-domain generalization in settings with increasingly large differences between the training and test environments, exploiting unlabeled data and their properties to a correspondingly greater degree.

The third thrust is to study the estimation of generalization errors, which is crucial for quantifying risk before deploying machine learning models in risk-sensitive applications such as healthcare.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Stanford University
