Loading…
Loading grant details…
| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | Worcester Polytechnic Institute |
| Country | United States |
| Start Date | Jun 01, 2021 |
| End Date | May 31, 2023 |
| Duration | 729 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2104528 |
When training deep-learning and other machine-learning models, one would like to train in such a way that the generalization error of the trained model, that is, the error of the trained model when it is used to make predictions on a “test” dataset which was not used to train the model, is as low as possible. Training algorithms with good generalization properties can lead to machine-learning models that are more robust to changes in the dataset, allow for robust predictions, and help mitigate algorithmic bias when the training dataset may not be fully representative of the diversity of the population dataset.
Such algorithms can also lead to more stable training in settings such as distributed training and online learning. In practice, the choice of optimization algorithm that one uses to train the model can greatly affect both its training error and generalization error. Unfortunately, there is a lack of optimization algorithms with provable guarantees on the generalization error.
This makes it difficult to design algorithms which provably achieve both a fast running time and low generalization error. The aim of this project is to design novel algorithms for training deep-learning and other machine-learning models, and to prove guarantees on the running time, generalization error and related robustness properties of these algorithms.
To design and analyze such algorithms, this project brings together ideas from different areas of mathematics and computer science.
This project is designing novel optimization and Markov-chain sampling algorithms, for training deep-learning models as well as other machine-learning models. It aims to prove guarantees on the generalization error and related robustness properties of these algorithms, and also to provide fast running-time guarantees. Guaranteeing a low generalization error is especially challenging in deep learning, since the number of trainable parameters is oftentimes much larger than the size of the dataset, and the loss function used to train the model is nonconvex.
To prove stronger generalization and related robustness guarantees, the project team uses ideas from manifold learning and differential geometry to model the low-dimensional structure of datasets which arise in many machine learning applications. The project has three components. One component is to design and analyze novel optimization algorithms for training deep learning models.
Another component is to design and analyze algorithms for multi-agent optimization problems, such as the min-max optimization problems which arise when training generative adversarial nets (GANs) as well as multi-agent optimization problems which arise when training meta-learning models. Finally, in addition to optimization algorithms, it is also designing and analyzing Markov-chain sampling algorithms and related algorithms which are used to train Bayesian machine learning models.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Worcester Polytechnic Institute
Complete our application form to express your interest and we'll guide you through the process.
Apply for This Grant