Active CONTINUING GRANT National Science Foundation (US)

CAREER: From Analysis to Practice: Landscape-driven Optimization Algorithms for Deep Learning

$5.33M USD

Funder National Science Foundation (US)
Recipient Organization New York University
Country United States
Start Date Mar 15, 2021
End Date Feb 28, 2026
Duration 1,811 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2041872
Grant Description

Deep learning (DL) is a major driving force of the tech industry, where it is used for a plethora of problems such as image, speech, and video recognition, image segmentation, and natural language processing. DL is also increasingly used in physics, medicine, and chemistry, among other disciplines. Training a DL model in any of these applications requires solving a mathematical problem whose properties are poorly understood.

Consequently, existing DL training methodologies are sub-optimal and consume a large amount of resources, time, and money. Our limited understanding of DL compromises the progress of all public and private sectors that rely on DL technology, and limits its deployment in new applications. This project aims to overcome this limitation by describing universal properties of DL systems that hold across a variety of DL models and data sets.

The acquired knowledge will be used to develop a new generation of training strategies that are tailored to the DL setting and are efficient, accurate, and scalable. New algorithmic tools will have a strong impact on a wide range of applications and can be leveraged by US public and private entities to shift to significantly more powerful computational learning platforms in various areas of their AI-based businesses that require the processing of large and complex data.

Broader impact activities of this project include (a) graduate and undergraduate curriculum development, (b) summer research opportunities for high-school students via the NYU Applied Research Innovations in Science and Engineering program, and (c) knowledge popularization via the NYU Tandon ECE Seminar Series on Modern AI, which is organized by the investigator, open to universities, high schools, and industry, and streamed worldwide.

The proposed research takes a multi-level approach to exploring the principles of DL optimization and generalization, and to developing new-generation DL optimization tools. First, the researchers will seek to understand the relationship between the geometric properties of the non-convex DL loss landscape and the generalization abilities of DL models. Next, the researchers will characterize the training trajectories of common DL optimizers.

These studies will be essential for developing landscape-aware DL optimizers. The obtained optimizers will be parallelized in order to be compatible with the architecture of the computer clusters that are typically used to train large-scale DL networks on massive data. The new parallel optimizers will accommodate dynamic allocation of computational resources during training and will be able to process extremely large data batches.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

New York University
