Status Active
Grant Type Standard Grant

Collaborative Research: CIF: Small: Inverse Reinforcement Learning with Heterogeneous Data: Estimation Algorithms with Finite Time and Sample Guarantees

$3M USD

Funder National Science Foundation (US)
Recipient Organization University of Minnesota-Twin Cities
Country United States
Start Date Dec 01, 2024
End Date Nov 30, 2027
Duration 1,094 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2414372
Grant Description

Learning a structural model of dynamic decision-making helps us better understand and predict how agents, whether human or machine, make decisions over time in changing environments. Instead of just copying actions, this approach allows us to capture both the agent’s goals (preferences) and how it understands the world (environment dynamics). This provides a much deeper insight into behavior, enabling predictions about how the agent would act in new or unseen situations.

Such models are valuable because they can help improve decision-making systems, allowing them to adapt and make reliable choices in complex real-world scenarios, such as personalized AI assistants, autonomous systems, or decision support tools. There is an urgent need for models and algorithms that can create such structural frameworks. The outcomes of this project will have broad applications, including areas like control systems, natural language processing, and autonomous driving.

Moreover, these efforts offer valuable opportunities to enhance the optimization and reinforcement learning curriculum, engaging students from diverse backgrounds in cross-disciplinary research and K-12 outreach initiatives.

This project develops machine learning models of an agent’s dynamic decisions subject to structural constraints on observed behavior. Specifically, the agent’s observed behavior (data) is modeled as being consistent with the inter-temporal optimization of a reward function (preferences) given a representation of how the environment evolves pursuant to control actions (dynamics).

Unlike behavioral cloning models, a structural model of observed control behavior is a solid basis for performing counterfactual analysis and/or transfer learning. However, developing structural models of control is computationally challenging, and the statistical properties of structural estimators are difficult to characterize. This project aims to advance the state of the art in methodologies for learning structural models of control by considering diverse data (including demonstrations and preferences) and both online and offline settings.
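To make the distinction concrete, the following minimal sketch (not the project's actual algorithm; the chain MDP, the softmax expert model, and the grid search over candidate rewards are all illustrative assumptions) shows the structural idea: treat demonstrated actions as consistent with inter-temporal optimization of an unknown reward under known dynamics, then recover the reward that best explains the data.

```python
# Hedged toy illustration of structural inverse reinforcement learning.
# Setup (assumed, not from the grant): a 3-state chain MDP with move
# left/right actions; the expert is modeled as softmax-optimal for some
# unknown reward. We recover the reward by maximizing the likelihood of
# the demonstrated actions over a small set of candidate reward vectors.
import math

N_STATES, ACTIONS, GAMMA = 3, (-1, +1), 0.9  # chain MDP, discount factor

def step(s, a):
    """Deterministic dynamics: move along the chain, clipped at the ends."""
    return min(max(s + a, 0), N_STATES - 1)

def q_values(reward):
    """Value iteration: returns Q[s][action_index] for the given reward."""
    V = [0.0] * N_STATES
    for _ in range(200):
        Q = [[reward[s] + GAMMA * V[step(s, a)] for a in ACTIONS]
             for s in range(N_STATES)]
        V = [max(q) for q in Q]
    return Q

def log_likelihood(demos, reward, beta=5.0):
    """Log-probability of (state, action_index) demos under a softmax policy."""
    Q = q_values(reward)
    ll = 0.0
    for s, ai in demos:
        z = sum(math.exp(beta * q) for q in Q[s])
        ll += beta * Q[s][ai] - math.log(z)
    return ll

# Demonstrations: the expert always moves right (action index 1), which is
# consistent with a reward concentrated on the rightmost state.
demos = [(0, 1), (1, 1), (2, 1)]
candidates = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
best = max(candidates, key=lambda r: log_likelihood(demos, r))
print(best)  # -> [0, 0, 1]: the reward on the rightmost state fits best
```

Because the estimate is a reward plus dynamics rather than a state-to-action lookup table, it supports the counterfactual and transfer questions mentioned above: re-solving the same reward under altered dynamics predicts behavior in situations never demonstrated, which behavioral cloning cannot do.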

Finally, extensive experiments will be conducted to evaluate and apply the proposed methodologies in aligning large language models (LLMs), and in autonomous driving.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of Minnesota-Twin Cities
