Active STANDARD GRANT National Science Foundation (US)

Trustworthy Reinforcement Learning for Online Decision Making

$4.5M USD

Funder	National Science Foundation (US)
Recipient Organization	Purdue University
Country	United States
Start Date	Aug 15, 2022
End Date	Jul 31, 2026
Duration	1,446 days
Number of Grantees	2
Roles	Principal Investigator; Co-Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2217440`

Grant Description

This research project will advance the frontiers of modern reinforcement learning for online decision-making problems. Reinforcement learning deals with how intelligent agents ought to take actions in an uncertain environment in order to maximize the cumulative reward. It has achieved phenomenal success in diverse business and scientific fields.

However, less attention has been paid to the trustworthy aspects of reinforcement learning. This project will investigate different aspects of trustworthy issues like robustness, fairness, causality, and explainability in important online decision-making tasks. The results of this research will benefit many different fields such as statistics, machine learning, operations research, marketing, economics, and finance.

Open-source software will be developed to provide applied researchers with cutting-edge tools. The project will recruit students, especially those from unrepresented groups, to be involved in the research and will develop new courses on statistical reinforcement learning and decision making.

This research project will focus on three interconnected trustworthy reinforcement learning methods for online decision-making problems: dynamic pricing, dynamic assortment selection, and matching in two-sided markets. Issues of robustness, fairness, causality, and explainability will be addressed in these decision-making tasks, which will advance the exploration techniques used in existing reinforcement learning algorithms.

The project will develop new theoretical tools to analyze the statistical properties of these modern reinforcement learning algorithms. Regret upper bounds and matching lower bound will be thoroughly investigated. One important goal of online decision making is to identify an optimal policy that maximizes the overall gain, based on the contextual information and historical interactions with the environment.

Due to the complex nature of such problems, there is a high demand for trustworthy tools for learning optimal personalized policy in various settings. The knowledge gained from this research will benefit learning in online auctions and other complex market design problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Purdue University

Interested in applying for this grant?

Complete our application form to express your interest and we'll guide you through the process.

Apply for This Grant

Trustworthy Reinforcement Learning for Online Decision Making

Grant Description

All Grantees

Interested in applying for this grant?

Quick Summary

Related Grants