
Completed STANDARD GRANT National Science Foundation (US)

FMitF: Track I: Synthesis and Verification for Programmatic Reinforcement Learning

$7.5M USD

Funder National Science Foundation (US)
Recipient Organization Rutgers University New Brunswick
Country United States
Start Date Oct 01, 2021
End Date Sep 30, 2025
Duration 1,460 days
Number of Grantees 2
Roles Principal Investigator; Co-Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2124155
Grant Description

Deep reinforcement learning (RL) has recently become popular in many decision-making systems. However, deploying RL techniques in safety-critical domains such as autonomous driving has raised concerns about their trustworthiness. RL applications can behave unexpectedly because their state spaces are intractably large and only a small fraction of states is explored during training.

Lack of trust in a deep neural RL policy is further exacerbated by the difficulty of relating logical notions of safety to the uninterpretable structure of the policy network. The project's novelty lies in exploring domain-specific programs as a new learning representation for trustworthy RL. This representation allows an RL system to be interpreted, formally verified, and debugged like an ordinary program.

The project's intended impact is to shape new methodologies for designing highly interpretable and verifiable RL systems that make reliable decisions.

The main contribution of the project is a programmatic RL framework for synthesizing policies that meet both formal correctness specifications and quantitative performance objectives. Firstly, the project develops program synthesis techniques that compose primitive RL policies into composite, interpretable programs for solving complex RL environments.
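As a rough illustration of what composing primitive policies into an interpretable program might look like, here is a minimal Python sketch. The tiny if-then-else construct, the one-dimensional cart state, and the names `IfThenElse`, `brake`, and `accelerate` are all assumptions made for this example; the grant description does not specify the project's actual domain-specific language.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]            # assumed layout: (position, velocity)
Policy = Callable[[State], float]  # maps a state to a scalar action


@dataclass(frozen=True)
class IfThenElse:
    """Composite policy: run `then_branch` when `cond(state)` holds, else `else_branch`."""
    cond: Callable[[State], bool]
    then_branch: Policy
    else_branch: Policy

    def __call__(self, state: State) -> float:
        branch = self.then_branch if self.cond(state) else self.else_branch
        return branch(state)


# Hypothetical primitive policies; in practice these could be learned controllers.
def brake(state: State) -> float:
    """Oppose the current velocity."""
    return -1.0 * state[1]


def accelerate(state: State) -> float:
    """Apply a constant forward action."""
    return 1.0


# The composite program reads like ordinary code: brake near the boundary, else accelerate.
program = IfThenElse(cond=lambda s: s[0] > 0.8, then_branch=brake, else_branch=accelerate)

print(program((0.9, 2.0)))  # position past 0.8, so the brake branch runs
print(program((0.1, 0.0)))  # otherwise the accelerate branch runs
```

Because the branching condition is explicit, a reader or a verification tool can inspect which primitive policy is active in which region of the state space, which is the kind of interpretability the description refers to.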

To explore novel compositions, the project applies efficient policy gradient methods to search a continuous relaxation of the discrete space of language grammar rules. Secondly, the project reconciles programmatic policy synthesis with program verification in a correct-by-construction RL loop, constraining policy performance optimization to a provably safe region of the state space.
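One way to picture a continuous relaxation over discrete grammar rules is to give each candidate rule a real-valued logit, mix the rules with softmax weights, and optimize the logits by gradient ascent before snapping back to a single rule. The sketch below is a toy version under stated assumptions: the three candidate rules, the one-step reward, the training states, and all function names are hypothetical, and it uses the exact gradient of a one-step reward instead of a sampled policy gradient estimate. It stands in for the general idea and is not the project's method.

```python
import math

# Hypothetical candidate rules for the action expression, given a scalar state s.
RULES = [
    ("a = -s", lambda s: -s),
    ("a = 0", lambda s: 0.0),
    ("a = s", lambda s: s),
]
STATES = [-1.0, -0.5, 0.5, 1.0]  # assumed training states


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_action(weights, s):
    """Continuous relaxation: a weighted mixture of every candidate rule."""
    return sum(w * rule(s) for w, (_, rule) in zip(weights, RULES))


def train_rule_logits(steps=500, lr=0.5):
    """Gradient ascent on the mean one-step reward; returns the final rule weights."""
    logits = [0.0] * len(RULES)
    for _ in range(steps):
        weights = softmax(logits)
        grads = [0.0] * len(RULES)
        for s in STATES:
            a = relaxed_action(weights, s)
            # Assumed one-step reward r = -(s + a)^2, so dr/da = -2 (s + a).
            dr_da = -2.0 * (s + a)
            for j, (_, rule) in enumerate(RULES):
                # For a softmax mixture, da/dlogit_j = w_j * (rule_j(s) - a).
                grads[j] += dr_da * weights[j] * (rule(s) - a) / len(STATES)
        logits = [l + lr * g for l, g in zip(logits, grads)]
    return softmax(logits)


weights = train_rule_logits()
# Discretize: keep only the rule with the largest weight.
best = max(range(len(RULES)), key=lambda j: weights[j])
print("selected rule:", RULES[best][0])
```

In this toy setting the reward favors actions that cancel the state, so the weight on `a = -s` grows at every step and the final discretization selects that rule. A real system would search over a much larger grammar and estimate gradients from sampled trajectories.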

Moreover, this project investigates the transferability of programmatic RL by lifting the synthesis and verification framework to first-order relational environments, leveraging relational abstractions to transfer programmatic policies to unseen environments with formal guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Rutgers University New Brunswick
