Completed STANDARD GRANT National Science Foundation (US)

FMitF: Track I: Synthesis and Verification for Programmatic Reinforcement Learning

$7.5M USD

Funder	National Science Foundation (US)
Recipient Organization	Rutgers University New Brunswick
Country	United States
Start Date	Oct 01, 2021
End Date	Sep 30, 2025
Duration	1,460 days
Number of Grantees	2
Roles	Principal Investigator; Co-Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2124155`

Grant Description

Deep reinforcement learning (RL) has recently gained tremendous popularity in numerous decision-making systems. However, deploying RL techniques in safety-critical domains such as autonomous driving has drawn concerns about its trustworthiness. RL applications can behave unexpectedly because such systems are characterized by intractably large state spaces and only a small fraction of states are explored during training.

Lack of trust in a deep neural RL policy is further exacerbated by the difficulty of identifying how logical notions of safety relate to the uninterpretable policy network structure. The project's novelty is to explore domain-specific programs as a new learning representation for trustworthy RL. It allows an RL system to be interpreted, formally verified, and debugged as an ordinary programming system.

The project's impact is to shape new methodologies to design highly interpretable and verifiable RL systems that are capable of making reliable decisions.

The main contribution of the project is a programmatic RL framework for synthesizing policies that meet both formal correctness specifications and quantitative performance objectives. Firstly, the project develops program synthesis techniques to compose primitive RL policies into composite and interpretable programs to solve sophisticated RL environments.

To explore novel compositions, it applies efficient policy gradient methods to search in a continuous relaxation of the discrete space of language grammar rules. Secondly, the project reconciles programmatic policy synthesis with program verification in a correct-by-construction RL loop, constraining policy performance optimization in provably safe state space.

Moreover, this project investigates the transferability of programmatic RL by lifting the synthesis and verification framework to first-order relational environments, leveraging relational abstractions to transfer programmatic policies to unseen environments with formal guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Rutgers University New Brunswick

Interested in applying for this grant?

Complete our application form to express your interest and we'll guide you through the process.

Apply for This Grant

FMitF: Track I: Synthesis and Verification for Programmatic Reinforcement Learning

Grant Description

All Grantees

Interested in applying for this grant?

Quick Summary

Related Grants