| Field | Value |
|---|---|
| Funder | Engineering and Physical Sciences Research Council |
| Recipient Organization | University of Oxford |
| Country | United Kingdom |
| Start Date | Sep 30, 2024 |
| End Date | Mar 30, 2028 |
| Duration | 1,277 days |
| Number of Grantees | 2 |
| Roles | Student; Supervisor |
| Data Source | UKRI Gateway to Research |
| Grant ID | 2922634 |
Autonomous systems are deployed globally in a wide range of applications, from assistive robotic systems in healthcare through to automated instrumentation and control in process plants. The ever-increasing deluge of data and the advancement of machine learning methods have made such autonomous systems more pervasive within society. As such, the need to design autonomous methods that can perform complex tasks in dynamic environments in a safe and controlled manner is of growing
importance. Reinforcement Learning (RL) is the main paradigm by which autonomous systems learn and operate. Within this paradigm, environments are mathematically modelled as Markov Decision Processes (MDPs) and RL algorithms act to synthesize policies that perform sequential decision making within those MDPs.
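As a minimal, hypothetical sketch (not part of the grant itself), an MDP can be written down as states, actions, transition probabilities, and rewards, and a simple value-iteration loop then synthesizes a greedy policy. All states, probabilities, and rewards below are invented for illustration:

```python
# Toy 2-state, 2-action MDP; all numbers are made up for illustration.
STATES = [0, 1]
ACTIONS = [0, 1]

# P[s][a] = list of (next_state, probability)
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 0.8), (0, 0.2)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
# R[s][a] = immediate reward for taking action a in state s
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

def value_iteration(gamma=0.9, iters=200):
    """Compute state values and a greedy policy by value iteration."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {
            s: max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in ACTIONS
            )
            for s in STATES
        }
    policy = {
        s: max(
            ACTIONS,
            key=lambda a, s=s: R[s][a]
            + gamma * sum(p * V[s2] for s2, p in P[s][a]),
        )
        for s in STATES
    }
    return V, policy

V, pi = value_iteration()
print(pi)  # → {0: 1, 1: 1}: the higher-reward action is greedy in both states
```

A POMDP extends this model with an observation function, so the agent must act on a belief over states rather than the state itself; the policy synthesis loop above then operates over belief distributions instead of discrete states.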
In more complex systems, a Partially Observable MDP (POMDP) is often used as the mathematical model for policy synthesis. This project aims to advance the field of safe RL in the context of control architectures for environments that exhibit complex and stochastic dynamics. Namely, this encompasses the formulation of policies that adhere to a set of restrictions over the duration of the
decision-making process, such that autonomous agents are able to behave safely whilst performing their duties. In particular, this project aims to answer the following research questions:

1. What is the impact of applying formal methods to Bayesian RL algorithms on agent performance in POMDPs?
2. Can probabilistic model verification be integrated into the architecture, to enable automatic verification of whether the constraints have been satisfied?
3. Is it possible to apply such a framework to continuous state and/or continuous action environments, which better represent real-world systems?

Finding answers to these research questions should lead to the generation of ideas and models that can
be applied across a diverse range of applications to improve the feasibility and applicability of safe
autonomous agents in the real world. The novelty of the project stems from the integration of formal methods and Bayesian methods into the RL paradigm. Research on the application of Bayesian methods to RL has demonstrated that this is a powerful approach to improving agent performance in POMDPs. The integration of formal methods would
enable the restriction of synthesized policies to obey pre-specified constraints defined in temporal logic. Finally, the novel incorporation of probabilistic model verification into the architecture would facilitate automatic checking that the synthesized policies do indeed satisfy the constraints described in a given
specification. This would result in a novel algorithmic framework from which autonomous agents can be trained to perform a given task safely, irrespective of the environment or task. This project falls within the following EPSRC research areas: Artificial Intelligence Technologies, Theoretical Computer Science, and Verification and Correctness. The project falls within the scope of the
"Artificial Intelligence Technologies" research area as it aims to develop new methodologies for RL, one of the fundamental machine learning paradigms for automated reasoning and planning. Further, the project
falls within the scope of the "Theoretical Computer Science" and "Verification and Correctness" research areas as it aims to integrate formal methods (linear temporal logic in particular) and probabilistic model
verification to establish a training framework that enforces the adherence of synthesized policies to pre-specified constraints. This project is supported by industrial collaboration with Airbus, who are interested in the application of safe RL in the context of space and other aeronautical domains.
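To illustrate the kind of check that probabilistic model verification automates, the sketch below estimates, by Monte Carlo simulation, the probability that a toy policy ever reaches an unsafe state within a bounded horizon, and compares it against a safety threshold. A probabilistic model checker such as PRISM would compute such a bounded reachability probability exactly from the model; the environment, policy, and numbers here are entirely hypothetical:

```python
import random

# Hypothetical sketch: estimate the probability that a policy violates a
# safety constraint (reaching an unsafe state within a bounded horizon),
# then check the estimate against a threshold. All dynamics are made up.

UNSAFE = {2}

def step(state, action, rng):
    """Toy stochastic dynamics: action 0 is 'safe', action 1 is 'risky'."""
    if action == 1 and rng.random() < 0.05:
        return 2  # the risky action occasionally leads to the unsafe state
    return (state + action) % 2

def policy(state):
    """A deliberately conservative policy: always take the safe action."""
    return 0

def violates(horizon, rng):
    """Simulate one episode; report whether an unsafe state was visited."""
    state = 0
    for _ in range(horizon):
        state = step(state, policy(state), rng)
        if state in UNSAFE:
            return True
    return False

def estimate_violation_prob(n_runs=2000, horizon=50, seed=0):
    """Monte Carlo estimate of the bounded-horizon violation probability."""
    rng = random.Random(seed)
    return sum(violates(horizon, rng) for _ in range(n_runs)) / n_runs

p_hat = estimate_violation_prob()
print(p_hat <= 0.01)  # does the policy meet a 1% violation bound?
```

In the framework the project describes, this check would run over policies synthesized by the constrained RL training loop rather than a hand-written one, with the constraint expressed in temporal logic instead of a hard-coded unsafe set.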