| Funder | Engineering and Physical Sciences Research Council |
|---|---|
| Recipient Organization | Heriot-Watt University |
| Country | United Kingdom |
| Start Date | Sep 10, 2023 |
| End Date | Aug 30, 2027 |
| Duration | 1,450 days |
| Number of Grantees | 2 |
| Roles | Student; Supervisor |
| Data Source | UKRI Gateway to Research |
| Grant ID | 2934195 |
A significant challenge in robotics is the creation of autonomous agents that can generalise to new environments and situations. This is most readily seen in the reality gap, a longstanding problem in robotics whereby learning algorithms trained in simulation may not perform as expected when deployed in the real world [1]. There are several approaches to addressing this gap, including randomising the environment and agent, combining real and simulated data for training, using model priors, imitation learning, and real-world fine-tuning [1]. However, a reality gap may remain, leading to problematic behaviour, as when a Tesla accelerated itself into motorbikes and a pedestrian in Japan [2].

Researchers have tried many approaches to make learning more reliable and to develop agents robust to environmental changes. These fall into three general categories:

1. The incorporation of biologically inspired systems such as Cellular Automata (CAs) and Artificial Gene Regulatory Networks (AGRNs). These have exhibited very stable operation when transferred from simulated to real environments, even when starting conditions are varied [3].
2. Enabling more sophisticated reasoning by combining reinforcement learning (RL) with model-based planning, as in I2A [4] and MBVE [5], or with graphs, as in Sanchez-Gonzalez's approach [6]. These improve learning sample efficiency and, through a richer representation of their environment, operation with sparse rewards across long action sequences.
3. The use of transformers and large language models (LLMs) in robot controller architectures, as presented in recent work such as RT-1 [7] and SayCan [8]. These agents can compose long action sequences from natural language instructions and other tokenised inputs.