| Funder | Swedish Research Council |
|---|---|
| Recipient Organization | KTH Royal Institute of Technology |
| Country | Sweden |
| Start Date | Jan 01, 2025 |
| End Date | Dec 31, 2028 |
| Duration | 1,460 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | Swedish Research Council |
| Grant ID | 2024-05873_VR |
Automated synthesis of speech-driven 3D gesture motion is a key technology for virtual agents, social robots, video game characters, and the metaverse.
However, gesture-synthesis models are lagging behind recent breakthroughs in generative models of text, images, and speech, due to a scarcity of 3D motion-capture (mocap) data.
Recording such data is costly, and the limited public datasets often use incompatible standards, complicating data integration. This project proposes an innovative approach to bypass the motion-data bottleneck by instead leveraging human preference data from gesture-motion evaluations.
Such data is only just becoming available, but has hitherto gone essentially unused by the field, despite the vital role that human-preference data (integrated, e.g., via reinforcement learning from human feedback) plays in models like GPT-4. Our research agenda will leverage human evaluation data to advance both automated assessment and synthesis of 3D gesture motion.
The outcome of our research can have a substantial impact on the field: improved gesture-assessment metrics could reduce or eliminate the need for manual inspection of output gestures during system development, while integrating human preference data into model building may allow motion synthesis to surpass the quality of its mocap training material.
We will also use innovative techniques such as model merging to extend our findings to additional datasets and augment emerging foundation models of motion.
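To make the core idea concrete, the following is a minimal, hypothetical sketch of learning a gesture-quality reward model from pairwise human preferences using the Bradley-Terry model, the same formulation that underlies RLHF-style preference learning. All data, features, and dimensions here are synthetic illustrations, not the project's actual datasets or methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic features for 200 gesture clips (e.g., summary motion statistics).
n_clips, n_feat = 200, 8
X = rng.normal(size=(n_clips, n_feat))

# Hidden "true" quality, used only to simulate human preference labels.
w_true = rng.normal(size=n_feat)
quality = X @ w_true

# Pairwise comparisons: each row (i, j) means a rater compared clips i and j.
pairs = rng.integers(0, n_clips, size=(1000, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]
prefer_first = quality[pairs[:, 0]] > quality[pairs[:, 1]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit reward weights w by gradient ascent on the Bradley-Terry
# log-likelihood: P(i preferred over j) = sigmoid(r(i) - r(j)),
# with a linear reward r(x) = w @ x.
w = np.zeros(n_feat)
lr = 0.05
for _ in range(500):
    diff = X[pairs[:, 0]] - X[pairs[:, 1]]   # feature differences per pair
    p = sigmoid(diff @ w)                    # predicted preference probability
    grad = diff.T @ (prefer_first.astype(float) - p) / len(pairs)
    w += lr * grad                           # ascent step on the likelihood

# The learned reward should rank clips consistently with the simulated raters.
pred = (X[pairs[:, 0]] - X[pairs[:, 1]]) @ w > 0
accuracy = np.mean(pred == prefer_first)
print(f"pairwise agreement: {accuracy:.2f}")
```

A reward model of this kind can then serve two roles the abstract describes: as an automatic gesture-assessment metric, and as a training signal for fine-tuning a synthesis model on human preferences.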