Completed STANDARD GRANT National Science Foundation (US)

Collaborative Research: Loopholes as a window into the learning of meaning

$3.72M USD

Funder	National Science Foundation (US)
Recipient Organization	Harvard University
Country	United States
Start Date	Sep 01, 2021
End Date	Aug 31, 2024
Duration	1,095 days
Number of Grantees	2
Roles	Principal Investigator; Co-Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2118096`

Grant Description

Intelligent machines could help achieve major human goals. But even current state-of-the-art machines can catastrophically misunderstand what they were asked to do, resulting in machines that 'do what you asked, but not what you want'. In contrast to these failures of human-machine interactions, from an early age humans can quickly and efficiently communicate their goals, and find ways to cooperate and help.

But when people's values do not align, they find ‘loopholes’ to avoid cooperating or complying. Loopholes offer a unique window into the successful but opaque commonsense process of goal understanding. While loopholes are a pervasive everyday concern with real-world implications, there is little computational or cognitive research examining this phenomenon.

This project aims to study the mental processes that allow humans to intuitively and purposefully contort communication in loophole behavior. This research will help tackle central open challenges in the design of safe intelligent machines and human-technology interactions, and will improve our understanding of the emergence of social interactions.

Previous research has focused on how children learn to communicate socially and negotiate values, but not on how children and adults handle and exploit value misalignment. This raises a crucial question for cognitive science and human-machine interactions: how do people learn to go from ambiguous communication to the alignment of intended goals, plausible alternatives, and one’s own values?

Studies of development are particularly important in answering this question, as the developmental trajectory sheds light on which processes are foundational to this ability and which are brought in piecemeal with greater knowledge and experience. The project combines methods from AI, computational cognitive science, and social cognitive development, and will (1) characterize the emergence and scope of loopholes in the wild with large open databases using citizen-science and public data, (2) build a formal framework informed by the data for modeling the interpretation and (mis)alignment of social goals from sparse statements, (3) validate the framework using controlled experiments with diverse populations to study the evaluation of loophole-seeking from childhood to adulthood, and (4) extend this framework with novel studies on the inferred goals of machines in human-machine interactions.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Harvard University

