Completed STANDARD GRANT National Science Foundation (US)

Doctoral Dissertation Research: Evaluating the Promise and Pitfalls of Benchmarking in Machine Learning Research

$200K USD

Funder	National Science Foundation (US)
Recipient Organization	University of California-Los Angeles
Country	United States
Start Date	Aug 01, 2021
End Date	Jul 31, 2023
Duration	729 days
Number of Grantees	2
Roles	Principal Investigator; Co-Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2124685`

Grant Description

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).

The scientific and commercial success of machine learning (ML) has spurred government and corporate sponsors to invest billions of dollars in machine learning research. Despite this massive investment, there is limited quantitative research on how the ML field measures progress: a process called “benchmarking.” Benchmarking is the act of comparing algorithms on a quantitative metric after training them on the same benchmark dataset.

Benchmarks organize ML researchers around common tasks. Achieving “state of the art” performance on an important benchmark can spark new research trajectories and advance careers: consider the 2012 success of “AlexNet” in a prominent computer vision task, which helped to launch current interest in deep learning. However, the practice of benchmarking has already engendered criticism that this near-ubiquitous research culture does not push the field towards socially beneficial outcomes, and leads to overinvestment in methods that maximize performance on academic datasets but are environmentally unsustainable or harm the public when used in the real world.

This dissertation research will provide a comprehensive analysis of the strengths and weaknesses of benchmarking practices with respect to several public aims: accelerating innovation in science, increasing equity within the field, and promoting ethical research (i.e., an orientation toward research that benefits society and avoids harms). By blending sociological analysis, computational methods for extracting and analyzing benchmarking data from thousands of papers, and in-depth qualitative interviews, this research will produce an understanding of benchmarking culture in ML research that combines breadth and quantitative rigor with depth and interpretive nuance.

This project has significant implications for government and corporate funders, researchers, and society more broadly.

The dissertation consists of three subprojects. The first subproject explores evidence that benchmarking culture has stymied innovation by favoring utilization of the same datasets across multiple tasks and by incentivizing researchers to underinvest on nascent benchmarks and overinvest on mature ones. The second subproject explores how patterns in the adoption of benchmarks and rewards for state-of-the-art performance interact with status and resources to create inequities in the field.

It tests the hypothesis that high-status researchers and institutions have disproportionate power to set the field’s research agenda by introducing benchmarks, while garnering disproportionate citations for state-of-the-art achievements. Both of these phenomena have the potential to create a “Matthew Effect” that disadvantages under-represented and under-resourced researchers/institutions.

These subprojects use network science, natural language processing, and manual coding to create a large dataset of benchmarks and progress on those benchmarks across multiple ML task communities. The third subproject consists of qualitative interviews with ML researchers across career stages and expertise to gain first-hand perspectives on benchmarking culture and assess reforms to improve research ethics and societal outcomes.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of California-Los Angeles

Interested in applying for this grant?

Complete our application form to express your interest and we'll guide you through the process.

Apply for This Grant

Doctoral Dissertation Research: Evaluating the Promise and Pitfalls of Benchmarking in Machine Learning Research

Grant Description

All Grantees

Interested in applying for this grant?

Quick Summary

Related Grants