| Field | Value |
|---|---|
| Funder | National Science Foundation (US) |
| Recipient Organization | University of Wisconsin-Madison |
| Country | United States |
| Start Date | Apr 01, 2025 |
| End Date | Mar 31, 2026 |
| Duration | 364 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2520297 |
The significance of this I-Corps project lies in translating from lab to market a new method for validating artificial intelligence (AI) outputs, helping industries adopt trustworthy and reliable AI technologies. The solution generates calibrated confidence scores that indicate when decision-makers can rely on AI outputs. The fundamental issue this solution addresses is how to ensure AI produces trustworthy responses in industry applications, making AI systems safer and more practical to deploy across different industries.
The potential impact of commercializing this technology is wider adoption of AI tools in critical situations, enabling decision-makers to act more confidently, especially when time is critical. Bringing this advance to the marketplace could bridge the gap between AI's potential and its real-world use, helping ensure U.S. dominance in the AI industry worldwide.
This I-Corps project utilizes experiential learning coupled with a first-hand investigation of the industry ecosystem to assess the translation potential of a novel statistical framework for validating Large Language Model (LLM) outputs in settings where traditional benchmarking is impossible or unreliable due to the lack of data or the presence of human disagreement. The solution employs advanced statistical techniques to quantify uncertainty in AI-generated responses by analyzing patterns of agreement and disagreement between multiple model runs and between models and human experts.
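The abstract does not disclose the framework's actual statistics, but the idea of quantifying uncertainty from agreement across multiple model runs can be illustrated with a minimal, hypothetical self-consistency score: sample the model several times on the same prompt and use the fraction of runs agreeing with the modal answer as a rough confidence proxy. The function name and data are illustrative assumptions, not the project's method.

```python
from collections import Counter

def agreement_confidence(responses):
    """Return the modal answer and the fraction of runs that agree with it.

    A simple self-consistency proxy (illustrative only): repeated model
    runs that converge on the same answer suggest higher reliability.
    """
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    answer, top = counts.most_common(1)[0]
    return answer, top / len(responses)

# Five hypothetical runs of an LLM on the same prompt:
runs = ["B", "B", "B", "A", "B"]
answer, confidence = agreement_confidence(runs)
print(answer, confidence)  # B 0.8
```

The same agreement logic extends naturally to model-versus-expert comparisons by treating human labels as additional "runs".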
The approach differs from traditional model validation methods, which require a gold standard for comparison and report only an average accuracy level rather than a score individualized to each output. By generating calibrated confidence scores that correlate with accuracy, the technology enables users to make informed decisions about when to trust AI outputs, even in high-stakes domains requiring significant human judgment, such as healthcare and finance.
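"Calibrated" here means that among outputs assigned, say, 80% confidence, roughly 80% should in fact be correct. One standard way to check this (a generic sketch, not the project's validation procedure) is the expected calibration error: bin predictions by confidence and compare each bin's average confidence to its empirical accuracy.

```python
def calibration_error(confidences, correct, n_bins=5):
    """Expected calibration error (ECE): weighted average, over confidence
    bins, of |mean confidence - empirical accuracy| in each bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    ece, n = 0.0, len(confidences)
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(ok for _, ok in b) / len(b)
            ece += len(b) / n * abs(avg_conf - accuracy)
    return ece

# Hypothetical scores paired with whether each output was correct (1/0):
ece = calibration_error([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0])
```

A well-calibrated scorer drives this quantity toward zero, which is what lets a decision-maker treat the score as an actionable trust estimate.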
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.