| Field | Value |
|---|---|
| Funder | National Science Foundation (US) |
| Recipient Organization | University of Wisconsin-Madison |
| Country | United States |
| Start Date | Apr 01, 2025 |
| End Date | Mar 31, 2026 |
| Duration | 364 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2520297 |
The significance of this I-Corps project lies in translating from lab to market a new method for validating artificial intelligence (AI) outputs, helping industries adopt trustworthy and reliable AI technologies. The solution generates calibrated confidence scores that indicate when decision-makers can rely on AI outputs. The fundamental issue this solution addresses is how to ensure AI produces trustworthy responses in industry applications, making AI systems safer and more practical to deploy across different industries.
The potential impact of commercializing this technology is wider adoption of AI tools in critical situations, enabling decision-makers to act more confidently, especially when time is critical. Bringing this advance to the marketplace could bridge the gap between AI's potential and its real-world use, helping ensure U.S. dominance in the AI industry worldwide.
This I-Corps project utilizes experiential learning coupled with a first-hand investigation of the industry ecosystem to assess the translation potential of a novel statistical framework for validating Large Language Model (LLM) outputs in settings where traditional benchmarking is impossible or unreliable due to the lack of data or the presence of human disagreement. The solution employs advanced statistical techniques to quantify uncertainty in AI-generated responses by analyzing patterns of agreement and disagreement between multiple model runs and between models and human experts.
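The abstract does not disclose the framework's actual statistics, but the idea of quantifying uncertainty from agreement across multiple model runs can be illustrated with a minimal, hypothetical self-consistency score: sample the model several times on the same prompt and use the fraction of runs agreeing with the modal answer as a rough confidence proxy. The function name and data are illustrative assumptions, not the project's method.

```python
from collections import Counter

def agreement_confidence(responses):
    """Return the modal answer and the fraction of runs that agree with it.

    A simple self-consistency proxy (illustrative only): repeated model
    runs that converge on the same answer suggest higher reliability.
    """
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    answer, top = counts.most_common(1)[0]
    return answer, top / len(responses)

# Five hypothetical runs of an LLM on the same prompt:
runs = ["B", "B", "B", "A", "B"]
answer, confidence = agreement_confidence(runs)
print(answer, confidence)  # B 0.8
```

The same agreement logic extends naturally to model-versus-expert comparisons by treating human labels as additional "runs".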
The approach differs from traditional model validation methods, which require a gold standard for comparison and report only an average accuracy level rather than a score individualized to each output. By generating calibrated confidence scores that correlate with accuracy, the technology enables users to make informed decisions about when to trust AI outputs, even in high-stakes domains requiring significant human judgment, such as healthcare and finance.
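"Calibrated" here means that among outputs assigned, say, 80% confidence, roughly 80% should in fact be correct. One standard way to check this (a generic sketch, not the project's validation procedure) is the expected calibration error: bin predictions by confidence and compare each bin's average confidence to its empirical accuracy.

```python
def calibration_error(confidences, correct, n_bins=5):
    """Expected calibration error (ECE): weighted average, over confidence
    bins, of |mean confidence - empirical accuracy| in each bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    ece, n = 0.0, len(confidences)
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(ok for _, ok in b) / len(b)
            ece += len(b) / n * abs(avg_conf - accuracy)
    return ece

# Hypothetical scores paired with whether each output was correct (1/0):
ece = calibration_error([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0])
```

A well-calibrated scorer drives this quantity toward zero, which is what lets a decision-maker treat the score as an actionable trust estimate.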
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.