| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | Carnegie-Mellon University |
| Country | United States |
| Start Date | Oct 01, 2024 |
| End Date | Sep 30, 2025 |
| Duration | 364 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2431265 |
In recent years, artificial intelligence (AI) has made major progress. Generative AI and, in particular, large language models (LLMs) have revolutionized how we think of AI. These methods are used broadly and are now also applied to program development.
Before LLMs, code fragments looked up on programming-related internet sites passed through human contributors, commenters, and curators, who ensured at least some minimum level of correctness. These guard rails are gone when using LLMs, as the system can provide no guarantees or insight into the correctness of the produced examples. Such LLM hallucination becomes a serious issue when LLMs are asked to provide examples of algorithms whose behavior is complex and hard to understand and whose correctness is not obvious.
In the worst-case scenario, generative AI strings together incorrect and outdated code snippets from undocumented sources into a reasonable-looking example that is wrong in a subtle way only experts could spot. Any effort that helps guard AI from acting outside its bounds, and that ensures one can trust AI, has a major scientific and societal impact. This project focuses on enabling trust when using AI as an aid to programmers.
This project develops an experimental approach to this problem for a class of important core numerical algorithms. The researchers take their inspiration from a well-understood insight in mathematics and computer science: verifying that a solution is correct is easier than finding one. In the context of LLMs, they investigate how to use LLMs to guess an implementation of an algorithm within clear implementation constraints.
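The verify-rather-than-find asymmetry can be made concrete with a toy guess-and-check, unrelated to SPIRAL itself: checking a claimed integer square root takes two multiplications, while computing one requires an actual algorithm. The function `verify_isqrt` below is a hypothetical illustration, not part of the project.

```python
import math

def verify_isqrt(n, r):
    # Verifying r = floor(sqrt(n)) needs only two multiplications,
    # whereas *finding* r requires an algorithm (Newton's method,
    # bisection, ...). The checker can accept an untrusted guess.
    return isinstance(r, int) and 0 <= r and r * r <= n < (r + 1) * (r + 1)

n = 10**20 + 12345
r = math.isqrt(n)          # stand-in for an untrusted "guessed" answer
assert verify_isqrt(n, r)          # correct guess accepted
assert not verify_isqrt(n, r + 1)  # wrong guess rejected
```

The same division of labor underlies the project: the untrusted generator proposes, and an independent, far simpler checker disposes.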
Then, they develop an extension to the SPIRAL system (www.spiral.net) that implements symbolic execution and theorem-proving capabilities to derive the semantics of the LLM-generated code using SPIRAL’s formal framework and engine. For the problems addressed by this project, the researchers aim to demonstrate how rule-based symbolic AI can bring guarantees to generative AI.
While deriving the formal semantics of a piece of numerical code may well be unsolvable in general, the problem becomes tractable for the class of algorithms that the SPIRAL system understands. The approach is a variant of lifting that leverages the detailed knowledge of numerical algorithms encoded in SPIRAL's rule-based AI components.
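Under the assumption that a generated kernel computes a linear transform (the setting where SPIRAL's transform rules apply), the lifting idea can be sketched numerically: probing a black-box kernel with standard basis vectors recovers the matrix it implements, which can then be compared against the reference transform. The names `candidate_dft4` and `lift_to_matrix` are hypothetical, and SPIRAL's actual derivation is symbolic and rule-based rather than this numerical probe.

```python
import numpy as np

def candidate_dft4(x):
    # Stand-in for LLM-generated code: a hand-unrolled 4-point DFT
    # (hypothetical example, not taken from the project).
    x = np.asarray(x, dtype=complex)
    t0, t1 = x[0] + x[2], x[0] - x[2]
    t2, t3 = x[1] + x[3], x[1] - x[3]
    return np.array([t0 + t2, t1 - 1j * t3, t0 - t2, t1 + 1j * t3])

def lift_to_matrix(f, n):
    # "Lift" a black-box linear kernel to an explicit matrix by
    # applying it to the standard basis vectors e_0, ..., e_{n-1}.
    return np.column_stack([f(np.eye(n)[:, k]) for k in range(n)])

# Reference semantics: the n-point DFT matrix F[j, k] = exp(-2*pi*i*j*k/n).
n = 4
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)

M = lift_to_matrix(candidate_dft4, n)
assert np.allclose(M, F)  # the candidate's semantics match the 4-point DFT
```

A symbolic engine like SPIRAL's replaces the floating-point comparison with an exact, rule-based equivalence proof, which is what turns this check into a guarantee.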
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.