Completed STANDARD GRANT National Science Foundation (US)

CISE Core: CCF: SHF: Small: Future-Proof Test Corpus Synthesis for Evolving Software

$5.46M USD

Funder	National Science Foundation (US)
Recipient Organization	Carnegie-Mellon University
Country	United States
Start Date	Oct 01, 2021
End Date	Sep 30, 2025
Duration	1,460 days
Number of Grantees	1
Roles	Principal Investigator
Data Source	National Science Foundation (US)
Grant ID	`2120955`

Grant Description

Modern software is complex and continuously evolving. For every small change to the code, there is a risk of introducing unintended consequences that can affect the software's correctness, security, and performance. To guard against such issues, known as regression bugs, developers must test their software on a diverse suite of program inputs after every code change.

However, hand-crafting such test inputs risks missing important corner cases. This research is developing techniques for automatically generating test inputs that guard against future regressions. The research will focus on automatically synthesizing test inputs that are easy to maintain, quick to execute, and robust at detecting faults introduced by small code changes.

The technology developed in this project is intended to help improve the reliability of critical software systems, cut down energy usage during development, and reduce technical debt. Furthermore, this research is also contributing to the investigator's ongoing efforts in developing reusable course material for undergraduate computer science education, in particular, by incorporating the automatic test-input generation technology in classroom programming assignments.

The project activities themselves will also provide research experience opportunities for a diverse cohort of undergraduate students.

Randomized test-input generation techniques such as grey-box fuzzing have been successful at uncovering critical bugs and security issues in widely used software. However, conventional fuzz testing requires generating billions of test inputs using hundreds of CPU-hours in order to be effective, which is impractical for continuously validating code changes.
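
For readers unfamiliar with the term, the following is a minimal sketch of a grey-box (coverage-guided) fuzzing loop, not the project's tooling. The `target` function, its branch labels, and the `mutate` strategy are all hypothetical stand-ins chosen for illustration; real fuzzers obtain coverage through instrumentation of the program under test.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical program under test: returns the branch IDs it executed."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("b2")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("b3")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Illustrative mutation: overwrite one random byte, or append a byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.7:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Keep a mutated input only if it reaches a branch not seen before."""
    corpus = [seed]
    seen = set(target(seed))
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = mutate(parent, rng)
        coverage = target(child)
        if not coverage <= seen:  # the child exercised something new
            seen |= coverage
            corpus.append(child)
    return corpus, seen

if __name__ == "__main__":
    corpus, seen = fuzz(b"", 20000, random.Random(0))
    print(len(corpus), sorted(seen))
```

Most generated inputs are discarded because they add no new coverage, which is one reason the number of executions grows so large in practice.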

This project shifts the focus of fuzz-testing research towards generating a reusable corpus of regression test inputs, in order to support software evolution. The research is focusing on optimizing the quality of the generated test inputs along three dimensions. First, an iterative ensemble fuzzing technique is being developed for synthesizing test inputs that are concise by construction.

Second, mutation analysis is being used to guide fuzzing towards synthesizing test inputs that are robust at detecting faults due to small code changes. Third, language modeling techniques are being used to learn common patterns in human-authored test inputs. The models are being used to develop novel fuzzing algorithms that can synthesize natural-looking test inputs that are easier to maintain as the software evolves.
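
As a rough illustration of how mutation analysis can act as a feedback signal, the sketch below keeps a candidate test input only if it "kills" (produces a different output on) a mutant that no previously kept input kills. This is an assumed simplification, not the project's algorithm: the `original` function, the three hand-written mutants, and the candidate list are hypothetical, and real mutation tools generate mutants automatically.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

def original(x: int) -> str:
    """Hypothetical program under test."""
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"

def mutant_lt_to_le(x: int) -> str:
    """Small change: `x < 0` becomes `x <= 0`."""
    if x <= 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"

def mutant_eq_to_ne(x: int) -> str:
    """Small change: `x == 0` becomes `x != 0`."""
    if x < 0:
        return "negative"
    if x != 0:
        return "zero"
    return "positive"

def mutant_neg_label(x: int) -> str:
    """Small change: the negative branch returns the wrong label."""
    if x < 0:
        return "positive"
    if x == 0:
        return "zero"
    return "positive"

MUTANTS: Dict[str, Callable[[int], str]] = {
    "lt_to_le": mutant_lt_to_le,
    "eq_to_ne": mutant_eq_to_ne,
    "neg_label": mutant_neg_label,
}

def killed_by(x: int) -> Set[str]:
    """Names of mutants whose output differs from the original on input x."""
    expected = original(x)
    return {name for name, mutant in MUTANTS.items() if mutant(x) != expected}

def select_inputs(candidates: Iterable[int]) -> Tuple[List[int], Set[str]]:
    """Keep an input only if it kills a mutant that no kept input kills yet."""
    kept: List[int] = []
    killed: Set[str] = set()
    for x in candidates:
        new_kills = killed_by(x) - killed
        if new_kills:
            kept.append(x)
            killed |= new_kills
    return kept, killed

if __name__ == "__main__":
    kept, killed = select_inputs([5, 7, 0, -3, -1])
    print(kept, f"mutation score: {len(killed)}/{len(MUTANTS)}")
```

With these candidates, the inputs 7 and -1 are dropped because they kill no mutant beyond those already killed by 5 and -3, so the retained corpus stays small while still detecting every seeded fault.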

The results of this research are being disseminated in the form of open-source tools and publications that are intended to help software developers reduce maintenance costs and ultimately deploy more reliable software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Carnegie-Mellon University
