| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | North Carolina State University |
| Country | United States |
| Start Date | Sep 01, 2023 |
| End Date | Aug 31, 2026 |
| Duration | 1,095 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2310208 |
Technological advances make it possible to collect enormous amounts of data. Implications for how businesses run (online retailing, precision manufacturing, social media), how science is conducted (environmental science, climate modeling, chemoinformatics, biotechnology, engineering), and how governments operate (health care, public safety, homeland security, national defense, agriculture production) are correspondingly enormous.
For many uses of massive data sets, not all of the available information is relevant. For example, of the estimated 100,000 human genes, often only a handful are relevant to understanding a particular disease and developing a cure (the challenge is identifying the handful of relevant genes). A key feature in many big-data explorations is the identification and deemphasis of superfluous information with the corresponding identification and accentuation of the most relevant information (separating the wheat from the chaff).
Fractional Ridge Regression (FRR) is designed to improve both prediction and interpretability of statistical analyses of large data sets relative to statistical methods currently in use. FRR improves the identification and extraction of relevant information from large data sets, thereby benefiting the many areas of business, science, and government policy that rely on the analysis and understanding of such data.
With nearly limitless applications, FRR research is ideal for engaging diverse statistics students in research projects. Computing algorithms and statistical software will make FRR available to researchers in all disciplines, thereby multiplying its potential benefits to education and diversity in numerous areas of data science. The investigator will identify sub-projects for undergraduate and graduate students with attention to student recruitment from under-represented groups.
Fractional ridge regression joins ridge regression and the lasso in the statistician's regression modeling toolbox. Ridge regression was introduced by Hoerl and Kennard in 1970 and twenty-six years later was followed by the introduction of the lasso by Tibshirani. The body of research ensuing from these seminal papers is staggering, and has contributed immensely to our understanding of shrinkage and selection methodology and to the practice of regression modeling in many areas of science.
In some applications of regression modeling the goal is simply to achieve the best possible predictions of future response values. In other applications, interpretation is important as a way to guide understanding of the process under investigation. Ridge regression is very good at prediction, although it is often eclipsed by the lasso in terms of both prediction and interpretation because the lasso also allows for selection.
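The contrast between ridge shrinkage and lasso selection can be illustrated with a minimal sketch. It assumes simulated data, a hand-picked penalty value, and NumPy only; the helper names (`ridge_fit`, `lasso_fit`, `soft_threshold`) are hypothetical and introduced here for illustration. It does not implement FRR, whose penalty function is not specified in this abstract.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

# Simulated data (an assumption for illustration):
# only the first 3 of 10 predictors affect the response.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.5 * rng.standard_normal(n)

ridge_beta = ridge_fit(X, y, lam=30.0)
lasso_beta = lasso_fit(X, y, lam=30.0)

print("ridge coefficients set exactly to zero:", int(np.sum(ridge_beta == 0.0)))
print("lasso coefficients set exactly to zero:", int(np.sum(lasso_beta == 0.0)))
```

In this sketch, ridge shrinks every coefficient but leaves all of them nonzero, whereas the soft-thresholding step in the lasso update can set coefficients exactly to zero, which is what makes variable selection possible.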
Fractional ridge regression (FRR) improves both prediction (measured by mean squared error) and interpretability (measured by variable selection specificity) relative to the lasso. FRR accomplishes these twin goals via a unique and clever penalty function that adaptively downweights only a data-driven subset of regression model coefficients (a fraction), while allowing the complementary subset of regression coefficients to vary freely in order to obtain an optimal fitted model.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.