| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | Colorado State University |
| Country | United States |
| Start Date | Sep 01, 2023 |
| End Date | Aug 31, 2026 |
| Duration | 1,095 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2245492 |
The collection and analysis of microbiome data have broad implications for furthering our understanding of human health and performance, agriculture, and ecology, among other areas. Human microbiome research, for example, aims to better understand the role of our microbial communities and how they interact with their host, respond to their environment, and influence disease.
Microbiome data are compositional, because the sum of the microbial taxa reads is fixed, and high-dimensional. They are also zero-inflated: typically more zero reads are observed than expected, which has profound implications for modeling and inference. This project aims to advance statistical methods and computational algorithms for the analysis of zero-inflated multivariate compositional count data.
While developed to address the current challenges of microbiome data analysis, the methods will be generally applicable to other settings in which multivariate compositional count data with excess zeros are observed, including biomedical and public health research, econometrics, and ecology. The project will additionally provide educational and professional training and mentoring to graduate students.
Analyzing multivariate count data generated by high-throughput sequencing technology in omics research is challenging due to the high-dimensional and compositional structure of the data, over-dispersion, and potential zero inflation. In practice, researchers often use the Dirichlet-multinomial (DM) distribution and its variants to model these data. However, under the assumptions of a DM model, estimated probabilities for zero counts are strictly positive even if the true probability of occurrence is zero.
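The last point can be illustrated with a minimal sketch. It assumes the standard conjugate result that, under a Dirichlet prior with concentration parameters alpha, the posterior mean of the multinomial probabilities is (count + alpha) / (total count + sum of alpha). The function name `dm_posterior_mean`, the example counts, and the prior value 0.5 are hypothetical choices for illustration, not part of the project.

```python
def dm_posterior_mean(counts, alpha):
    """Posterior mean of multinomial probabilities under a Dirichlet(alpha) prior.

    counts: observed read counts per taxon (non-negative integers)
    alpha:  Dirichlet concentration parameters (must be strictly positive)
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("Dirichlet parameters must be strictly positive")
    total = sum(counts) + sum(alpha)
    return [(x + a) / total for x, a in zip(counts, alpha)]


# Hypothetical sample: five taxa, two of which have zero reads.
counts = [120, 45, 0, 0, 35]
alpha = [0.5] * 5  # illustrative symmetric prior (an assumption)

probs = dm_posterior_mean(counts, alpha)
for taxon, (x, p) in enumerate(zip(counts, probs)):
    print(f"taxon {taxon}: count={x:4d}  estimated probability={p:.5f}")
```

Because every alpha is strictly positive, the taxa with zero reads still receive a small positive estimated probability, regardless of whether they are truly absent from the sample.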
This research project aims to develop a novel sparse DM (sDM) model that allows the probabilities associated with zero counts to be exactly zero, thereby accommodating potential zero inflation in multivariate compositional count data while simultaneously estimating the compositional probabilities. Additionally, this project will investigate extensions of the sDM modeling framework to high-dimensional variable selection and clustering problems, and it will contribute Markov chain Monte Carlo algorithms for posterior inference that will be made publicly available to practitioners and other researchers.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.