| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | University of Massachusetts Amherst |
| Country | United States |
| Start Date | Aug 15, 2021 |
| End Date | Jul 31, 2025 |
| Duration | 1,446 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2113079 |
Scientists in many fields, including genetics, neuroscience, ecology, and economics, are obtaining richer measurements of more complex processes than ever before. This offers thrilling opportunities to answer scientific questions that were previously out of reach. For instance, measurements of disease prevalence collected at many locations over time provide the opportunity to estimate the spread of disease.
Latent variable models, which model the observed data as a simple transformation of unobserved latent random variables, are a popular approach to extracting answers to scientific questions from measurements of complex processes. They are flexible, but they can also be computationally prohibitive. As a result, scientists using latent variable models may have to settle for approximations of unknown quality or ad-hoc simplifications that provide poor estimates or fail to answer questions of interest.
This project aims to develop novel methods for fitting latent variable models that are more computationally efficient, reliable, and accessible. Involvement in the project will train statisticians at all levels, with a focus on statisticians from populations that are underrepresented in statistics research. Specifically, the investigator will supervise graduate student involvement in the research, guide the development of engaging outreach materials by undergraduate students, participate in outreach at local high schools, and lead writing groups for early-career faculty.
A variety of computational challenges arise when fitting latent variable models. It can be difficult to characterize and simulate from the conditional distribution of the latent variables given observed data, even when the few parameters that characterize the latent variable model are known. Furthermore, it can be difficult to estimate the unknown parameters of a latent variable model because the likelihood of the data corresponds to a high-dimensional integral for which a closed-form expression may be unavailable or expensive to evaluate.
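Both challenges can be written down in generic notation. The symbols here are assumptions introduced for illustration and do not come from the award abstract: $y$ denotes the observed data, $z$ the latent variables, and $\theta$ the model parameters.

$$
p(z \mid y, \theta) = \frac{p(y \mid z, \theta)\, p(z \mid \theta)}{p(y \mid \theta)},
\qquad
p(y \mid \theta) = \int p(y \mid z, \theta)\, p(z \mid \theta)\, dz .
$$

The left-hand expression is the conditional distribution one would like to simulate from, and the right-hand expression is the likelihood to be maximized over $\theta$. The integral runs over every latent variable, so its dimension grows with the size of $z$.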
Even when feasible methods are available, it can be difficult for practitioners to implement latent variable models without access to open-source software and detailed tutorials. Accordingly, this project aims to contribute (i) novel methods for simulating from the conditional distributions of high-dimensional latent variables, (ii) improved methods for maximum likelihood estimation of latent variable model parameters, and (iii) versatile statistical software that allows practitioners to implement these methods.
Regarding (i), the PI plans to develop novel pathwise methods for simulating from the conditional distributions of high-dimensional latent variables given data. These methods will leverage the relationship between the target conditional distribution and related or approximate distributions. Regarding (ii), the PI will develop improved methods for maximum likelihood estimation of latent variable model parameters that build on the pathwise simulation methods introduced in the first aim.
Regarding (iii), the PI will apply the new methods to disease mapping and genome-wide association studies and develop R packages that allow other practitioners to implement the methods.
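The abstract does not specify the pathwise methods to be developed under aim (i). As background only, one classical pathwise technique for Gaussian models is Matheron's rule, which turns a joint draw from the prior into a draw from the conditional distribution by adding a data-driven correction. The sketch below is a minimal illustration under stated assumptions, not the project's method: it assumes a zero-mean Gaussian prior `z ~ N(0, K)`, a linear observation model `y = A z + eps` with `eps ~ N(0, noise_var * I)`, and a hypothetical function name.

```python
import numpy as np


def pathwise_conditional_sample(K, A, noise_var, y, rng):
    """Draw one sample from p(z | y) using Matheron's rule.

    Assumed model (illustrative, not from the award abstract):
        z ~ N(0, K),   y = A z + eps,   eps ~ N(0, noise_var * I).
    """
    n = K.shape[0]
    m = A.shape[0]

    # Joint draw from the prior: latent variables and observation noise.
    # A tiny jitter keeps the Cholesky factorization numerically stable.
    chol = np.linalg.cholesky(K + 1e-10 * np.eye(n))
    z_prior = chol @ rng.standard_normal(n)
    eps = np.sqrt(noise_var) * rng.standard_normal(m)

    # Covariance of the simulated observations A z + eps.
    S = A @ K @ A.T + noise_var * np.eye(m)

    # Correct the prior draw by the mismatch between the real data and
    # the data simulated from that same prior draw.
    residual = y - (A @ z_prior + eps)
    return z_prior + K @ A.T @ np.linalg.solve(S, residual)


# Small usage example with made-up numbers.
rng = np.random.default_rng(0)
K = np.array([[1.0, 0.5], [0.5, 1.0]])
A = np.eye(2)
y = np.array([0.3, -1.2])
sample = pathwise_conditional_sample(K, A, 0.1, y, rng)
```

The appeal of this style of update is that it reuses a draw from a distribution that is easy to simulate (here, the prior) and avoids factorizing the conditional covariance directly. In this Gaussian case the corrected draw is an exact sample from the conditional distribution.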
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.