Active NON-SBIR/STTR RPGS NIH (US)

Data science tools to identify robust exposure-phenotype associations for precision medicine

$6.98M USD

Funder NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES
Recipient Organization Harvard Medical School
Country United States
Start Date Sep 10, 2021
End Date Jun 30, 2026
Duration 1,754 days
Number of Grantees 2
Roles Co-Investigator; Principal Investigator
Data Source NIH (US)
Grant ID 10095924
Grant Description

Project Summary/Abstract: Phenotypic variability across demographically diverse populations is driven by environmental factors.

The overall goal of this proposal is to deploy data science approaches to drive discovery of associations between exposures (E) and phenotypes (P) in demographically diverse populations.

We lack data science methods to associate, replicate, and prioritize exposure variables of the exposome (E) with phenotypes (P) and disease incidence (D), all of which are required for the delivery of precision medicine. Observational studies are fraught with four unsolved data science challenges.

First, (1) E-based studies are limited to associating a few hypothesized exposure-phenotype (E-P) pairs at a time, leading to a fragmented literature of environmental associations.

Second, machine learning (ML) approaches for feature selection and prediction hold promise; however, (2) most extant E-based cohorts contain missing data, which challenges the use of ML to detect complex E-P associations. Third, (3) biases such as confounding and study design influence associations and hinder translation.

Fourth, (4) there are few well-powered data resources that systematically document longitudinal E-P and E-D associations across massive precision medicine.

It is a challenge to systematically associate a number of exposures with multiple phenotypes and to replicate these associations across cohorts (Aim 1).
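As a rough illustration of what associating many exposures with many phenotypes can involve, the sketch below scans every exposure-phenotype pair and corrects for multiple testing. It is not the project's EP-WAS method, which the abstract does not specify. The function names (`pearson_r`, `fisher_p_value`, `benjamini_hochberg`, `ep_scan`), the toy variables, the use of Pearson correlation with a Fisher z approximation, and the Benjamini-Hochberg correction are all assumptions chosen for brevity.

```python
import math
import random
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for r using the Fisher z approximation (needs n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def ep_scan(exposures, phenotypes):
    """Test every exposure-phenotype pair; return rows sorted by q-value."""
    results = []
    for e_name, x in exposures.items():
        for p_name, y in phenotypes.items():
            r = pearson_r(x, y)
            results.append({"exposure": e_name, "phenotype": p_name,
                            "r": r, "p": fisher_p_value(r, len(x))})
    q_values = benjamini_hochberg([row["p"] for row in results])
    for row, q in zip(results, q_values):
        row["q"] = q
    return sorted(results, key=lambda row: row["q"])


# Toy data (hypothetical): "ldl" depends on "lead"; everything else is noise.
rng = random.Random(0)
lead = [rng.gauss(0, 1) for _ in range(200)]
other = [rng.gauss(0, 1) for _ in range(200)]
ldl = [2 * v + rng.gauss(0, 0.5) for v in lead]
bmi = [rng.gauss(0, 1) for _ in range(200)]

for row in ep_scan({"lead": lead, "other": other}, {"ldl": ldl, "bmi": bmi}):
    print(row["exposure"], row["phenotype"], round(row["r"], 3), row["q"])
```

A real analysis would also need covariate adjustment, handling of missing data, and replication in an independent cohort, none of which this sketch attempts.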

The "vibration of effects", or the degree to which associations change as a function of study design (e.g., analytic method, sample size) and model choice, is a hidden bias in observational studies (Aim 2). Third, an outstanding question is the degree to which environmental differences lead to health disparities.
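To make the vibration-of-effects idea concrete, here is a minimal sketch that refits a linear model for one exposure-phenotype pair under every subset of adjustment covariates and reports how much the exposure coefficient moves. It is an illustration under stated assumptions, not the project's VoE library: the function names, the toy variables (`age`, `sex`), the ordinary-least-squares model, and the min/max/sign-flip summary are all hypothetical choices.

```python
import random
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (no collinear predictors).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols_coefficients(columns, y):
    """Least-squares fit of y on the given predictor columns plus an intercept.

    Returns [intercept, coefficient of columns[0], coefficient of columns[1], ...].
    """
    design = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    return solve_linear_system(xtx, xty)


def vibration_of_effects(exposure, phenotype, adjusters):
    """Estimate the exposure coefficient under every subset of adjusters."""
    names = sorted(adjusters)
    estimates = {}
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            cols = [exposure] + [adjusters[name] for name in subset]
            estimates[subset] = ols_coefficients(cols, phenotype)[1]
    values = list(estimates.values())
    return {"estimates": estimates, "min": min(values), "max": max(values),
            "sign_flip": min(values) < 0 < max(values)}


# Toy data (hypothetical): age confounds the exposure-phenotype association.
rng = random.Random(1)
age = [rng.gauss(0, 1) for _ in range(500)]
sex = [float(rng.randint(0, 1)) for _ in range(500)]
exposure = [a + rng.gauss(0, 1) for a in age]
phenotype = [0.5 * e + 2.0 * a + rng.gauss(0, 0.5) for e, a in zip(exposure, age)]

summary = vibration_of_effects(exposure, phenotype, {"age": age, "sex": sex})
for subset, beta in summary["estimates"].items():
    print(subset, round(beta, 3))
```

In this toy setup the unadjusted estimate is inflated by the confounder and shrinks once `age` enters the model, so the spread between `min` and `max` is one simple measure of how much the association "vibrates" with model choice.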

To address these challenges and gaps, we propose the following aims.

Aim 1: Develop and test machine learning methods to associate multiple environmental exposure indicators with multiple phenotypes (EP-WAS). We hypothesize that exposures will explain a significant amount of variation in phenotype in populations, and we will deposit all data and models in a novel EP-WAS Catalog.

Aim 2: Quantitate how study design influences associations between exposure biomarkers and phenotype. We will scale up, extend, and test a method called "vibration of effects" (VoE) to measure how study criteria influence the stability of associations (how reproducible associations are as a function of analytic choice).

Aim 3: Leverage EP-WAS and VoE to disentangle biological, demographic, and environmental influences on phenotypic disparities in hypercholesterolemia.

We will deploy EP-WAS and VoE packaged libraries in the largest cohort study to partition phenotypic variation across demographic groups in factors for hypercholesterolemia.

We will equip the biomedical community with data science approaches for robust data-driven discovery and interpretation of exposure-phenotype factors in observational datasets, required for the identification of environmental health disparities.

For the first time, investigators will ascertain the collective role of the environment in heart disease at scale just in time for the All of Us program.

All Grantees

Harvard Medical School
