| Field | Value |
|---|---|
| Funder | National Human Genome Research Institute |
| Recipient Organization | University of Washington |
| Country | United States |
| Start Date | Sep 19, 2024 |
| End Date | Aug 31, 2027 |
| Duration | 1,076 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | NIH (US) |
| Grant ID | 10976065 |
Project Summary/Abstract

This proposal will provide the foundational tooling for understanding the function of the pan-genome reference through the accurate annotation of regulatory elements within the pan-genome. As the genetic component of the pan-genome reference comes into focus, the next challenge is understanding the functional relevance of genetic variants within this reference. Resolving this challenge requires tooling that enables users to: (1) get accurate epigenetic data into a pan-genome reference; and (2) use epigenetic data once it is in a pan-genome reference.

This proposal leverages our team’s expertise in long-read epigenetics, short-read epigenetics, pan-genome assembly, and genomic software development to develop tooling for threading accurate epigenetic information into a pan-genome graph, as well as for extracting epigenetic information from a pan-genome in a manner that is compatible with existing epigenetic and genetic analysis tools. Our tooling is grounded in first assembling accurate epigenetic annotations at the level of haploid linear contigs, which are then threaded into a pan-genome reference. This approach significantly improves the accuracy with which both long- and short-read epigenetic features are mapped into a pan-genome, enables our tooling to readily adapt to new pan-genomes, and enables user-generated epigenetic data to be incorporated into a pan-genome reference without having to remake the pan-genome reference itself. Importantly, we are designing this tooling to work for diverse types of epigenetic data acquired across sequencing platforms. In addition, this tooling will be available through AnVIL, Conda, and other platforms, enabling users to readily adopt it into their own research pipelines.

Specifically, in Aim 1 we will develop tooling that uses a semi-supervised machine learning approach to accurately classify long-read epigenetic data collected using diverse experimental methods and sequencing platforms. In Aim 2, we will develop tooling that accurately aggregates long-read epigenetic data onto haploid linear contigs, and then threads either long-read or short-read epigenetic data into a pan-genome reference. In Aim 3, we will create fundamental operation tools for processing epigenetic data within a pan-genome to identify epigenetic and genetic features at specific points of interest within a pan-genome in a sample-, path-, and read-aware manner. Finally, we will apply our tooling to existing long-read and short-read epigenetic datasets to identify genetic variants within the pan-genome reference associated with haplotype-, paralog-, and sample-specific epigenetic features.