Active NON-SBIR/STTR RPGS NIH (US)

Tooling for accurately studying the epigenome along the human pangenome reference

$14.19M USD

Funder	NATIONAL HUMAN GENOME RESEARCH INSTITUTE
Recipient Organization	University of Washington
Country	United States
Start Date	Sep 19, 2024
End Date	Aug 31, 2027
Duration	1,076 days
Number of Grantees	1
Roles	Principal Investigator
Data Source	NIH (US)
Grant ID	`10976065`

Grant Description

Project Summary/Abstract

This proposal will provide the foundational tooling for understanding the function of the pan-genome reference through the accurate annotation of regulatory elements within the pan-genome. As the genetic component of the pan-genome reference comes into focus, the next challenge is understanding the functional relevance of genetic variants within this reference. However, resolving this challenge requires tooling that enables users to: (1) get accurate epigenetic data into a pan-genome reference; and (2) use epigenetic data once it is in a pan-genome reference.

This proposal leverages our team’s unique expertise in long-read epigenetics, short-read epigenetics, pan-genome assembly, and genomic software development to develop transformative tooling for threading accurate epigenetic information into a pan-genome graph, as well as extracting epigenetic information from a pan-genome in a manner that is compatible with existing epigenetic and genetic analysis tools. Our tooling is grounded in first assembling accurate epigenetic annotations at the level of haploid linear contigs, which are then threaded into a pan-genome reference. This approach significantly improves the accuracy by which both long- and short-read epigenetic features are mapped into a pan-genome, enables our tooling to readily adapt to new pan-genomes, and enables user-generated epigenetic data to be incorporated into a pan-genome reference without having to remake the pan-genome reference itself. Importantly, we are designing this tooling to work for diverse types of epigenetic data acquired across sequencing platforms. In addition, this tooling will be available through AnVIL, Conda, and other platforms, enabling users to readily adopt it into their own research pipelines.

Specifically, in Aim 1 we will develop tooling that uses a semi-supervised machine learning approach to accurately classify long-read epigenetic data collected using diverse experimental methods and sequencing platforms. In Aim 2, we will develop tooling that accurately aggregates long-read epigenetic data onto haploid linear contigs, and then threads either long-read or short-read epigenetic data into a pan-genome reference. In Aim 3, we will create fundamental operation tools for processing epigenetic data within a pan-genome to identify epigenetic and genetic features at specific points of interest within a pan-genome in a sample-, path-, and read-aware manner. Finally, we will apply our tooling to existing long-read and short-read epigenetic datasets to identify genetic variants within the pan-genome reference associated with haplotype-, paralog-, and sample-specific epigenetic features.

All Grantees

University of Washington
