Active NON-SBIR/STTR RPGS NIH (US)

Large-scale integrated data analysis of lymphocyte receptor repertoires with workflows

$5M USD

Funder NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES
Recipient Organization Yale University
Country United States
Start Date Aug 09, 2024
End Date May 31, 2027
Duration 1,025 days
Number of Grantees 1
Roles Principal Investigator
Data Source NIH (US)
Grant ID 10948588
Grant Description

Project Summary / Abstract

Over the last decade, high-throughput B cell and T cell receptor repertoire sequencing has become a fundamental method for investigating adaptive immune responses. The Immcantation framework, consisting of open-source Python and R packages, provides a comprehensive analytical ecosystem for this Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) data analysis, covering critical steps like pre-processing, clonal relationship identification, lineage reconstruction, and somatic hypermutation analysis. This framework has gained widespread usage in infectious and immune-mediated disease research, with over 100,000 downloads in 2022. However, as sequencing technologies advance and datasets grow larger, there is a need for scalable computational workflows combining the individual analysis steps.

To meet this demand and support the expanding user and developer community, we developed nf-core/airrflow (AIRRflow), a Nextflow workflow that integrates the individual Immcantation tools into a high-throughput analysis pipeline. The workflow offers parallelization, scalability, and compatibility with diverse computing infrastructures, including High-Performance Computing (HPC) clusters and commercial clouds. It is part of the nf-core project, a community-driven effort collecting Nextflow pipelines with an emphasis on robustness and reproducibility.

This proposal aims to enhance AIRRflow usability, findability, accessibility, interoperability, and scalability to cater to a broader audience in the infectious and immune-mediated disease (IID) research community. The proposed aims include adding new functionality to handle the integration of data from large numbers of subjects and facilitate interpretability, including embedding methods that translate receptor sequences to length-independent numerical vectors suitable for machine learning, determination of convergent responses across infectious and immune-mediated diseases, and annotation of receptor specificity leveraging public databases like IEDB. To enhance accessibility of data from public databases and ensure compliance with FAIR software principles, we will include automated data download from the Sequence Read Archive (SRA) and ImmPort, expand the supported data types to RNAseq and single-cell RNAseq, implement scalability tests, and make the workflow metadata accessible through suitable portals like the NIAID Data Ecosystem Discovery Portal.

We will actively work towards expanding the user base by offering hands-on trainings, tutorials targeting relevant use cases for the IID research community, and community engagement events, gathering feedback through multiple channels including surveys, GitHub issue tracking, and Slack. These improvements will make AIRRflow an even more valuable resource for researchers in the IID community.

All Grantees

Yale University
