| Field | Value |
|---|---|
| Funder | National Institute of General Medical Sciences |
| Recipient Organization | Temple University of the Commonwealth |
| Country | United States |
| Start Date | Feb 01, 2021 |
| End Date | Jan 31, 2026 |
| Duration | 1,825 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | NIH (US) |
| Grant ID | 11099368 |

Summary

The parent R35 research program aims to develop innovative methods and tools for the comparative analysis of molecular sequences. The focus is on creating machine-learning methods that perform big-data analytics and yield biological insights, and on comparing these methods with traditional model-based methods in molecular evolution and phylogenetics. A key development in this program is the Evolutionary Sparse Learning (ESL) framework, designed to enhance molecular evolutionary analyses. Although ESL has been benchmarked against classical methods using high-performance computing (HPC) resources, benchmarking against advanced deep learning (DL) approaches remains infeasible because it requires substantial computational power. To address this, we request a Graphics Processing Unit (GPU) cluster to enable the DL analyses essential for advancing our research. Two major example projects highlight the need for this system.

The first project focuses on discovering fragile clades and causal sequences in phylogenomics. We have developed metrics for gene-species sequence concordance and clade probability using ESL models, validated across many phylogenomic datasets. Benchmarking these ESL methods against DL approaches, such as MSA Transformer, is crucial. MSA Transformer captures phylogenetic relationships using multiple sequence alignments (MSAs) but requires refinement for orthologous protein sets, which demands a powerful GPU system.

The second project aims to uncover molecular convergences that parallel organismal convergent evolution. Using ESL, we have built genetic models to understand the independent origins of traits such as C4 photosynthesis in grasses and echolocation in mammals. Benchmarking revealed that current methods, including ESL, are limited in detecting convergences that involve different residues at different sites. Therefore, we are developing ESL approaches that leverage DL-generated protein embeddings to infer non-identical sequence convergence. Fine-tuning general DL models for orthologous sequences requires a dedicated GPU cluster, because existing resources are inadequate for the extensive analyses needed.

The requested GPU cluster is essential for refining these DL models and conducting comprehensive analyses, enhancing the impact and scope of our parent grant. Our experienced team and institutional support ensure effective use and maintenance of the equipment, promoting continued advancements in molecular evolutionary analysis.