Active CONTINUING GRANT National Science Foundation (US)

CAREER: Embracing Uncertainty in High-Performance Computing Resource Scheduling: An Integrated Algorithmic and Machine Learning-based Approach

$3.24M USD

Funder National Science Foundation (US)
Recipient Organization University of Kansas Center for Research Inc
Country United States
Start Date Oct 01, 2025
End Date Sep 30, 2030
Duration 1,825 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2441633
Grant Description

Resource scheduling is a critical component of high-performance computing (HPC) systems. Despite extensive literature on scheduling, new challenges continue to arise due to advancements in hardware, software, and evolving models, metrics, and performance demands. Today’s HPC systems operate on an unprecedented scale, presenting significant challenges for resource management, particularly when facing uncertainty introduced by emerging application characteristics and system-level complexities.

Existing schedulers lack robust mechanisms to effectively handle uncertainty, limiting their ability to achieve optimal performance. This project takes on the grand challenge of scheduling HPC resources under uncertainty by introducing an integrated approach that combines algorithm design and machine learning (ML). The approach leverages the rigor of algorithmic analysis to provide performance guarantees while utilizing ML’s predictive capabilities to manage uncertainty effectively.

The anticipated outcome is a substantial enhancement to current HPC schedulers, enabling more efficient execution of a diverse range of scientific applications in areas such as neuroscience, medical research, climate modeling, and artificial intelligence. The project also comprises a series of synergistic activities, including outreach programs, curriculum development, and student recruitment, aimed at engaging students from K-12 through graduate levels.

These efforts focus particularly on underrepresented and underserved communities, offering research opportunities that foster success in STEM and CS education.

Technically, this project aims to design, implement, and evaluate scheduling algorithms that integrate ML prediction models to enhance efficiency. The focus will be on addressing three primary sources of uncertainty: (1) inherent runtime variability of emerging applications; (2) resource contention in job co-scheduling; and (3) structural variations within dynamic workflows.

These aspects represent uncertainties across temporal, spatial, and structural dimensions, all of which demand solutions due to their growing prevalence in modern HPC environments. Algorithmically, approximation and semi-online algorithms will be developed to provide performance guarantees relative to theoretical lower bounds for metrics such as job completion time and resource utilization.
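As a rough illustration of what measuring a schedule against a theoretical lower bound can look like, the sketch below places jobs on identical nodes using predicted runtimes and then compares the makespan under actual runtimes to a simple lower bound. It is not the project's method: the longest-processing-time-first (LPT) rule is a classical heuristic swapped in for illustration, and the function names, the two-node setup, and all runtime values are hypothetical. LPT with exact runtimes is known to stay within a factor of 4/3 - 1/(3m) of the optimal makespan on m identical machines; with imperfect predictions that bound does not carry over directly.

```python
import heapq

def lpt_schedule(predicted, num_nodes):
    """Greedy LPT: take jobs in decreasing predicted runtime and place each
    on the node with the smallest predicted load so far.
    Returns one list of job indices per node."""
    order = sorted(range(len(predicted)), key=lambda j: predicted[j], reverse=True)
    heap = [(0.0, node) for node in range(num_nodes)]  # (predicted load, node id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_nodes)]
    for j in order:
        load, node = heapq.heappop(heap)
        assignment[node].append(j)
        heapq.heappush(heap, (load + predicted[j], node))
    return assignment

def makespan(assignment, actual):
    """Completion time of the busiest node under the actual runtimes."""
    return max(sum(actual[j] for j in jobs) for jobs in assignment)

def makespan_lower_bound(actual, num_nodes):
    """No schedule can beat the average load per node or the longest single job."""
    return max(sum(actual) / num_nodes, max(actual))

# Hypothetical data: actual runtimes and noisy predictions for six jobs.
actual = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0]
predicted = [7.0, 8.0, 5.0, 6.0, 4.0, 2.0]
num_nodes = 2

assignment = lpt_schedule(predicted, num_nodes)
achieved = makespan(assignment, actual)
bound = makespan_lower_bound(actual, num_nodes)
ratio = achieved / bound  # how far the schedule is from the lower bound
```

In this toy instance the prediction errors change the job order but the resulting makespan still lands close to the lower bound; larger or more systematic errors would widen the gap.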

On the ML front, various models, including those based on regression and reinforcement learning, will be trained to deliver accurate predictions for job runtime, performance degradation, and structural variability. A key ambition of this project is to establish an incubation framework that enables the effective integration of heuristic-based algorithms and data-driven ML models.

This approach aims to achieve a level of performance that neither paradigm could accomplish independently. The framework will offer a novel perspective on resource management and potentially set the stage for future HPC advancements.

This project is jointly funded by Software and Hardware Foundations and the Established Program to Stimulate Competitive Research (EPSCoR).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of Kansas Center for Research Inc
