Completed STANDARD GRANT National Science Foundation (US)

NSF-AoF: CNS Core: Small: Reinforcement Learning for Real-time Wireless Scheduling and Edge Caching: Theory and Algorithm Design

$4.15M USD

Funder National Science Foundation (US)
Recipient Organization University of California-Davis
Country United States
Start Date Oct 01, 2021
End Date Sep 30, 2025
Duration 1,460 days
Number of Grantees 1
Roles Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2203239
Grant Description

Recent years have witnessed a tremendous growth in real-time applications in wirelessly networked systems, such as connected cars and multi-user augmented reality (AR). Wireless edge caching is another emerging application requiring high bandwidth, where optimal caching decisions would depend on the cache contents and dynamic user demand profiles. To meet the explosive demand, 5G and Beyond (B5G) technology promises to offer enhanced mobile broadband (eMBB) and ultra-reliable low-latency communications (URLLC) services.

Meeting URLLC requirements in wireless networks is very challenging and requires massive modifications to current wireless system design. Deadline-aware wireless scheduling of real-time traffic has been a long-standing open problem. This collaborative project makes a paradigm shift to tackle these challenges, spurring a new line of thinking on QoS guarantees for ultra-low latency and high bandwidth in a variety of IoT applications, including B5G, autonomous driving, augmented reality, smart health, and smart city, benefiting both the US and Finland.

The proposed research will also be integrated with education activities at the PIs' institutions for graduate, undergraduate, and K-12 students via curriculum development, research experiences, and outreach.

This project leverages recent advances in offline reinforcement learning (RL) to study two important problems in B5G, namely 1) deadline-aware wireless scheduling to guarantee low latency and 2) edge caching to achieve high-bandwidth content delivery. In Thrust 1, physics-aided offline RL will be devised to train deadline-aware scheduling policies. Specifically, the Actor-Critic (A-C) method will be used for offline training of scheduling policies, consisting of two phases: 1) initialization of the actor structure via behavioral cloning and 2) policy improvement via the physics-aided A-C method.
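As a rough illustration of phase 1 only, the sketch below clones a model-based scheduler into a tabular actor. Everything in it is an assumption made for illustration and is not taken from the project: the earliest-deadline-first expert `edf_action`, the two-queue state of remaining deadlines, the synthetic log produced by `collect_log`, and the majority-vote cloning in `behavioral_cloning`.

```python
import random
from collections import Counter, defaultdict


def edf_action(state):
    """Hypothetical model-based expert: serve the queue whose
    head-of-line packet has the earliest deadline (ties go to the
    lowest queue index)."""
    return min(range(len(state)), key=lambda i: state[i])


def collect_log(num_samples, num_queues=2, max_deadline=5, seed=0):
    """Build a synthetic offline dataset of (state, expert_action) pairs.
    A state is a tuple of remaining deadlines, one entry per queue."""
    rng = random.Random(seed)
    log = []
    for _ in range(num_samples):
        state = tuple(rng.randint(1, max_deadline) for _ in range(num_queues))
        log.append((state, edf_action(state)))
    return log


def behavioral_cloning(log):
    """Fit a tabular actor: for each logged state, pick the action the
    expert chose most often in the dataset."""
    counts = defaultdict(Counter)
    for state, action in log:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


log = collect_log(500)
actor = behavioral_cloning(log)

# The cloned actor reproduces the expert on every state seen in the log.
agreement = sum(actor[s] == edf_action(s) for s in actor) / len(actor)
print(f"states covered: {len(actor)}, agreement with expert: {agreement:.2f}")
```

A cloned actor of this kind would only be the starting point; the policy-improvement phase described above is not shown here.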

With a good model-based scheduling algorithm as the initial actor structure, the A-C method can be leveraged to yield a better scheduling policy, since the method improves the policy iteratively. Further, innovative algorithms will be devised to address two outstanding problems in the A-C method, namely overestimation bias and high variance, and Meta-RL will be used to adapt to distribution shift in nonstationary network dynamics.
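To make the overestimation-bias issue concrete, one widely used mitigation in the actor-critic literature is the clipped double-Q target, which bootstraps from the smaller of two critic estimates. The sketch below shows that target for a single transition. It is a known technique named here for illustration; the abstract does not say which algorithms the project will devise, and the function name and arguments are hypothetical.

```python
def clipped_double_q_target(reward, gamma, q1_next, q2_next, done=False):
    """One-step TD target using the minimum of two critic estimates.

    Taking the minimum counteracts the upward bias that arises when a
    single noisy critic is used to bootstrap its own target.
    """
    if done:
        # No bootstrapping past a terminal transition.
        return reward
    return reward + gamma * min(q1_next, q2_next)


# Two critics disagree about the next state-action value (2.0 vs 3.0);
# the target uses the more pessimistic one.
target = clipped_double_q_target(1.0, 0.5, 2.0, 3.0)  # 1.0 + 0.5 * 2.0 = 2.0
print(target)
```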

Thrust 2 focuses on wireless edge caching, an application where the storage capacities at both the network edge and user devices are harnessed to alleviate the need for high-bandwidth communications over long distances. The combinatorial nature of joint communication and caching optimization, together with the uncertainties of system dynamics, calls for non-trivial design of machine learning algorithms. The PIs will leverage RL to investigate wireless edge caching thoroughly.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of California-Davis
