Completed STANDARD GRANT National Science Foundation (US)

Collaborative Research: CISE-ANR: CNS Core: Small: Modeling Modern Network Traffic: From Data Representation to Automated Machine Learning

$2.5M USD

Funder: National Science Foundation (US)
Recipient Organization: Stanford University
Country: United States
Start Date: Oct 01, 2021
End Date: Sep 30, 2024
Duration: 1,095 days
Number of Grantees: 1
Roles: Principal Investigator
Data Source: National Science Foundation (US)
Grant ID: 2124424
Grant Description

To maintain and secure communications networks, operators need to monitor network behavior and investigate security, performance, and other problems as they arise. Recent advances in network protocols and applications present fundamental challenges for monitoring network traffic. Specifically, Internet traffic, from web traffic to Domain Name System (DNS) queries and responses, is becoming ubiquitously encrypted, which obscures information that might otherwise be available for these tasks.

Additionally, network traffic is increasing in volume and rate, precluding detailed logging and analysis of individual packets or streams. Finally, the Internet is becoming more centralized, and many services have become cloud-based, making it more difficult to identify applications or services by fixed identifiers such as IP addresses and port numbers. Answering even basic questions about Internet traffic has thus become increasingly challenging.

This project seeks to develop techniques to regain visibility and insight into modern network traffic in light of these trends. It addresses three research questions. First, this project will study how to represent traffic data in ways that are amenable to modeling and that could improve models for both supervised and unsupervised tasks.

We will explore the impact of representations across four dimensions: (1) time-series representations; (2) representations across flows; (3) representations at higher layers; and (4) operations on compressed data. Second, we will build on our work on traffic data representation to develop a set of tools that automatically explore models and traffic representations tailored to network traffic problems.
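As a minimal sketch of what a time-series representation of a flow might look like, the Python example below aggregates packets into fixed-width time bins. The `Packet` record, the `flow_to_timeseries` helper, the bin width, and the sample packets are all hypothetical and chosen only for illustration; the abstract does not specify any particular representation.

```python
from collections import namedtuple

# Hypothetical packet record: arrival time in seconds and size in bytes.
Packet = namedtuple("Packet", ["timestamp", "size"])


def flow_to_timeseries(packets, bin_width=1.0, num_bins=5):
    """Aggregate one flow's packets into fixed-width time bins.

    Returns a list of (packet_count, byte_count) tuples, one per bin,
    measured from the earliest packet. Packets that fall beyond the
    last bin are ignored.
    """
    series = [(0, 0)] * num_bins
    if not packets:
        return series
    start = min(p.timestamp for p in packets)
    for p in packets:
        index = int((p.timestamp - start) / bin_width)
        if index < num_bins:
            count, total = series[index]
            series[index] = (count + 1, total + p.size)
    return series


# Illustrative flow with made-up timestamps and sizes.
flow = [Packet(0.0, 100), Packet(0.4, 1500), Packet(1.2, 60), Packet(3.9, 400)]
series = flow_to_timeseries(flow, bin_width=1.0, num_bins=4)
print(series)
```

A fixed-length vector like this can be fed to many standard learning models without inspecting packet payloads, which is one reason such representations are relevant when traffic is encrypted.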

Towards this goal, we will build a large-scale repository of labeled flows across several different applications and services, and evaluate the data representations that will be used to build statistical learning models of network traffic. Finally, we will use the software platforms and algorithms we build to design new techniques and tools that help operators overcome the challenges that prevent them from transferring models from laboratory experiments to real-world deployments.

We will extend automated model selection to account for systems costs and real-world limitations; address the need to be able to determine when models become inaccurate and to distinguish model inaccuracies from problems that are inherent to the network; and improve model robustness by investigating general approaches for model transfer.
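One simple way to picture model selection that accounts for systems costs is to choose the most accurate candidate that fits a resource budget. The Python sketch below assumes a single cost, per-flow inference latency; the `Candidate` type, the `select_model` helper, and all model names and numbers are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float    # validation accuracy in [0, 1]
    latency_ms: float  # per-flow inference latency, used here as the system cost


def select_model(candidates, latency_budget_ms):
    """Return the most accurate candidate within the latency budget.

    Ties on accuracy are broken in favor of lower latency. Returns None
    when no candidate fits the budget.
    """
    feasible = [c for c in candidates if c.latency_ms <= latency_budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.accuracy, -c.latency_ms))


# Made-up candidates for illustration only.
candidates = [
    Candidate("decision_tree", accuracy=0.88, latency_ms=0.5),
    Candidate("random_forest", accuracy=0.93, latency_ms=4.0),
    Candidate("deep_network", accuracy=0.95, latency_ms=25.0),
]

chosen = select_model(candidates, latency_budget_ms=5.0)
print(chosen.name if chosen else "no feasible model")
```

With a 5 ms budget the most accurate candidate is excluded on cost, which illustrates why a selection procedure driven by accuracy alone may not carry over to a real deployment.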

All software we create in this project will be publicly available and open source. Additionally, we plan to integrate the software systems into community tutorials, undergraduate and graduate courses, and outreach and education programs, in collaboration with partners such as the University of Chicago's Office of Special Programs and Office of Civic Engagement.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Stanford University
