Active · Standard Grant · National Science Foundation (US)

Collaborative Research: SHF: Medium: A Scalable Graph-Based Approach to Clustering

$3.64M USD

Funder: National Science Foundation (US)
Recipient Organization: Brown University
Country: United States
Start Date: Oct 01, 2024
End Date: Sep 30, 2028
Duration: 1,460 days
Number of Grantees: 1
Roles: Principal Investigator
Data Source: National Science Foundation (US)
Grant ID: 2403236
Grant Description

Clustering algorithms are one of the most important modern tools for understanding data. Given data on various entities, clustering algorithms group entities into sets or "clusters" such that similar entities are likely to end up in the same cluster while dissimilar entities tend to end up in different clusters. For example, clustering algorithms can be used to group images together according to the contents of the image.

Clustering is widely used by scientists, companies, and government agencies. However, modern datasets are so large that many existing clustering algorithms cannot feasibly be used. This project aims to address this situation systematically through new clustering algorithms that scale to massive datasets with billions of entities.

The toolkit developed in the project will be open-sourced and will make scalable, high-performance clustering more broadly accessible to scientists and practitioners by improving the efficiency and programming productivity of their clustering tasks. Results from the project will be integrated into courses that the investigators teach, and the researchers will recruit undergraduate students to participate in the project.

This three-institution collaborative project investigates a new approach to clustering pointsets: constructing sparse graphs that preserve relevant properties of the pointset. By carefully leveraging high-quality graph clustering algorithms that run in near-linear work, very large datasets can be clustered with high accuracy in time that is nearly linear in the number of objects in the input.
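
The two-stage idea described above (build a sparse graph from the pointset, then cluster the graph) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the award: the sparse graph is a k-nearest-neighbor graph, the graph clustering step is plain connected components via union-find, and the function names `knn_graph`, `connected_components`, and `cluster_points` are hypothetical. The brute-force neighbor search here takes quadratic time, so the sketch shows only the structure of the pipeline, not the near-linear scaling the project targets.

```python
import math


def knn_graph(points, k):
    """Return a sparse graph as a set of undirected edges (i, j) with i < j.

    Assumption: brute-force O(n^2) neighbor search, for illustration only.
    A scalable system would use an approximate nearest-neighbor index.
    """
    edges = set()
    for i, p in enumerate(points):
        # Sort all other points by distance to p and keep the k closest.
        nearest = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return edges


def connected_components(n, edges):
    """Label each vertex with a representative of its component (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    return [find(v) for v in range(n)]


def cluster_points(points, k=2):
    """Cluster a pointset: sparse graph construction, then graph clustering."""
    edges = knn_graph(points, k)
    return connected_components(len(points), edges)


# Two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cluster_points(pts, k=2)
```

With `k=2`, each point's two nearest neighbors lie inside its own group, so the graph has no edges between the groups and the two groups receive different labels. Connected components is a deliberately simple stand-in for the higher-quality graph clustering algorithms the description refers to.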

Particular attention will be paid to new algorithms for graph clustering and construction that utilize structure observed in practice, exploit parallelism, and enable dynamism with provable accuracy guarantees. A major contribution of the project will be an end-to-end clustering toolkit for graphs and pointsets that enables clustering to be scaled to inputs with billions of objects.

The investigators will collaborate through regular remote meetings and seminars, student visits, joint publications, and annual in-person workshops.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Brown University
