Communication-Optimal Distributed Dynamic Graph Clustering

Chun Jiang Zhu, Tan Zhu, Kam-Yiu Lam, Song Han, Jinbo Bi

We consider the problem of clustering graph nodes over large-scale dynamic graphs, such as citation networks, images and web networks, when graph updates such as node/edge insertions/deletions are observed distributively. We propose communication-efficient algorithms for two well-established communication models namely the message passing and the blackboard models. Given a graph with $n$ nodes that is observed at $s$ remote sites over time $[1,t]$, the two proposed algorithms have communication costs $\tilde{O}(ns)$ and $\tilde{O}(n+s)$ ($\tilde{O}$ hides a polylogarithmic factor), almost matching their lower bounds, $\Omega(ns)$ and $\Omega(n+s)$, respectively, in the message passing and the blackboard models. More importantly, we prove that at each time point in $[1,t]$ our algorithms generate clustering quality nearly as good as that of centralizing all updates up to that time and then applying a standard centralized clustering algorithm. We conducted extensive experiments on both synthetic and real-life datasets which confirmed the communication efficiency of our approach over baseline algorithms while achieving comparable clustering results.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment