[2012 OSDI] PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
One-line Summary
PowerGraph is a distributed graph-parallel framework that handles natural (power-law) graphs efficiently by factoring vertex-programs into gather/apply/scatter phases and partitioning edges (vertex-cuts) rather than vertices.
The key contributions are:
GAS (gather, apply, scatter) programming model
Using vertex cut instead of edge cut to layout data for power-law graphs
Balancing computation & minimizing communication
Paper Structure Outline
Introduction
Graph-Parallel Abstractions
Pregel
GraphLab
Characterization
Challenges of Natural Graphs
PowerGraph Abstraction
GAS Vertex-Programs
Delta Caching
Initiating Future Computation
Bulk Synchronous Execution
Asynchronous Execution
Comparison with GraphLab/Pregel
Distributed Graph Placement
Balanced p-way Vertex-Cut
Greedy Vertex-Cuts
Abstraction Comparison
Computation Imbalance
Communication Imbalance
Runtime Comparison
Implementation and Evaluation
Graph Loading and Placement
Synchronous Engine (Sync)
Asynchronous Engine (Async)
Async. Serializable Engine (Async+S)
Fault Tolerance
MLDM Applications
Related Work
Conclusions and Future Work
Background & Motivation
Background 1: Natural Graphs
Real-world graphs (e.g., social networks, the web) follow a power-law degree distribution
A small subset of the vertices have very high degrees, while most vertices have low degrees
Existing graph-parallel frameworks depend on a balanced degree distribution for performance
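The skew above can be stated precisely. Under a power-law degree distribution, the probability that a vertex has degree d is

```latex
\mathbf{P}(\mathrm{deg}(v) = d) \propto d^{-\alpha}
```

where a smaller exponent $\alpha$ means a heavier tail (more very-high-degree vertices); natural graphs typically have $\alpha \approx 2$.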
Background 2: Existing frameworks (Pregel, GraphLab) cannot handle natural graphs well
Work balancing: Existing graph-parallel frameworks treat vertices symmetrically and have storage/communication/computation costs linear in degree
Partitioning: Pregel/GraphLab depend on a good graph partitioning, which is hard to obtain for natural graphs. Their fallback, random partitioning, cuts most edges.
Communication/storage: Major bottlenecks at high-degree vertices due to the skewed distribution
Computation: Existing frameworks do not parallelize individual vertex programs, limiting their scalability in skewed graphs
Design and Implementation
GAS Model
Gather: Information from adjacent vertices/edges is reduced by a generalized "sum" operation (commutative and associative)
Apply: The gathered sum is used with the current value to update the current vertex value
Scatter: The new value is used to update data on adjacent edges
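The three phases can be made concrete with PageRank, the paper's running example. Below is a minimal single-process sketch of a GAS vertex-program over a toy graph; the names (`Vertex`, `gather`, `apply`, `scatter`) are illustrative, not PowerGraph's actual C++ API.

```python
# Sketch: PageRank as a GAS vertex-program (illustrative, not PowerGraph's API).

class Vertex:
    def __init__(self, vid):
        self.vid = vid
        self.rank = 1.0
        self.out = []  # out-neighbor vertex ids

def gather(graph, v):
    # Generalized "sum" over in-edges: each in-neighbor u contributes
    # rank(u) / |out(u)|; "+" must be commutative and associative.
    return sum(u.rank / len(u.out) for u in graph.values() if v.vid in u.out)

def apply(v, total):
    # Combine the gathered sum with the current vertex value.
    v.rank = 0.15 + 0.85 * total

def scatter(v, nxt):
    # Activate out-neighbors so they run on the next round.
    nxt.update(v.out)

# Toy 3-cycle 0 -> 1 -> 2 -> 0; the fixed point has every rank equal to 1.0.
graph = {i: Vertex(i) for i in range(3)}
for i in range(3):
    graph[i].out = [(i + 1) % 3]

active = set(graph)
for _ in range(10):
    totals = {i: gather(graph, graph[i]) for i in active}
    nxt = set()
    for i in active:
        apply(graph[i], totals[i])
        scatter(graph[i], nxt)
    active = nxt
print(round(graph[0].rank, 3))  # prints 1.0
```

Because gather is a commutative/associative reduction, PowerGraph can run it in parallel across the machines holding a high-degree vertex's edges, which is what lets it parallelize a single vertex-program.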
Vertex-Cuts instead of Edge-Cuts
Edge-Cuts
Every vertex is placed on a machine, and edges span across machines
If adjacent vertices are on different machines, they use "ghost" vertices -> changes need to be synchronized to ghosts
In natural graphs, many edges span machines; balanced edge-cut algorithms perform poorly on power-law graphs, so GraphLab and Pregel fall back to randomized placement, which cuts nearly all edges
Vertex-Cuts
Every edge is placed on a machine, and vertices may be across machines
Intuition: The distribution of vertex degree is highly skewed, but the number of vertices adjacent to a given edge is constant (always 2)
Each vertex is replicated ("mirrors") across the machines where its adjacent edges lie
This results in a better balance for natural graphs
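A tiny simulation makes the intuition concrete: placing edges (not vertices) spreads a high-degree hub's work across machines, at the cost of replicating that vertex. This is a sketch with made-up numbers, not the paper's greedy heuristic.

```python
# Sketch: random vertex-cut. Each *edge* is assigned to a machine; a vertex
# is then mirrored on every machine that holds one of its edges.
import random

random.seed(0)
machines = 4
# Star graph: vertex 0 is a high-degree hub, vertices 1..20 are leaves.
edges = [(0, i) for i in range(1, 21)]

placement = {}  # vertex -> set of machines holding one of its edges
for (u, v) in edges:
    m = random.randrange(machines)
    placement.setdefault(u, set()).add(m)
    placement.setdefault(v, set()).add(m)

# Replication factor: average number of mirrors per vertex.
rep = sum(len(s) for s in placement.values()) / len(placement)
# The hub's 20 edges are spread over up to 4 machines instead of
# overloading one; each leaf lives on exactly one machine.
print(len(placement[0]), round(rep, 2))
```

Minimizing this replication factor is exactly the objective of the paper's balanced p-way vertex-cut and its greedy placement heuristic.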
Delta Caching, Execution Model
Delta caching
At each vertex, the accumulator values are cached, and the scatter function can return a delta value to directly apply to the neighboring cached accumulator.
If this value is not returned, the neighboring cache is cleared
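The caching rule can be sketched in a few lines. The names here are illustrative stand-ins, not PowerGraph's interface:

```python
# Sketch of delta caching: each vertex keeps its last gathered accumulator;
# when a neighbor changes, scatter may push only the *delta* into that cache
# so the next apply can skip a full re-gather.

cache = {}  # vertex id -> cached accumulator (absent = invalidated)

def scatter_with_delta(dst, delta):
    if delta is not None and dst in cache:
        cache[dst] += delta    # patch the cached sum in place
    else:
        cache.pop(dst, None)   # no delta returned -> invalidate, re-gather later

cache["v"] = 10.0
scatter_with_delta("v", 2.5)   # neighbor's contribution grew by 2.5
assert cache["v"] == 12.5
scatter_with_delta("v", None)  # scatter returned no delta -> cache cleared
assert "v" not in cache
```

The win is largest at high-degree vertices, where a full gather touches many edges but a single neighbor's change is a one-number patch.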
Execution model: Sync vs. Async
Sync (bulk synchronous)
3 "minor-steps": Gather for all active vertices -> Apply -> Scatter
Barrier after each minor-step; Changes are committed at the end of each minor-step and visible on the next
Async (asynchronous)
Changes are immediately available to other vertices
Execute active vertices as cores become available
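The synchronous engine's three minor-steps and barriers can be mimicked in a single-process sketch (a toy stand-in for the distributed engine, with illustrative names):

```python
# Sketch of one synchronous super-step: gather -> barrier -> apply ->
# barrier -> scatter; updates commit per minor-step and become visible
# only in the next one.

def super_step(g, active, gather_f, apply_f, scatter_f):
    acc = {v: gather_f(g, v) for v in active}   # minor-step 1: gather
    # --- barrier: all gathers see the old values ---
    for v in active:                            # minor-step 2: apply
        apply_f(g, v, acc[v])
    # --- barrier: new values committed for the next step ---
    nxt = set()
    for v in active:                            # minor-step 3: scatter
        nxt |= scatter_f(g, v)
    return nxt

# Toy use: propagate the maximum value around a 3-cycle 0 -> 1 -> 2 -> 0.
g = {0: {"val": 5, "nbrs": [1]}, 1: {"val": 1, "nbrs": [2]}, 2: {"val": 3, "nbrs": [0]}}

def gather_f(g, v):   # max over in-neighbors' values
    return max(g[u]["val"] for u in g if v in g[u]["nbrs"])

def apply_f(g, v, a):
    g[v]["val"] = max(g[v]["val"], a)

def scatter_f(g, v):  # activate out-neighbors
    return set(g[v]["nbrs"])

active = set(g)
for _ in range(3):
    active = super_step(g, active, gather_f, apply_f, scatter_f)
print([g[v]["val"] for v in range(3)])  # prints [5, 5, 5]
```

The async engine drops the barriers: an apply's result is visible to any gather that runs after it, which converges faster for some algorithms but gives up the deterministic, step-structured semantics above.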
Evaluation
Reduced vertex replication/communication costs