Today’s graphs used in domains such as machine learning or
social network analysis may contain hundreds of billions of
edges. Yet, they are not necessarily stored efficiently, and stan-
dard graph representations such as adjacency lists waste a
significant number of bits while graph compression schemes
such as WebGraph often require time-consuming decompres-
sion. To address this, we propose Log(Graph): a graph rep-
resentation that combines high compression ratios with very
low-overhead decompression to enable cheaper and faster
graph processing. The key idea is to encode a graph so that
the parts of the representation approach or match the re-
spective storage lower bounds. We call our approach “graph
logarithmization” because these bounds are usually loga-
rithmic. Our high-performance Log(Graph) implementation
based on modern bitwise operations and state-of-the-art suc-
cinct data structures achieves high compression ratios as well
as performance. For example, compared to the tuned Graph
Algorithm Processing Benchmark Suite (GAPBS), it reduces
graph sizes by 20-35% while matching GAPBS’ performance
or even delivering speedups due to reducing amounts of
transferred data. It approaches the compression ratio of the
established WebGraph compression library while enabling
speedups of up to more than 2
×
. Log(Graph) can improve
the design of various graph processing engines or libraries on
single NUMA nodes as well as distributed-memory systems.
Описание
-0.5emLog(Graph): A Near-OptimalHigh-Performance Graph Representation-0.5em - loggraph.pdf
%0 Journal Article
%1 endstream
%A Besta, Maciej
%A Stanojevic, Dimitri
%A Zivic, Tijana
%A Singh, Jagpreet
%A Hoerold, Maurice
%A Hoefler, Torsten
%B PACT ’18
%D 2018
%I ACM
%K Representation graph graphs storage
%T Log(Graph): A Near-Optimal
High-Performance Graph Representation
%U https://people.csail.mit.edu/jshun/papers/loggraph.pdf
%X Today’s graphs used in domains such as machine learning or
social network analysis may contain hundreds of billions of
edges. Yet, they are not necessarily stored efficiently, and stan-
dard graph representations such as adjacency lists waste a
significant number of bits while graph compression schemes
such as WebGraph often require time-consuming decompres-
sion. To address this, we propose Log(Graph): a graph rep-
resentation that combines high compression ratios with very
low-overhead decompression to enable cheaper and faster
graph processing. The key idea is to encode a graph so that
the parts of the representation approach or match the re-
spective storage lower bounds. We call our approach “graph
logarithmization” because these bounds are usually loga-
rithmic. Our high-performance Log(Graph) implementation
based on modern bitwise operations and state-of-the-art suc-
cinct data structures achieves high compression ratios as well
as performance. For example, compared to the tuned Graph
Algorithm Processing Benchmark Suite (GAPBS), it reduces
graph sizes by 20-35% while matching GAPBS’ performance
or even delivering speedups due to reducing amounts of
transferred data. It approaches the compression ratio of the
established WebGraph compression library while enabling
speedups of up to more than 2
×
. Log(Graph) can improve
the design of various graph processing engines or libraries on
single NUMA nodes as well as distributed-memory systems.
@article{endstream,
abstract = {Today’s graphs used in domains such as machine learning or
social network analysis may contain hundreds of billions of
edges. Yet, they are not necessarily stored efficiently, and stan-
dard graph representations such as adjacency lists waste a
significant number of bits while graph compression schemes
such as WebGraph often require time-consuming decompres-
sion. To address this, we propose Log(Graph): a graph rep-
resentation that combines high compression ratios with very
low-overhead decompression to enable cheaper and faster
graph processing. The key idea is to encode a graph so that
the parts of the representation approach or match the re-
spective storage lower bounds. We call our approach “graph
logarithmization” because these bounds are usually loga-
rithmic. Our high-performance Log(Graph) implementation
based on modern bitwise operations and state-of-the-art suc-
cinct data structures achieves high compression ratios as well
as performance. For example, compared to the tuned Graph
Algorithm Processing Benchmark Suite (GAPBS), it reduces
graph sizes by 20-35% while matching GAPBS’ performance
or even delivering speedups due to reducing amounts of
transferred data. It approaches the compression ratio of the
established WebGraph compression library while enabling
speedups of up to more than 2
×
. Log(Graph) can improve
the design of various graph processing engines or libraries on
single NUMA nodes as well as distributed-memory systems.},
added-at = {2018-09-29T18:21:04.000+0200},
author = {Besta, Maciej and Stanojevic, Dimitri and Zivic, Tijana and Singh, Jagpreet and Hoerold, Maurice and Hoefler, Torsten},
biburl = {https://www.bibsonomy.org/bibtex/25b7b94db9b6125988fad8c8a55a622c6/brightbyte},
booktitle = {PACT ’18},
description = {-0.5emLog(Graph): A Near-OptimalHigh-Performance Graph Representation-0.5em - loggraph.pdf},
institution = {Department of Computer Science, ETH Zurich},
interhash = {ae977e2ed6c2cb8954fcbd54868b3c2c},
intrahash = {5b7b94db9b6125988fad8c8a55a622c6},
keywords = {Representation graph graphs storage},
publisher = {ACM},
timestamp = {2018-09-29T18:21:04.000+0200},
title = {Log(Graph): A Near-Optimal
High-Performance Graph Representation},
url = {https://people.csail.mit.edu/jshun/papers/loggraph.pdf},
year = 2018
}