The cache complexity of multithreaded cache oblivious algorithms
M. Frigo, und V. Strumpen. Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, Seite 271--280. New York, NY, USA, ACM, (2006)
DOI: 10.1145/1148109.1148157
Zusammenfassung
We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a private cache. We specialize this technique to computations executed by the Cilk work-stealing scheduler on a machine with dag-consistent shared memory. We show that a multithreaded cache oblivious matrix multiplication incurs <i>O</i>(<i>n</i><sup>3</sup>/√<i>Z</i> + (<i>Pn</i>)<sup>1/3</sup><i>n</i><sup>2</sup>) cache misses when executed by the Cilk scheduler on a machine with <i>P</i> processors, each with a cache of size <i>Z</i>, with high probability. This bound is tighter than previously published bounds. We also present a new multithreaded cache oblivious algorithm for 1D stencil computations, which incurs <i>O</i>(<i>n</i><sup>2</sup>/<i>Z</i>+<i>n</i>+√<i>Pn</i><sup>3</sup>+<sup>ε</sup>) cache misses with high probability.
%0 Conference Paper
%1 Frigo:2006:CCM:1148109.1148157
%A Frigo, Matteo
%A Strumpen, Volker
%B Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
%C New York, NY, USA
%D 2006
%I ACM
%K cache.oblivious multithread parallel
%P 271--280
%R 10.1145/1148109.1148157
%T The cache complexity of multithreaded cache oblivious algorithms
%X We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a private cache. We specialize this technique to computations executed by the Cilk work-stealing scheduler on a machine with dag-consistent shared memory. We show that a multithreaded cache oblivious matrix multiplication incurs <i>O</i>(<i>n</i><sup>3</sup>/√<i>Z</i> + (<i>Pn</i>)<sup>1/3</sup><i>n</i><sup>2</sup>) cache misses when executed by the Cilk scheduler on a machine with <i>P</i> processors, each with a cache of size <i>Z</i>, with high probability. This bound is tighter than previously published bounds. We also present a new multithreaded cache oblivious algorithm for 1D stencil computations, which incurs <i>O</i>(<i>n</i><sup>2</sup>/<i>Z</i>+<i>n</i>+√<i>Pn</i><sup>3</sup>+<sup>ε</sup>) cache misses with high probability.
%@ 1-59593-452-9
@inproceedings{Frigo:2006:CCM:1148109.1148157,
abstract = {We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a private cache. We specialize this technique to computations executed by the Cilk work-stealing scheduler on a machine with dag-consistent shared memory. We show that a multithreaded cache oblivious matrix multiplication incurs <i>O</i>(<i>n</i><sup>3</sup>/√<i>Z</i> + (<i>Pn</i>)<sup>1/3</sup><i>n</i><sup>2</sup>) cache misses when executed by the Cilk scheduler on a machine with <i>P</i> processors, each with a cache of size <i>Z</i>, with high probability. This bound is tighter than previously published bounds. We also present a new multithreaded cache oblivious algorithm for 1D stencil computations, which incurs <i>O</i>(<i>n</i><sup>2</sup>/<i>Z</i>+<i>n</i>+√<i>Pn</i><sup>3</sup>+<sup>ε</sup>) cache misses with high probability.},
acmid = {1148157},
added-at = {2012-11-19T11:44:33.000+0100},
address = {New York, NY, USA},
author = {Frigo, Matteo and Strumpen, Volker},
biburl = {https://www.bibsonomy.org/bibtex/2e35f50a050710588cd1db0d82cb85429/ytyoun},
booktitle = {Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures},
doi = {10.1145/1148109.1148157},
interhash = {f7ab4e0a33dbf11ecd4cdba6ccbaa954},
intrahash = {e35f50a050710588cd1db0d82cb85429},
isbn = {1-59593-452-9},
keywords = {cache.oblivious multithread parallel},
location = {Cambridge, Massachusetts, USA},
numpages = {10},
pages = {271--280},
publisher = {ACM},
series = {SPAA '06},
timestamp = {2015-12-12T14:35:42.000+0100},
title = {The cache complexity of multithreaded cache oblivious algorithms},
year = 2006
}