@article{Blumofe:1999:SMC:324133.324234,
abstract = {This paper studies the problem of efficiently scheduling fully strict (i.e., well-structured) multithreaded computations on parallel computers. A popular and practical method of scheduling this kind of dynamic MIMD-style computation is “work stealing,” in which processors needing work steal computational threads from other processors. In this paper, we give the first provably good work-stealing scheduler for multithreaded computations with dependencies. Specifically, our analysis shows that the expected time to execute a fully strict computation on $P$ processors using our work-stealing scheduler is $T_1/P + O(T_\infty)$, where $T_1$ is the minimum serial execution time of the multithreaded computation and $T_\infty$ is the minimum execution time with an infinite number of processors. Moreover, the space required by the execution is at most $S_1 P$, where $S_1$ is the minimum serial space requirement. We also show that the expected total communication of the algorithm is at most $O(P T_\infty (1 + n_d) S_{\max})$, where $S_{\max}$ is the size of the largest activation record of any thread and $n_d$ is the maximum number of times that any thread synchronizes with its parent. This communication bound justifies the folk wisdom that work-stealing schedulers are more communication efficient than their work-sharing counterparts. All three of these bounds are existentially optimal to within a constant factor.},
acmid = {324234},
added-at = {2012-09-21T23:03:25.000+0200},
address = {New York, NY, USA},
author = {Blumofe, Robert D. and Leiserson, Charles E.},
biburl = {https://www.bibsonomy.org/bibtex/2d86aaae2a8fa0fc535c38502537d8eea/ytyoun},
doi = {10.1145/324133.324234},
interhash = {927c63631957e6848a736b69801dd3fb},
intrahash = {d86aaae2a8fa0fc535c38502537d8eea},
issn = {0004-5411},
issue_date = {Sept. 1999},
journal = {J. ACM},
keywords = {multithreading, parallel computing, scheduling, work stealing},
month = sep,
number = 5,
numpages = {29},
pages = {720--748},
publisher = {ACM},
timestamp = {2015-12-12T14:36:21.000+0100},
title = {Scheduling multithreaded computations by work stealing},
volume = 46,
year = 1999
}