Author of the publication

Overlap Communication with Dependent Computation via Decomposition in Large Deep Learning Models.

ASPLOS (1), pages 93-106. ACM, (2023)

Other publications of authors with the same name

High-Performance Distributed ML at Scale through Parameter Server Consistency Models. AAAI, pages 79-87. AAAI Press, (2015)

A software toolkit for visualizing enterprise routing design. SafeConfig, IEEE, (2011)

Automating Dependence-Aware Parallelization of Machine Learning Training on Distributed Shared Memory. EuroSys, pages 42:1-42:17. ACM, (2019)

Managed communication and consistency for fast data-parallel iterative analytics. SoCC, pages 381-394. ACM, (2015)

Addressing the straggler problem for iterative convergent parallel ML. SoCC, pages 98-111. ACM, (2016)

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters. USENIX ATC, pages 181-193. USENIX Association, (2017)

Priority-based Parameter Propagation for Distributed DNN Training. SysML, mlsys.org, (2019)

Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads. OSDI, pages 495-514. USENIX Association, (2021)

Exploiting iterative-ness for parallel ML computations. SoCC, pages 5:1-5:14. ACM, (2014)

Petuum: A New Platform for Distributed Machine Learning on Big Data. KDD, pages 1335-1344. ACM, (2015)