Author of the publication

Accelerating large-scale distributed neural network training with SPMD parallelism.

SoCC, pages 403-418. ACM, (2022)

Other publications of authors with the same name

PAI-FCNN: FPGA Based Inference System for Complex CNN Models. ASAP, pages 107-114. IEEE, (2019)

DAPPLE: A Pipelined Data Parallel Approach for Training Large Models. CoRR, (2020)

Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform. CoRR, (2023)

Optimizing distributed training deployment in heterogeneous GPU clusters. CoNEXT, pages 93-107. ACM, (2020)

DISC: A Dynamic Shape Compiler for Machine Learning Workloads. EuroMLSys@EuroSys, pages 89-95. ACM, (2021)

Expediting Distributed DNN Training with Device Topology-Aware Graph Deployment. CoRR, (2023)

Accelerating large-scale distributed neural network training with SPMD parallelism. SoCC, pages 403-418. ACM, (2022)

PAI-FCNN: FPGA Based CNN Inference System. FPGA, page 184. ACM, (2019)

FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads. CoRR, (2020)

HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis. EuroSys, pages 524-541. ACM, (2024)