Author of the publication

Efficient and Scalable Multi-Source Streaming Broadcast on GPU Clusters for Deep Learning.

ICPP, page 161-170. IEEE Computer Society, (2017)

Other publications of authors with the same name

Networking and communication challenges for post-exascale systems. Frontiers Inf. Technol. Electron. Eng., 19 (10): 1230-1235 (2018)

Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters. CoRR, (2021)

Performance Characterization of using Quantization for DNN Inference on Edge Devices: Extended Version. CoRR, (2023)

High-Performance Adaptive MPI Derived Datatype Communication for Modern Multi-GPU Systems. HiPC, page 267-276. IEEE, (2019)

Performance Evaluation of MPI Libraries on GPU-Enabled OpenPOWER Architectures: Early Experiences. ISC Workshops, volume 11887 of Lecture Notes in Computer Science, page 361-378. Springer, (2019)

NV-group: link-efficient reduction for distributed deep learning on modern dense GPU systems. ICS, page 6:1-6:12. ACM, (2020)

Analyzing and Understanding the Impact of Interconnect Performance on HPC, Big Data, and Deep Learning Applications: A Case Study with InfiniBand EDR and HDR. IPDPS Workshops, page 869-878. IEEE, (2020)

Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework. CLUSTER, page 329-336. IEEE Computer Society, (2012)

RDMA over Ethernet - A preliminary study. CLUSTER, page 1-9. IEEE Computer Society, (2009)

High-Performance Design of HBase with RDMA over InfiniBand. IPDPS, page 774-785. IEEE Computer Society, (2012)