From post

Please choose a person to relate this publication to

To distinguish between people with the same name, the academic degree and the title of an important publication are displayed.

 

Other publications by people with the same name

1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed. HiPC, pp. 272-281. IEEE, (2022)

OC-DNN: Exploiting Advanced Unified Memory Capabilities in CUDA 9 and Volta GPUs for Out-of-Core DNN Training. HiPC, pp. 143-152. IEEE, (2018)

Efficient and Scalable Multi-Source Streaming Broadcast on GPU Clusters for Deep Learning. ICPP, pp. 161-170. IEEE Computer Society, (2017)

An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures. MLHPC@SC, pp. 8:1-8:8. ACM, (2017)

Intercloud message exchange middleware. ICUIMC, pp. 79:1-79:7. ACM, (2012)

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning. IPDPS, pp. 996-1006. IEEE, (2023)

High performance distributed deep learning: a beginner's guide. PPoPP, pp. 452-454. ACM, (2019)

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale. ICML, vol. 162 of Proceedings of Machine Learning Research, pp. 18332-18346. PMLR, (2022)

Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? EuroMPI, pp. 2:1-2:9. ACM, (2018)

Efficient Training of Semantic Image Segmentation on Summit using Horovod and MVAPICH2-GDR. IPDPS Workshops, pp. 1015-1023. IEEE, (2020)