Author of the publication

A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks.

MICRO, page 175-188. IEEE Computer Society, (2018)


Other publications of authors with the same name

Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Network. ISCA, page 764-775. IEEE Computer Society, (2018)

Accelerating String-key Learned Index Structures via Memoization-based Incremental Training. Proc. VLDB Endow., 17 (8): 1802-1815 (April 2024)

DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics. CoRR, (2024)

TABLA: A unified template-based framework for accelerating statistical machine learning. HPCA, page 14-26. IEEE Computer Society, (2016)

Neural acceleration for GPU throughput processors. MICRO, page 482-493. ACM, (2015)

From high-level deep neural models to FPGAs. MICRO, page 17:1-17:12. IEEE Computer Society, (2016)

Locality-aware dynamic VM reconfiguration on MapReduce clouds. HPDC, page 27-36. ACM, (2012)

Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning. CoRR, (2021)

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing. ASPLOS (3), page 722-737. ACM, (2024)

FlexJava: language support for safe and modular approximate programming. ESEC/SIGSOFT FSE, page 745-757. ACM, (2015)