Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

 

Other publications of authors with the same name

Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications., , , , , , , , , and 18 other author(s). CoRR, (2018)

Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization., , , , , , , , , and 5 other author(s). OSDI, page 267-284. USENIX Association, (2022)

With Shared Microexponents, A Little Shifting Goes a Long Way., , , , , , , , , and 12 other author(s). ISCA, page 83:1-83:13. ACM, (2023)

A framework for low-communication 1-D FFT., , , and . Sci. Program., 21 (3-4): 181-195 (2013)

Dynamic fine-grained sparse memory accesses., , , , and . MEMSYS, page 85-97. ACM, (2018)

Wukong: Towards a Scaling Law for Large-Scale Recommendation., , , , , , , , , and 5 other author(s). CoRR, (2024)

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices., , , , , , , , and . SC, page 945-955. IEEE Computer Society, (2014)

Versatile and scalable parallel histogram construction., , and . PACT, page 127-138. ACM, (2014)

Fine-grain dynamic instruction placement for L0 scratch-pad memory., , and . CASES, page 137-146. ACM, (2010)

HPC formulations of optimization algorithms for tensor completion., , and . Parallel Comput., (2018)