Author of the publication

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.

ICPP, page 34:1-34:12. ACM, (2021)

Other publications of authors with the same name

The Deep Learning Compiler: A Comprehensive Survey. CoRR, (2020)
Improving Thread-level Parallelism in GPUs Through Expanding Register File to Scratchpad Memory. ACM Trans. Archit. Code Optim., 15 (4): 48:1-48:24 (2019)
Towards Optimized Streaming Tensor Completion on multiple GPUs. HPCC/DSS/SmartCity/DependSys, page 1123-1128. IEEE, (2022)
Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors. ICPP, page 34:1-34:12. ACM, (2021)
CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs. SC, page 39:1-39:15. IEEE, (2022)
StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs. IPDPS, page 875-885. IEEE, (2022)
SpTFS: sparse tensor format selection for MTTKRP via deep learning. SC, page 18. IEEE/ACM, (2020)
QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU. Parallel Comput., (2022)
An optimized tensor completion library for multiple GPUs. ICS, page 417-430. ACM, (2021)
Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs. IEEE Trans. Parallel Distributed Syst., 35 (1): 20-33 (January 2024)