Author of the publication

DRStencil: Exploiting Data Reuse within Low-order Stencil on GPU.

HPCC/DSS/SmartCity/DependSys, pages 63-70. IEEE, (2021)


Other publications of authors with the same name

Efficient detection of silent data corruption in HPC applications with synchronization-free message verification. J. Supercomput., 78 (1): 1381-1408 (2022)
SMGuard: A Flexible and Fine-Grained Resource Management Framework for GPUs. IEEE Trans. Parallel Distributed Syst., 29 (12): 2849-2862 (2018)
Performance-Aware Based Correlated Datasets Replication Strategy. ISCTCS, volume 520 of Communications in Computer and Information Science, pages 322-327. Springer, (2014)
Energy Efficiency Evaluation of Workload Execution on Intel Xeon Phi Coprocessor. ISCTCS, volume 426 of Communications in Computer and Information Science, pages 268-275. Springer, (2013)
swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture. CoRR, (2019)
Performance Evaluation and Analysis of Linear Algebra Kernels in the Prototype Tianhe-3 Cluster. SCFA, volume 11416 of Lecture Notes in Computer Science, pages 86-105. Springer, (2019)
An optimized tensor completion library for multiple GPUs. ICS, pages 417-430. ACM, (2021)
Toward accelerated stencil computation by adapting tensor core unit on GPU. ICS, pages 28:1-28:12. ACM, (2022)
SparkOT: Diagnosing Operation Level Inefficiency in Spark. HPCC/SmartCity/DSS, pages 692-699. IEEE, (2018)
L-DAG: Enabling Loopy Workflow in Scientific Application with Automatic DAG Transformation. DASC/PiCom/DataCom/CyberSciTech, pages 946-953. IEEE, (2019)