Author of the publication

POPA: Expressing High and Portable Performance across Spatial and Vector Architectures for Tensor Computations.

, , , , , and . FPGA, page 199-210. ACM, (2024)

Other publications of authors with the same name

Quantitative identification dust and sand storm using MODIS data., , and . IGARSS, page 3630-3633. IEEE, (2005)

CuMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs., , , and . HPDC, page 79-92. ACM, (2017)

A New Fault Detection Method of Induction Motor., and . AICI (2), volume 6320 of Lecture Notes in Computer Science, page 1-8. Springer, (2010)

A component-driven distributed framework for real-time video dehazing., , , , , and . Multimedia Tools Appl., 77 (9): 11259-11276 (2018)

An efficient compiler framework for cache bypassing on GPUs., , , and . ICCAD, page 516-523. IEEE, (2013)

Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices., and . ICCAD, page 1-6. ACM, (2019)

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow., , , , , , and . IEEE Trans. Very Large Scale Integr. Syst., 24 (6): 2220-2233 (2016)

A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM., , , , , , , , and . CoRR, (2018)

Automatic Generation of Spatial Accelerator for Tensor Algebra., , , and . IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 42 (6): 1898-1911 (June 2023)

Drug target interaction prediction via multi-task co-attention., , , , and . Int. J. Data Min. Bioinform., 24 (2): 160-176 (2020)