Author of the publication

Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud.

FCCM, page 102-110. IEEE, (2020)


Other publications of authors with the same name

Sgap: Towards Efficient Sparse Tensor Algebra Compilation for GPU. CoRR, (2022)
NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring. ASP-DAC, page 645-650. ACM, (2023)
A Point Transformer Accelerator with Fine-Grained Pipelines and Distribution-Aware Dynamic FPS. ICCAD, page 1-9. IEEE, (2023)
GraphSAR: a sparsity-aware processing-in-memory architecture for large-scale graph processing on ReRAMs. ASP-DAC, page 120-126. ACM, (2019)
Exploiting Online Locality and Reduction Parallelism for Sampled Dense Matrix Multiplication on GPUs. ICCD, page 567-574. IEEE, (2021)
An Order Sampling Processing-in-Memory Architecture for Approximate Graph Pattern Mining. ACM Great Lakes Symposium on VLSI, page 357-362. ACM, (2020)
GraphSDH: A General Graph Sampling Framework with Distribution and Hierarchy. HPEC, page 1-7. IEEE, (2020)
A one-for-all and o(v log(v))-cost solution for parallel merge style operations on sorted key-value arrays. ASPLOS, page 669-682. ACM, (2022)
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better. CoRR, (2024)
HetHub: A Heterogeneous distributed hybrid training system for large-scale models. CoRR, (2024)