Author of the publication

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.

, , , , , , , and . ICCAD, page 1-9. IEEE, (2021)


Other publications of authors with the same name

Accommodating Transformer onto FPGA: Coupling the Balanced Model Compression and FPGA-Implementation Optimization., , , , , and . ACM Great Lakes Symposium on VLSI, page 163-168. ACM, (2021)

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization., , , , , , , and . ICCAD, page 1-9. IEEE, (2021)

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference., , , , , , , , , and 4 other author(s). CoRR, (2023)

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training., , , , , , , , and . ASPLOS (2), page 683-698. ACM, (2024)

AQ2PNN: Enabling Two-party Privacy-Preserving Deep Neural Network Inference with Adaptive Quantization., , , , , , , , and . MICRO, page 628-640. ACM, (2023)

Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search., , , , , , , , , and 2 other author(s). CoRR, (2021)

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads., , , , , , and . ICML, OpenReview.net, (2024)

An Automatic and Efficient BERT Pruning for Edge AI Systems., , , , , , , and . ISQED, page 1-6. IEEE, (2022)

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM., , , , , , , , , and 1 other author(s). ICCD, page 280-289. IEEE, (2022)

Binary Complex Neural Network Acceleration on FPGA: (Invited Paper)., , , , , , , , , and 2 other author(s). ASAP, page 85-92. IEEE, (2021)