Author of the publication

A 95.6-TOPS/W Deep Learning Inference Accelerator With Per-Vector Scaled 4-bit Quantization in 5 nm.

, , , , , , , , and . IEEE J. Solid State Circuits, 58 (4): 1129-1141 (2023)

Other publications of authors with the same name

Area-efficient pipelining for FPGA-targeted high-level synthesis., , , and . DAC, page 157:1-157:6. ACM, (2015)
A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation., , and . FPGA, page 137-146. ACM, (2018)
Accelerating Chip Design With Machine Learning., , , , , , , , , and 1 other author(s). IEEE Micro, 40 (6): 23-32 (2020)
VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference., , , , , and . CoRR, (2021)
Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests., , , , and . IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 36 (11): 1817-1830 (2017)
Enabling adaptive loop pipelining in high-level synthesis., , , and . ACSSC, page 131-135. IEEE, (2017)
Softermax: Hardware/Software Co-Design of an Efficient Softmax for Transformers., , , , and . DAC, page 469-474. IEEE, (2021)
Efficient Transformer Inference with Statically Structured Sparse Attention., , , and . DAC, page 1-6. IEEE, (2023)
High-level synthesis with timing-sensitive information flow enforcement., , , and . ICCAD, page 88. ACM, (2018)
Mapping-Aware Constrained Scheduling for LUT-Based FPGAs., , , and . FPGA, page 190-199. ACM, (2015)