Author of the publication

Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling.

, , , and . ISOCC, page 357-358. IEEE, (2021)

Other publications of authors with the same name

SQWA: Stochastic Quantized Weight Averaging For Improving The Generalization Capability Of Low-Precision Deep Neural Networks., , and . ICASSP, page 8052-8056. IEEE, (2021)
TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference., , , , and . SiPS, page 111-116. IEEE, (2021)
A synchronization scheme for multi-carrier CDMA systems., and . ICC, page 1330-1334. IEEE, (1998)
Time- and frequency-domain hybrid detection scheme for OFDM-CDMA systems., and . ICC, page 1531-1535. IEEE, (1998)
Korean Tokenization for Beam Search Rescoring in Speech Recognition., , and . ICEIC, page 1-4. IEEE, (2023)
Convolution-Based Attention Model With Positional Encoding For Streaming Speech Recognition On Embedded Devices., , and . SLT, page 30-37. IEEE, (2021)
Architecture exploration of a programmable neural network processor for embedded systems., and . SAMOS, page 124-131. IEEE, (2016)
An integrated hardware-software cosimulation environment for heterogeneous systems prototyping., , , , , , and . ASP-DAC, ACM, (1995)
Structured sparse ternary weight coding of deep neural networks for efficient hardware implementations., and . SiPS, page 1-6. IEEE, (2017)
On-Device End-to-end Speech Recognition with Multi-Step Parallel Rnns., , , and . SLT, page 376-381. IEEE, (2018)