Author of the publication

Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model.

, , , , , and . COLING, page 6883-6893. International Committee on Computational Linguistics, (2020)


Other publications of authors with the same name

Adversarial Dropout for Supervised and Semi-supervised Learning., , , and . CoRR, (2017)
Contrastive Learning for Knowledge Tracing., , , , and . WWW, page 2330-2338. ACM, (2022)
Adversarial Dropout for Supervised and Semi-Supervised Learning., , , and . AAAI, page 3917-3924. AAAI Press, (2018)
SWAD: Domain Generalization by Seeking Flat Minima., , , , , , and . NeurIPS, page 22405-22418. (2021)
Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching., , and . AAAI, page 7945-7952. AAAI Press, (2021)
Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model., , , , , and . COLING, page 6883-6893. International Committee on Computational Linguistics, (2020)
Character Region Attention for Text Spotting., , , , , , and . ECCV (29), volume 12374 of Lecture Notes in Computer Science, page 504-521. Springer, (2020)
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents., , , , , and . AAAI, page 10767-10775. AAAI Press, (2022)
Supervised Dynamic Topic Models for Associative Topic Extraction with A Numerical Time Series., , and . TM@CIKM, page 49-54. ACM, (2015)
Adversarial Dropout for Recurrent Neural Networks., , , , and . AAAI, page 4699-4706. AAAI Press, (2019)