
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance.

ICML, volume 162 of Proceedings of Machine Learning Research, pp. 26809-26823. PMLR, (2022)


Other publications by persons with the same name

Homotopy Parametric Simplex Method for Sparse Learning. CoRR, (2017)
On Computation and Generalization of Generative Adversarial Networks under Spectrum Control. ICLR (Poster), OpenReview.net, (2019)
Spatial Resolution Enhancement of Remote Sensing Hyperspectral Images With Localized Spatial-Spectral Dictionary Pair. IEEE Access, (2020)
On Generalization Bounds of a Family of Recurrent Neural Networks. AISTATS, volume 108 of Proceedings of Machine Learning Research, pp. 1233-1243. PMLR, (2020)
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks. CoRR, (2022)
Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult. CoRR, (2023)
Differentially Private Estimation of Hawkes Process. CoRR, (2022)
Misspecified nonconvex statistical optimization for sparse phase retrieval. Math. Program., 176 (1-2): 545-571 (2019)
LightToken: A Task and Model-agnostic Lightweight Token Embedding Framework for Pre-trained Language Models. KDD, pp. 2302-2313. ACM, (2023)
QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction. CIKM, pp. 4362-4372. ACM, (2021)