
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations.

EMNLP, pp. 9758-9767. Association for Computational Linguistics, (2022)


Other publications by persons with the same name

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention. INTERSPEECH, pp. 231-235. ISCA, (2019)
Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules. NeurIPS, (2022)
A Modern Self-Referential Weight Matrix That Learns to Modify Itself. ICML, volume 162 of Proceedings of Machine Learning Research, pp. 9660-9677. PMLR, (2022)
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention. ICML, volume 162 of Proceedings of Machine Learning Research, pp. 9639-9659. PMLR, (2022)
Linear Transformers Are Secretly Fast Weight Programmers. ICML, volume 139 of Proceedings of Machine Learning Research, pp. 9355-9366. PMLR, (2021)
The Rwth Asr System for Ted-Lium Release 2: Improving Hybrid Hmm With Specaugment. ICASSP, pp. 7839-7843. IEEE, (2020)
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions. EMNLP, pp. 9455-9465. Association for Computational Linguistics, (2023)
MoEUT: Mixture-of-Experts Universal Transformers. CoRR, (2024)
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition. INTERSPEECH, pp. 3800-3804. ISCA, (2019)
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization. ICLR, OpenReview.net, (2022)