Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

 

Other publications of authors with the same name

On the curvature of the loss landscape. CoRR, (2023)
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers. CoRR, (2023)
Transformer Fusion with Optimal Transport. CoRR, (2023)
GLOSS: Generative Latent Optimization of Sentence Representations. CoRR, (2019)
SL - FII: Syntactic and Lexical Constraints with Frequency based Iterative Improvement for Disease Mention Recognition in News Headlines. BAI@IJCAI, volume 1718 of CEUR Workshop Proceedings, page 28-34. CEUR-WS.org, (2016)
Model Fusion via Optimal Transport. CoRR, (2019)
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers (Student Abstract). AAAI, page 23477-23479. AAAI Press, (2024)
The Hessian perspective into the Nature of Convolutional Neural Networks. ICML, volume 202 of Proceedings of Machine Learning Research, page 31930-31968. PMLR, (2023)
Model Fusion via Optimal Transport. NeurIPS, (2020)
Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse. NeurIPS, (2022)