Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

 

Other publications of authors with the same name

Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design. ICLR (Poster), OpenReview.net, (2018)

Limits to Depth Efficiencies of Self-Attention. NeurIPS, (2020)

PMI-Masking: Principled masking of correlated spans. ICLR, OpenReview.net, (2021)

SenseBERT: Driving Some Sense into BERT. ACL, pages 4656-4667. Association for Computational Linguistics, (2020)

Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks. ICLR, OpenReview.net, (2023)

Benefits of Depth for Long-Term Memory of Recurrent Networks. ICLR (Workshop), OpenReview.net, (2018)

Bridging Many-Body Quantum Physics and Deep Learning via Tensor Networks. arXiv:1803.09780, (2018)

Parallel Context Windows for Large Language Models. ACL (1), pages 6383-6402. Association for Computational Linguistics, (2023)

Which transformer architecture fits my data? A vocabulary bottleneck in self-attention. ICML, volume 139 of Proceedings of Machine Learning Research, pages 11170-11181. PMLR, (2021)

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design. ICLR, OpenReview.net, (2022)