Author of the publication

The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers

, , and . Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, (2021)
DOI: 10.18653/v1/2021.emnlp-main.49


Other publications of authors with the same name

Multi-dimensional Recurrent Neural Networks, , and . Proc. International Conf. on Artificial Neural Networks (ICANN-2007), 4668, page 865-873. Springer, Berlin, (2007)

Solving Deep Memory POMDPs with Recurrent Policy Gradients., , , and . ICANN (1), volume 4668 of Lecture Notes in Computer Science, page 697-706. Springer, (2007)

Modeling Non-Linear Dynamical Systems with Evolino, , and . Proceedings of the Genetic Evolutionary Computation Conference (GECCO-05), Berlin; New York, Springer-Verlag, (2005)

A General Method for Incremental Self-Improvement and Multi-agent Learning in Unrestricted Environments. Evolutionary Computation: Theory and Applications, Scientific Publishing Company, (1996)

Recurrent Policy Gradients, , , and . Journal of Algorithms, (in press)

A possibility for implementing curiosity and boredom in model-building neural controllers. Proceedings of the International Conference on Simulation of Adaptive Behavior, page 222-227. MIT Press/Bradford Books, (1991)

Co-Evolving Recurrent Neurons Learn Deep Memory POMDPs, and . 17--04. IDSIA, Lugano, Switzerland, (December 2004)

Learning to predict through Probabilistic Incremental Program Evolution and automatic task decomposition, and . Technical Report, IDSIA-11-98. IDSIA, Switzerland, (1998)

H-PIPE: Facilitating Hierarchical Program Evolution Through Skip Nodes, and . Technical Report, IDSIA-8-98. IDSIA, Switzerland, (1998)