Author of the publication

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.

CVPR, page 6077-6086. Computer Vision Foundation / IEEE Computer Society, (2018)

Other publications of authors with the same name

Hand parsing for fine-grained recognition of human grasps in monocular images. IROS, page 5052-5058. IEEE, (2015)
Unshuffling Data for Improved Generalization in Visual Question Answering. ICCV, page 1397-1407. IEEE, (2021)
Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models. ICCV, page 2105-2114. IEEE, (2021)
ID and OOD Performance Are Sometimes Inversely Correlated on Real-world Datasets. CoRR, (2022)
Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks. CoRR, (2023)
Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization. CVPR, page 16740-16751. IEEE, (2022)
Vision-Language Pretraining: Current Trends and the Future. ACL (tutorial), page 38-43. Association for Computational Linguistics, (2022)
Segmentation of Dynamic Scenes with Distributions of Spatiotemporally Oriented Energies. BMVC, BMVA Press, (2014)
Visual Question Answering as a Meta Learning Task. ECCV (15), volume 11219 of Lecture Notes in Computer Science, page 229-245. Springer, (2018)
Learning to Extract Motion from Videos in Convolutional Neural Networks. ACCV (5), volume 10115 of Lecture Notes in Computer Science, page 412-428. Springer, (2016)