Author of the publication

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

Advances in Neural Information Processing Systems, 32, Curran Associates, Inc. (2019)


Other publications of authors with the same name

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning. CoRR (2016)

Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition. CoRL, volume 87 of Proceedings of Machine Learning Research, page 63-80. PMLR (2018)

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation. ICLR (Poster), OpenReview.net (2019)

A Simple Long-Tailed Recognition Baseline via Vision-Language Model. CoRR (2021)

Transferable Feature Learning on Graphs Across Visual Domains. ICME, page 1-6. IEEE (2021)

Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. CVPR, page 3242-3250. IEEE Computer Society (2017)

Container: Context Aggregation Network. CoRR (2021)

Multi-Modal Answer Validation for Knowledge-Based VQA. AAAI, page 2712-2721. AAAI Press (2022)

UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks. ICLR, OpenReview.net (2023)

MERLOT RESERVE: Neural Script Knowledge through Vision and Language and Sound. CVPR, page 16354-16366. IEEE (2022)