Author of the publication

Coherent Multi-sentence Video Description with Variable Level of Detail.

GCPR, volume 8753 of Lecture Notes in Computer Science, page 184-195. Springer, (2014)

Other publications of authors with the same name

Efficient Lifelong Learning with A-GEM. ICLR (Poster), OpenReview.net, (2019)

High-Level Fusion of Depth and Intensity for Pedestrian Classification. DAGM-Symposium, volume 5748 of Lecture Notes in Computer Science, page 101-110. Springer, (2009)

FLAVA: A Foundational Language And Vision Alignment Model. CVPR, page 15617-15629. IEEE, (2022)

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. EMNLP, page 457-468. The Association for Computational Linguistics, (2016)

Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly. ECCV (36), volume 13696 of Lecture Notes in Computer Science, page 148-166. Springer, (2022)

Improving Selective Visual Question Answering by Learning from Your Peers. CVPR, page 24049-24059. IEEE, (2023)

Modeling Relationships in Referential Expressions with Compositional Modular Networks. CVPR, page 4418-4427. IEEE Computer Society, (2017)

The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval. CoRR, (2024)

Graph-Based Global Reasoning Networks. CoRR, (2018)

TextCaps: A Dataset for Image Captioning with Reading Comprehension. ECCV (2), volume 12347 of Lecture Notes in Computer Science, page 742-758. Springer, (2020)