Author of the publication

Zero-Shot Translation of Attention Patterns in VQA Models to Natural Language.

DAGM, volume 14264 of Lecture Notes in Computer Science, pages 378-393. Springer, (2023)


Other publications of authors with the same name

Sight to Sound: An End-to-End Approach for Visual Piano Transcription. ICASSP, pages 1838-1842. IEEE, (2020)
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts. ICCV, pages 15700-15711. IEEE, (2023)
Zero-Shot Translation of Attention Patterns in VQA Models to Natural Language. DAGM, volume 14264 of Lecture Notes in Computer Science, pages 378-393. Springer, (2023)
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model. ICLR, OpenReview.net, (2024)
A Sound Approach: Using Large Language Models to Generate Audio Descriptions for Egocentric Text-Audio Retrieval. ICASSP, pages 7300-7304. IEEE, (2024)
Where and When: Space-Time Attention for Audio-Visual Explanations. CoRR, (2021)
Self-supervised learning of a facial attribute embedding from video. BMVC, page 302. BMVA Press, (2018)
Audiovisual Generalised Zero-shot Learning with Cross-modal Attention and Language. CVPR, pages 10543-10553. IEEE, (2022)
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning. CVPR, pages 7016-7025. Computer Vision Foundation / IEEE, (2021)
Temporal and Cross-modal Attention for Audio-Visual Zero-Shot Learning. ECCV (20), volume 13680 of Lecture Notes in Computer Science, pages 488-505. Springer, (2022)