Author of the publication

Just Ask: Learning to Answer Questions from Millions of Narrated Videos.

ICCV, page 1666-1677. IEEE, (2021)


Other publications of authors with the same name

Just Ask: Learning to Answer Questions from Millions of Narrated Videos. CoRR, (2020)
Large-scale Learning from Video and Natural Language. (Apprentissage vidéo et langage naturel à grande échelle). PSL Research University, Paris, France, (2020)
Zorro: the masked multimodal transformer. CoRR, (2023)
Perception Test: A Diagnostic Benchmark for Multimodal Video Models. NeurIPS, (2023)
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. CVPR, page 10714-10726. IEEE, (2023)
Flamingo: a Visual Language Model for Few-Shot Learning. NeurIPS, (2022)
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. NeurIPS, (2022)
End-to-End Learning of Visual Representations from Uncurated Instructional Videos. CoRR, (2019)
Just Ask: Learning to Answer Questions from Millions of Narrated Videos. ICCV, page 1666-1677. IEEE, (2021)
Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers. CVPR, page 9826-9836. Computer Vision Foundation / IEEE, (2021)