Author of the publication

End-to-End Learning of Visual Representations From Uncurated Instructional Videos.

CVPR, page 9876-9886. Computer Vision Foundation / IEEE, (2020)

Other publications of authors with the same name

Just Ask: Learning to Answer Questions from Millions of Narrated Videos. CoRR, (2020)
Large-scale Learning from Video and Natural Language. (Apprentissage vidéo et langage naturel à grande échelle). PSL Research University, Paris, France, (2020)
Perception Test: A Diagnostic Benchmark for Multimodal Video Models. NeurIPS, (2023)
Zorro: the masked multimodal transformer. CoRR, (2023)
Flamingo: a Visual Language Model for Few-Shot Learning. NeurIPS, (2022)
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. NeurIPS, (2022)
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. CVPR, page 10714-10726. IEEE, (2023)
End-to-End Learning of Visual Representations from Uncurated Instructional Videos. CoRR, (2019)
End-to-End Learning of Visual Representations From Uncurated Instructional Videos. CVPR, page 9876-9886. Computer Vision Foundation / IEEE, (2020)
Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers. CVPR, page 9826-9836. Computer Vision Foundation / IEEE, (2021)