
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning.

ACL (1), pp. 14507-14525. Association for Computational Linguistics, (2023)


Other publications by persons with the same name

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation. CoRR, (2023)

Trace Controlled Text to Image Generation. ECCV (36), volume 13696 of Lecture Notes in Computer Science, pp. 59-75. Springer, (2022)

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation. NAACL-HLT (Findings), pp. 1589-1600. Association for Computational Linguistics, (2022)

ReCo: Region-Controlled Text-to-Image Generation. CVPR, pp. 14246-14255. IEEE, (2023)

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis. NeurIPS, (2022)

BridgeTower: Building Bridges between Encoders in Vision-Language Representation Learning. AAAI, pp. 10637-10647. AAAI Press, (2023)

GEM: A General Evaluation Benchmark for Multimodal Tasks. ACL/IJCNLP (Findings), volume ACL/IJCNLP 2021 of Findings of ACL, pp. 2594-2603. Association for Computational Linguistics, (2021)

Sequential Visual Reasoning for Visual Question Answering. CCIS, pp. 410-415. IEEE, (2018)

LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models. (2023). arXiv:2309.09506.