Author of the publication

TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training.

AAAI, page 2489-2497. AAAI Press, (2024)


Other publications of authors with the same name

Exploring Global Diversity and Local Context for Video Summarization. IEEE Access, (2022)

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration. CoRR, (2023)

Temporal Cue Guided Video Highlight Detection with Low-Rank Audio-Visual Fusion. ICCV, page 7930-7939. IEEE, (2021)

COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment. ACM Multimedia, page 4480-4491. ACM, (2023)

UniQRNet: Unifying Referring Expression Grounding and Segmentation with QRNet. ACM Trans. Multim. Comput. Commun. Appl., 20 (8): 246:1-246:28 (August 2024)

TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training. AAAI, page 2489-2497. AAAI Press, (2024)

Learning Trajectory-Word Alignments for Video-Language Tasks. ICCV, page 2504-2514. IEEE, (2023)

Transforming Visual Scene Graphs to Image Captions. ACL (1), page 12427-12440. Association for Computational Linguistics, (2023)

Evaluation and Analysis of Hallucination in Large Vision-Language Models. CoRR, (2023)

mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM. ACM Multimedia, page 9365-9367. ACM, (2023)