Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

Other publications of authors with the same name

Unifying Multimodal Transformer for Bi-directional Image and Text Generation. ACM Multimedia, page 1138-1147. ACM, (2021)
Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning. CVPR, page 12976-12985. Computer Vision Foundation / IEEE, (2021)
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training. NeurIPS, page 4514-4528. (2021)
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. ACM Multimedia, page 4083-4091. ACM, (2022)
Be Specific, Be Clear: Bridging Machine and Human Captions by Scene-Guided Transformer. MMPT@ICMR, page 4-13. ACM, (2021)
Decoupling Localization and Classification in Single Shot Temporal Action Detection. ICME, page 1288-1293. IEEE, (2019)
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering. CoRR, (2023)
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training. CoRR, (2021)
A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation. ACM Multimedia, page 2792-2794. ACM, (2021)