Author of the publication

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising.

ACM Multimedia, pages 5600-5608. ACM, (2021)

Other publications of authors with the same name

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning. AAAI, pages 8167-8174. AAAI Press, (2019)
Flexible User Duplexing in Cell-Free Massive MIMO: A Deep Reinforcement Learning Approach. ICCC, pages 296-301. IEEE, (2022)
Jointly Localizing and Describing Events for Dense Video Captioning. CVPR, pages 7492-7500. Computer Vision Foundation / IEEE Computer Society, (2018)
Pointing Novel Objects in Image Captioning. CVPR, pages 12497-12506. Computer Vision Foundation / IEEE, (2019)
Boosting Image Captioning with Attributes. ICCV, pages 4904-4912. IEEE Computer Society, (2017)
Exploring Depth Information for Spatial Relation Recognition. MIPR, pages 279-284. IEEE, (2020)
Contextual Transformer Networks for Visual Recognition. CoRR, (2021)
Dual Vision Transformer. CoRR, (2022)
Semantic-Conditional Diffusion Networks for Image Captioning. CVPR, pages 23359-23368. IEEE, (2023)
CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising. ACM Multimedia, pages 5600-5608. ACM, (2021)