Author of the publication

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks.

ECCV (30), volume 12375 of Lecture Notes in Computer Science, page 121-137. Springer, (2020)

Other publications of authors with the same name

Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging. CoRR, (2024)
An Universal Image Attractiveness Ranking Framework. WACV, page 657-665. IEEE, (2019)
Stacked Cross Attention for Image-Text Matching. ECCV (4), volume 11208 of Lecture Notes in Computer Science, page 212-228. Springer, (2018)
Image Scene Graph Generation (SGG) Benchmark. CoRR, (2021)
Electronic Structure Models: Solution Theory, Linear Scaling Methods, and Stability Analysis. University of California, San Diego, USA, (2014)
MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark. WACV, page 4849-4858. IEEE, (2023)
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. ECCV (30), volume 12375 of Lecture Notes in Computer Science, page 121-137. Springer, (2020)
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks. CVPR, page 4818-4829. IEEE, (2024)
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks. CoRR, (2023)
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models. NeurIPS, (2022)