Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

 

Other publications of authors with the same name

Masked AutoDecoder is Effective Multi-Task Vision Generalist. CoRR, (2024)

BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision. CVPR, pages 17830-17839. IEEE, (2023)

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks. CoRR, (2023)

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text. CoRR, (2024)

Weakly Supervised Monocular 3D Detection with a Single-View Image. CoRR, (2024)

Scene as Occupancy. CoRR, (2023)

Exploring the Potential of Flexible 8-bit Format: Design and Algorithm. CoRR, (2023)

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory. (2023)

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications. CoRR, (2024)

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks. CoRR, (2024)