Author of the publication

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions.

AAAI, pages 19071-19079. AAAI Press, (2024)

Other publications of authors with the same name

How Well Do Vision Models Encode Diagram Attributes? ACL (Student Research Workshop), pages 564-575. Association for Computational Linguistics, (2024)

Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection. CoRR, (2024)

Different Modal Stereo: Simultaneous Estimation of Stereo Image Disparity and Modality Translation. VISIGRAPP (4: VISAPP), pages 554-560. SCITEPRESS, (2020)

VisualMRC: Machine Reading Comprehension on Document Images. AAAI, pages 13878-13888. AAAI Press, (2021)

3D Pose-Based Temporal Action Segmentation for Figure Skating: A Fine-Grained and Jump Procedure-Aware Annotation Approach. MMSports@MM, pages 17-26. ACM, (2024)

Automatic Edge Error Judgment in Figure Skating Using 3D Pose Estimation from Inertial Sensors. GCCE, pages 1099-1100. IEEE, (2023)

Pseudo-label based unsupervised fine-tuning of a monocular 3D pose estimation model for sports motions. CVPR Workshops, pages 3315-3324. IEEE, (2024)

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images. AAAI, pages 13636-13645. AAAI Press, (2023)

Automatic Edge Error Judgment in Figure Skating Using 3D Pose Estimation from a Monocular Camera and IMUs. MMSports@MM, pages 41-48. ACM, (2023)