Author of the publication

II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering.

ACL (Findings), page 10698-10709. Association for Computational Linguistics, (2024)

Other publications of authors with the same name

CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs. CoRR, (2024)

GPT-4V(ision) is a Generalist Web Agent, if Grounded. ICML, OpenReview.net, (2024)

PreSTU: Pre-Training for Scene-Text Understanding. CoRR, (2022)

Dual-View Visual Contextualization for Web Navigation. CVPR, page 14445-14454. IEEE, (2024)

Retrospective Analysis of EHR and Administrative Data for Drug Repurposing Hypothesis Evaluation in Melanoma. CRI, AMIA, (2017)

Revisiting Document Representations for Large-Scale Zero-Shot Learning. NAACL-HLT, page 3117-3128. Association for Computational Linguistics, (2021)

ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback. CoRR, (2024)

One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones. CVPR, page 15461-15470. IEEE, (2022)

Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering. EMNLP (1), page 6346-6361. Association for Computational Linguistics, (2021)

II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering. ACL (Findings), page 10698-10709. Association for Computational Linguistics, (2024)