Perspectives on Large Language Models for Relevance Judgment
G. Faggioli, C. L. A. Clarke, G. Demartini, M. Hagen, C. Hauff, N. Kando, E. Kanoulas, M. Potthast, B. Stein, H. Wachsmuth, and L. Dietz. ICTIR '23, pages 39--50. Association for Computing Machinery, Inc, (Aug 9, 2023).
DOI: 10.48550/arXiv.2304.09161
Abstract
When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments along with concerns and issues that arise. We devise a human–machine collaboration spectrum that allows us to categorize different relevance judgment strategies, based on how much humans rely on machines. For the extreme point of 'fully automated judgments', we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing opposing perspectives for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers.
Funding Information: This material is based upon work supported by the National Science Foundation under Grant No. 1846017. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. 9th ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2023); conference date: 23-07-2023 through 23-07-2023.
%0 Conference Paper
%1 7582791698b742b2a1dffcd353510682
%A Faggioli, Guglielmo
%A Clarke, Charles L.A.
%A Demartini, Gianluca
%A Hagen, Matthias
%A Hauff, Claudia
%A Kando, Noriko
%A Kanoulas, Evangelos
%A Potthast, Martin
%A Stein, Benno
%A Wachsmuth, Henning
%A Dietz, Laura
%B ICTIR '23
%D 2023
%I Association for Computing Machinery, Inc
%K #bpa #sys:relevantfor:l3s myown nlp
%P 39--50
%R 10.48550/arXiv.2304.09161
%T Perspectives on Large Language Models for Relevance Judgment
%X When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments along with concerns and issues that arise. We devise a human–machine collaboration spectrum that allows us to categorize different relevance judgment strategies, based on how much humans rely on machines. For the extreme point of 'fully automated judgments', we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing opposing perspectives for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers.
@inproceedings{7582791698b742b2a1dffcd353510682,
abstract = {When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments along with concerns and issues that arise. We devise a human–machine collaboration spectrum that allows us to categorize different relevance judgment strategies, based on how much humans rely on machines. For the extreme point of 'fully automated judgments', we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing opposing perspectives for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers.},
added-at = {2024-02-13T13:22:52.000+0100},
author = {Faggioli, Guglielmo and Clarke, {Charles L.A.} and Demartini, Gianluca and Hagen, Matthias and Hauff, Claudia and Kando, Noriko and Kanoulas, Evangelos and Potthast, Martin and Stein, Benno and Wachsmuth, Henning and Dietz, Laura},
biburl = {https://www.bibsonomy.org/bibtex/2e15a532abc177cb7f6ddca12740f905c/ail3s},
booktitle = {ICTIR '23},
day = 9,
doi = {10.48550/arXiv.2304.09161},
interhash = {ee7c0af44c4c4c605f9f7449aee52dae},
intrahash = {e15a532abc177cb7f6ddca12740f905c},
keywords = {#bpa #sys:relevantfor:l3s myown nlp},
language = {English},
month = aug,
note = {Funding Information: This material is based upon work supported by the National Science Foundation under Grant No. 1846017. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. 9th ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2023); conference date: 23-07-2023 through 23-07-2023},
pages = {39--50},
publisher = {Association for Computing Machinery, Inc},
timestamp = {2024-02-27T12:17:48.000+0100},
title = {Perspectives on Large Language Models for Relevance Judgment},
year = 2023
}