Abstract
Reproducibility is a fundamental part of scientific progress. Compared to other scientific fields, the computational sciences are privileged: experimental setups can be preserved with ease, and regression experiments allow computational results to be validated by bitwise similarity. When information access systems are evaluated, the system users are often part of the experiments, be it explicitly in user studies or implicitly through evaluation measures. System-oriented Information Retrieval (IR) experiments are usually evaluated with effectiveness measures over batches of multiple queries. Whether an IR system has been successfully reproduced is then often determined by how well it approximates the averaged effectiveness of the original system that is being reproduced. Earlier work suggests that this naïve comparison of average effectiveness hides differences that exist between the original and reproduced systems. Most importantly, such differences can affect the recipients of the retrieval results, i.e., the system users. To this end, this work sheds light on the implications for users that may be neglected when a system-oriented IR experiment is prematurely considered reproduced. Based on simulated reimplementations with effectiveness comparable to the reference system, we show which differences are hidden behind averaged effectiveness scores. We discuss possible future directions and consider how these implications could be addressed with user simulations.
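To make the central claim concrete, the following is a minimal illustrative sketch (not the paper's actual experimental setup, and the per-topic scores are invented): two systems can have identical averaged effectiveness while differing substantially on individual topics, which is precisely the difference an individual user would experience.

```python
# Illustrative sketch (hypothetical data): two systems with identical mean
# effectiveness but different per-topic behaviour.
import statistics

# Hypothetical per-topic nDCG scores for an "original" and a "reimplemented" system.
original      = [0.80, 0.60, 0.40, 0.20, 0.50]
reimplemented = [0.50, 0.50, 0.50, 0.50, 0.50]

mean_orig = statistics.mean(original)       # 0.50
mean_repro = statistics.mean(reimplemented)  # 0.50
print(f"mean nDCG original:      {mean_orig:.2f}")
print(f"mean nDCG reimplemented: {mean_repro:.2f}")

# The averages match, yet the per-topic differences are substantial.
per_topic_delta = [abs(o - r) for o, r in zip(original, reimplemented)]
rmse = (sum(d * d for d in per_topic_delta) / len(per_topic_delta)) ** 0.5
print(f"max per-topic |delta|:   {max(per_topic_delta):.2f}")
print(f"per-topic RMSE:          {rmse:.2f}")
```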