Quantitative and qualitative evaluation of Darpa Communicator spoken dialogue systems
M. Walker, R. Passonneau, and J. Boland. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 515--522. Stroudsburg, PA, USA, Association for Computational Linguistics, (2001)
DOI: 10.3115/1073012.1073078
Abstract
This paper describes the application of the PARADISE evaluation framework to the corpus of 662 human-computer dialogues collected in the June 2000 Darpa Communicator data collection. We describe results based on the standard logfile metrics as well as results based on additional qualitative metrics derived using the DATE dialogue act tagging scheme. We show that performance models derived using the standard metrics can account for 37% of the variance in user satisfaction, and that the addition of DATE metrics improved the models by an absolute 5%.
%0 Conference Paper
%1 Walker:2001:QQE:1073012.1073078
%A Walker, Marilyn A.
%A Passonneau, Rebecca
%A Boland, Julie E.
%B Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics
%C Stroudsburg, PA, USA
%D 2001
%I Association for Computational Linguistics
%K course hci speech
%P 515--522
%R 10.3115/1073012.1073078
%T Quantitative and qualitative evaluation of Darpa Communicator spoken dialogue systems
%U https://doi.org/10.3115/1073012.1073078
%X This paper describes the application of the PARADISE evaluation framework to the corpus of 662 human-computer dialogues collected in the June 2000 Darpa Communicator data collection. We describe results based on the standard logfile metrics as well as results based on additional qualitative metrics derived using the DATE dialogue act tagging scheme. We show that performance models derived using the standard metrics can account for 37% of the variance in user satisfaction, and that the addition of DATE metrics improved the models by an absolute 5%.
@inproceedings{Walker:2001:QQE:1073012.1073078,
abstract = {This paper describes the application of the PARADISE evaluation framework to the corpus of 662 human-computer dialogues collected in the June 2000 Darpa Communicator data collection. We describe results based on the standard logfile metrics as well as results based on additional qualitative metrics derived using the DATE dialogue act tagging scheme. We show that performance models derived using the standard metrics can account for 37% of the variance in user satisfaction, and that the addition of DATE metrics improved the models by an absolute 5%.},
acmid = {1073078},
address = {Stroudsburg, PA, USA},
author = {Walker, Marilyn A. and Passonneau, Rebecca and Boland, Julie E.},
booktitle = {Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics},
description = {Quantitative and qualitative evaluation of Darpa Communicator spoken dialogue systems},
doi = {10.3115/1073012.1073078},
keywords = {course hci speech},
location = {Toulouse, France},
numpages = {8},
pages = {515--522},
publisher = {Association for Computational Linguistics},
series = {ACL '01},
title = {Quantitative and qualitative evaluation of Darpa Communicator spoken dialogue systems},
url = {https://doi.org/10.3115/1073012.1073078},
year = 2001
}