Headline Generation Using a Training Corpus

Lecture Notes in AI, Second International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, February 18 to 24, 2001. Springer (2001)

Abstract

This paper discusses fundamental issues in word selection for title generation. We review several common title generation methods and compare their performance using an F1 metric. Two methods outperform the others evaluated, each with an F1 score above 20%: a KNN (k nearest neighbor) method, which we are the first to apply to title generation, and a limited-vocabulary Naïve Bayesian method. We conclude that KNN is a simple and promising method for title generation, provided that strong content overlap exists between the training corpus and the test collection. We also point out ways to improve performance on both the learning side and the generation side.
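
As a rough illustration of the kind of F1 comparison the abstract mentions, the sketch below scores a generated title against a reference title by word overlap. The abstract does not specify how the paper tokenizes titles or counts matches, so the function name `title_f1`, the lowercase whitespace tokenization, and the multiset overlap are all assumptions made for this example rather than the authors' procedure.

```python
from collections import Counter


def title_f1(generated: str, reference: str) -> float:
    """Word-overlap F1 between a generated title and a reference title.

    Assumption: tokens are lowercased whitespace-separated words, and
    repeated words are matched as a multiset. This is an illustrative
    choice, not a procedure taken from the paper.
    """
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())

    # Number of word occurrences shared by both titles.
    overlap = sum((gen & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(gen.values())  # shared / generated words
    recall = overlap / sum(ref.values())     # shared / reference words
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = title_f1(
        "headline generation with corpus",
        "headline generation using a training corpus",
    )
    print(f"F1 = {score:.2f}")
```

Here three of the four generated words appear in the six-word reference, giving a precision of 0.75 and a recall of 0.5. A corpus-level score would typically average such per-title values over the test collection, though the abstract does not say which averaging the paper uses.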
