A Cross-collection Mixture Model for Comparative Text Mining

ChengXiang Zhai, Atulya Velivelli, and Bei Yu. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 743--748. New York, NY, USA, ACM, (2004)
DOI: 10.1145/1014052.1014150

Abstract

In this paper, we define and study a novel text mining problem, which we refer to as Comparative Text Mining (CTM). Given a set of comparable text collections, the task of comparative text mining is to discover any latent common themes across all collections as well as summarize the similarity and differences of these collections along each common theme. This general problem subsumes many interesting applications, including business intelligence and opinion summarization. We propose a generative probabilistic mixture model for comparative text mining. The model simultaneously performs cross-collection clustering and within-collection clustering, and can be applied to an arbitrary set of comparable text collections. The model can be estimated efficiently using the Expectation-Maximization (EM) algorithm. We evaluate the model on two different text data sets (i.e., a news article data set and a laptop review data set), and compare it with a baseline clustering method also based on a mixture model. Experiment results show that the model is quite effective in discovering the latent common themes across collections and performs significantly better than our baseline mixture model.
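The abstract describes a mixture model in which each common theme is split into a part shared across all collections and a part specific to each collection, with parameters estimated by EM. The sketch below is a minimal, simplified illustration of that idea, not the authors' implementation: it omits the background model used in the paper, fixes the common/specific mixing weight `lam`, and the function name `ccmix_em` and all parameters are illustrative assumptions.

```python
import numpy as np

def ccmix_em(collections, n_themes=5, lam=0.7, n_iter=50, seed=0, eps=1e-12):
    """Minimal EM sketch for a simplified cross-collection mixture model.

    collections : list of document-term count matrices, one (n_docs, V) array
                  per collection, all sharing the same vocabulary of size V.
    Returns the common theme distributions, the per-collection specific
    distributions, and the per-document theme weights.
    """
    rng = np.random.default_rng(seed)
    V = collections[0].shape[1]
    M = len(collections)

    # Random initialisation of word distributions, normalised to sum to 1.
    common = rng.random((n_themes, V)); common /= common.sum(1, keepdims=True)
    specific = rng.random((M, n_themes, V)); specific /= specific.sum(2, keepdims=True)
    pi = [np.full((C.shape[0], n_themes), 1.0 / n_themes) for C in collections]

    for _ in range(n_iter):
        common_acc = np.zeros_like(common)
        specific_acc = np.zeros_like(specific)
        for m, C in enumerate(collections):
            # E-step: each theme mixes a shared and a collection-specific word model.
            mix = lam * common + (1 - lam) * specific[m]          # (K, V)
            joint = pi[m][:, :, None] * mix[None, :, :]           # (D, K, V)
            resp = joint / (joint.sum(1, keepdims=True) + eps)    # theme responsibilities
            # Fraction of each theme's probability mass owed to the common part.
            w_common = lam * common / (mix + eps)                 # (K, V)
            weighted = resp * C[:, None, :]                       # expected word counts
            common_acc += (weighted * w_common[None]).sum(0)
            specific_acc[m] += (weighted * (1 - w_common)[None]).sum(0)
            # M-step for the document-level theme weights.
            pi[m] = weighted.sum(2)
            pi[m] /= pi[m].sum(1, keepdims=True) + eps
        # M-step for the shared and collection-specific word distributions.
        common = common_acc / (common_acc.sum(1, keepdims=True) + eps)
        specific = specific_acc / (specific_acc.sum(2, keepdims=True) + eps)
    return common, specific, pi
```

Under these assumptions, `common` summarises what the collections share along each theme, while `specific[m]` captures how collection m differs on that theme, which is the comparative output the abstract describes.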
