Inproceedings,

Categorizing gigabytes: experiments on the RCV1 Corpus

Proc. of the 6th Int. Symp. of Hungarian Researchers on Computational Intelligence (HUCI 2005), pages 267--276. Budapest, Hungary (November 2005)

Abstract

This paper presents categorization results obtained with the HITEC categorizer tool on the Reuters Corpus Volume 1 (RCV1), the new benchmark document collection for text categorization. RCV1 is an archive of over 800,000 manually categorized newswire stories made available by Reuters in 2000 for research purposes. The collection was released to take the place of the Reuters-21578 collection, which has been widely used in the text retrieval community. This paper intends to add some interesting results to the characterization of RCV1 and of the HITEC categorizer.
