Dimensionality reduction for bag-of-words models: PCA vs LSA

Abstract

We study a collection of texts represented as “bags of words” and implement two methods for reducing the dimension of the data: principal component analysis (PCA) and latent semantic analysis (LSA). We then compare how easily authorship can be identified from the dimensionally reduced data produced by each method.

BibTeX key: noauthororeditor
entry type: mastersthesis
year: 2017
institution: Stanford

