Inproceedings,

A Simple LNRE Model for Random Character Sequences

Proceedings of JADT 2004, pages 411--422. (2004)

Abstract

This paper describes a population model for word frequency distributions based on the Zipf-Mandelbrot law, corresponding to the word frequency distribution induced by a random character sequence. The model, which has convenient analytical and numerical properties, is shown to be adequate for the description of language data extracted by automatic means from large text corpora. It can thus be used to study the problems faced by the statistical analysis of such data in the field of natural-language processing.
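As a rough illustration of the setting named in the abstract, the sketch below generates a random character sequence, splits it into "words" at a separator symbol, and ranks the resulting word frequencies. It is not the paper's model or code: the function name `random_text_word_frequencies`, the three-letter alphabet, the separator probability of 0.25, and the sequence length are all assumptions chosen only to keep the example small.

```python
import random
from collections import Counter


def random_text_word_frequencies(n_chars, alphabet="abc", space_prob=0.25, seed=0):
    """Count the 'words' in a random character sequence.

    Each position is a separator with probability `space_prob` (assumed value),
    otherwise a letter drawn uniformly from `alphabet` (assumed alphabet).
    Empty words (two separators in a row) are ignored.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter()
    current = []
    for _ in range(n_chars):
        if rng.random() < space_prob:
            if current:  # a separator closes the current word
                counts["".join(current)] += 1
                current = []
        else:
            current.append(rng.choice(alphabet))
    if current:  # flush the last word if the sequence ends mid-word
        counts["".join(current)] += 1
    return counts


freqs = random_text_word_frequencies(200_000)

# Rank-frequency profile: word types sorted by decreasing frequency.
ranked = freqs.most_common()
for rank, (word, freq) in enumerate(ranked[:10], start=1):
    print(rank, word, freq)
```

In such a sequence shorter words are more probable than longer ones, so the single-letter types occupy the top ranks and frequencies fall off in steps as word length grows. That stepwise decay is the kind of rank-frequency pattern the abstract associates with the Zipf-Mandelbrot law.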
