
Scaling to very very large corpora for natural language disambiguation

Michele Banko and Eric Brill. ACL '01: Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, pages 26-33. Morristown, NJ, USA, Association for Computational Linguistics, (2001)
DOI: http://dx.doi.org/10.3115/1073012.1073017


With a billion-word corpus, the choice of learning algorithm matters far less than the amount of training data, so you can skip most of your clever tricks. Active learning also works better with a huge data set, because there are more interesting examples to pick from.


