The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for the Indri search engine, the Lemur Toolbar, and the ClueWeb09 dataset. Our software and datasets are widely used in scientific and research applications, as well as in some commercial applications.
The Lemur Project's software development philosophy emphasizes state-of-the-art accuracy, flexibility, and efficiency. For example, the Indri search engine provides accurate search over large text collections 'out of the box', and it stores data in an accessible form to support the development of new retrieval strategies. Software from the Lemur Project is distributed under open-source licenses that provide flexibility to scientists and software developers.