The Lemur Toolkit
for Language Modeling and Information Retrieval
The Lemur Toolkit is an open-source toolkit designed to facilitate research in language modeling and information retrieval. Lemur supports a wide range of industrial and research language applications such as ad-hoc retrieval, site search, and text mining.
The toolkit supports indexing of large-scale text databases, the construction of simple language models for documents, queries, or subcollections, and the implementation of retrieval systems based on language models as well as a variety of other retrieval models. The system is written in C and C++ and is designed as a research system for Unix operating systems, although it also runs under Windows.
As part of the migration of portions of the Lemur Toolkit to SourceForge, we have recently opened up bug tracking, feature requests, and support requests so that you can directly submit these items to us, the developers.
Browsing of bugs, feature requests, and support requests is open to anyone, but to add a bug, feature request, or support request you need an account on SourceForge. If you need to create an account, you can create one here.
- Submit a bug or browse the list of open bugs
- Submit a new feature request or browse the current list of feature requests
- Submit a new request for support or browse the current support requests
News | News and announcements about the Lemur Toolkit, such as the latest release notes, upcoming releases and known problems with current versions. |
Features | An "at-a-glance" listing of features within the Lemur Toolkit. |
The Lemur Toolkit | How to install and use the Lemur Toolkit, together with code-level documentation, application guides, a guide to working with offset annotations, and beginner's guides. |
Indri Search Engine | More about Indri, Lemur's latest search engine, which is also available on its own when all you need is a search engine. Indri has an index capable of handling very large collections and a structured query language that supports fields and passages. Search the collected works of William Shakespeare with Indri. |
Lemur Wiki | Wiki pages of documentation for the Lemur Toolkit and Indri Search Engine. Includes articles on using the toolkit, programming with the toolkit, and example code. |
Download | Get all of the source, executables and data files for the toolkit here (hosted by SourceForge). Looking for an older version? Try the download archives. |
People | Key contributors to the Lemur Toolkit. |
Discussion | Open discussion forum for users and developers of the toolkit. |
Tutorials | A set of tutorials and trails to help you get started working with the Lemur Toolkit. |
Sign up | to be notified of new releases and updates to Lemur (hosted by SourceForge). |
The toolkit is in constant development as part of the Lemur Project, a collaboration between the Computer Science Department at the University of Massachusetts and the School of Computer Science at Carnegie Mellon University.
The current system was primarily designed and written at Carnegie Mellon University and at the University of Massachusetts, Amherst. If you have any comments about this work, or are interested in using the toolkit for your own purposes, we would like to hear from you. Please send us some email.
The Lemur Project
Last modified: December 19, 2007, 14:27:09