UAM Text Tools

Description

UAM Text Tools (UTT) is a package of language processing tools developed at Adam Mickiewicz University.

The toolkit is intended for processing raw (unannotated), unrestricted text for any purpose.

The system is organized as a collection of command-line programs, each performing one operation, e.g. tokenization, lemmatization, or spelling correction. The components are independent of one another; the unifying element is the uniform I/O file format.

The components may be combined in various ways to provide different text processing services. New components supplied by the user can also be easily incorporated into the system, provided that they respect the I/O file format conventions.
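The idea of independent components joined only by a shared stream format can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (one segment per line, the form followed by a tab and space-separated annotations) is hypothetical and is not UTT's actual I/O format, and the names tokenize, add_lowercase_lemma and pipeline are invented here, not UTT programs.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical record format (NOT UTT's real format):
# one segment per line, "FORM<TAB>ANNOTATIONS", annotations space-separated.
Component = Callable[[Iterable[str]], Iterator[str]]


def tokenize(lines: Iterable[str]) -> Iterator[str]:
    """Turn raw text lines into one record per whitespace-delimited token."""
    for line in lines:
        for form in line.split():
            yield f"{form}\t"


def add_lowercase_lemma(lines: Iterable[str]) -> Iterator[str]:
    """Toy stand-in for a lemmatizer: append 'lem:<lowercased form>'."""
    for line in lines:
        form, _, annotations = line.rstrip("\n").partition("\t")
        fields = annotations.split() + [f"lem:{form.lower()}"]
        yield f"{form}\t{' '.join(fields)}"


def pipeline(*components: Component) -> Component:
    """Chain components so each one reads the previous one's output."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        stream: Iterable[str] = lines
        for component in components:
            stream = component(stream)
        return iter(stream)
    return run


# Combine two components; a third could be added without changing either.
run = pipeline(tokenize, add_lowercase_lemma)
records = list(run(["Ala ma kota"]))
for record in records:
    print(record)
```

Because each function only reads and writes records in the shared format, a user-supplied component can be added to the chain without modifying the existing ones, which mirrors the role the uniform file format plays between the command-line programs.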

UTT component programs do not depend on any specific tagset or morphological description format.

Authors and contact

Licence

Software

UTT is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Dictionaries

The dictionary files accompanying the UTT package are subject to separate licenses.

Related papers

Download

UTT software

Dictionaries