Over the last few days I finally finished wrestling with Python's awful packaging systems and published the Lemmagen lemmatizer to the Python PyPI repository.
To install it just run
pip install Lemmagen
inside your favorite Python/virtualenv environment. Note that installation requires a working C++ compiler for your platform.
To use it, instantiate the Lemmatizer
class and call lemmatize()
on it. By default the lemmatizer is instantiated with the Slovene dictionary; others are available via the dictionary
keyword argument to the constructor.
Example:
import lemmagen.lemmatizer
from lemmagen.lemmatizer import Lemmatizer
lemmatizer = Lemmatizer(dictionary=lemmagen.DICTIONARY_SLOVENE)
print(lemmatizer.lemmatize("hodimo"))
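Since lemmatize() works on one word at a time, running it over a whole sentence needs a bit of glue. Here is a minimal sketch of how that might look. The helper name lemmatize_text is my own and not part of Lemmagen, and it assumes Python 3 and that simple splitting on word characters is good enough for your text:

```python
import re


def lemmatize_text(text, lemmatize):
    """Split text into words and apply the given lemmatize callable to each.

    lemmatize_text is a hypothetical helper, not part of the Lemmagen API.
    `lemmatize` can be any function that maps one word to its lemma.
    """
    # \w+ matches runs of letters, digits and underscores; in Python 3 this
    # includes non-ASCII letters such as č, š and ž. Punctuation is dropped.
    tokens = re.findall(r"\w+", text)
    return [lemmatize(token) for token in tokens]
```

With the lemmatizer from the example above, you would call it as lemmatize_text("hodimo po cesti", lemmatizer.lemmatize). Passing the function in, rather than the Lemmatizer object, keeps the helper independent of which dictionary you picked.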
The source of the bindings is available as part of the Lemmagen slovene_lemmatizer project. The project is licensed under LGPLv2.