stemmer-german-0.1.1.0: Extract the stem of a German inflected word form.

Safe Haskell: Safe
Language: Haskell2010

NLP.Stemmer.Cistem

Documentation

stem :: Text -> Text

Guess the word stem. This module uses the CISTEM algorithm, published by L. Weißweiler and A. Fraser in "Developing a Stemmer for German Based on a Comparative Analysis of Publicly Available Stemmers" (2017).
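A minimal usage sketch, assuming the `stemmer-german` and `text` packages are available. The helper `stemWords` and the sample sentence are illustrative and not part of the library. No output is shown, since the exact stems depend on the CISTEM rules.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text          (Text)
import qualified Data.Text          as T
import qualified Data.Text.IO       as TIO
import           NLP.Stemmer.Cistem (stem)

-- | Hypothetical helper (not part of the library):
-- split the input on whitespace and stem each token.
stemWords :: Text -> [Text]
stemWords = map stem . T.words

main :: IO ()
main = mapM_ TIO.putStrLn (stemWords "Die Kinder spielen im Garten")
```

Since `stem` is a pure `Text -> Text` function, it composes directly with `map` and the tokenizing functions in `Data.Text`.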

stemCaseInsensitive :: Text -> Text

A case-insensitive variant of stem. Use it only if the text may be incorrectly upper-cased.
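The following sketch runs both variants on an all-caps word so the difference can be inspected side by side. It assumes the `stemmer-german` and `text` packages are available. The helper `bothStems` and the sample word are illustrative, and no output is shown because the results depend on the CISTEM rules.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text          (Text)
import qualified Data.Text.IO       as TIO
import           NLP.Stemmer.Cistem (stem, stemCaseInsensitive)

-- | Hypothetical helper (not part of the library):
-- pair the case-sensitive and case-insensitive results for one word.
bothStems :: Text -> (Text, Text)
bothStems w = (stem w, stemCaseInsensitive w)

main :: IO ()
main = do
  let (sensitive, insensitive) = bothStems "KINDER"
  TIO.putStrLn ("stem:                " <> sensitive)
  TIO.putStrLn ("stemCaseInsensitive: " <> insensitive)
```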

segment' :: Text -> Segmentation

Split the word into a prefix, the stem, and a suffix. In contrast to the stem function, umlauts remain unchanged.

segment :: Text -> (Text, Text)

Split the word into stem and suffix. This is intended to be compatible with the segment function from the reference implementation.
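A short sketch of how the returned pair might be consumed, assuming the `stemmer-german` and `text` packages are available. The helper `showSegments`, the `|` separator, and the sample words are illustrative. No output is shown because the split points depend on the CISTEM rules.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text          (Text)
import qualified Data.Text          as T
import qualified Data.Text.IO       as TIO
import           NLP.Stemmer.Cistem (segment)

-- | Hypothetical helper (not part of the library):
-- render the (stem, suffix) pair as "stem|suffix".
showSegments :: Text -> Text
showSegments w =
  let (stemPart, suffixPart) = segment w
  in  T.intercalate "|" [stemPart, suffixPart]

main :: IO ()
main = mapM_ (TIO.putStrLn . showSegments) ["Kinder", "spielen", "Gartens"]
```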

segment'CaseInsensitive :: Text -> Segmentation

A case-insensitive variant of segment'. Use it only if the text may be incorrectly upper-cased.

segmentCaseInsensitive :: Text -> (Text, Text)

A case-insensitive variant of segment. Use it only if the text may be incorrectly upper-cased.