Jo Shields a575963da9 Imported Upstream version 3.6.0
Former-commit-id: da6be194a6b1221998fc28233f2503bd61dd9d14
2014-08-13 10:39:27 +01:00

# German umlauts are replaced with their base vowels (ä -> a, ö -> o, ü -> u):
häufig;haufig
üor;uor
björk;bjork
# here the stemmer works well and maps related words to the same stem:
abschließen;abschliess
abschließender;abschliess
abschließendes;abschliess
abschließenden;abschliess
Tisch;tisch
Tische;tisch
Tischen;tisch
geheimtür;geheimtur
Haus;hau
Hauses;hau
Häuser;hau
Häusern;hau
# here's a case where overstemming occurs, i.e. a word is
# mapped to the same stem as unrelated words ("hauen", to hit, vs. "Haus" above):
hauen;hau
# here's a case where understemming occurs, i.e. two related words
# are not mapped to the same stem. This is the case with basically
# all irregular forms:
Drama;drama
Dramen;dram
# "ß" is replaced with "ss":
Ausmaß;ausmass
# fake words to test if suffixes are cut off:
xxxxxe;xxxxx
xxxxxs;xxxxx
xxxxxn;xxxxx
xxxxxt;xxxxx
xxxxxem;xxxxx
xxxxxer;xxxxx
xxxxxnd;xxxxx
# the suffixes are also removed when combined:
xxxxxetende;xxxxx
# words that are shorter than four characters are not changed:
xxe;xxe
# -em and -er are not removed from words shorter than five characters:
xxem;xxem
xxer;xxer
# -nd is not removed from words shorter than six characters:
xxxnd;xxxnd