Cheshire3: normalizer Module

Module normalizer

Classes
SimpleNormalizer Base normalizer.
DataExistsNormalizer Return '1' if any data exists, otherwise '0'
TermExistsNormalizer Un-stoplist anonymizing normalizer.
CaseNormalizer Reduce text to lower case
ReverseNormalizer Reverse string (e.g. for left truncation)
SpaceNormalizer Reduce multiple whitespace to single space character
ArticleNormalizer Remove leading English articles (the, a, an)
NumericEntityNormalizer Replace characters matching regular expression with the equivalent numeric character entity
RegexpNormalizer Either strip, replace or keep data which matches a given regular expression
PossessiveNormalizer Remove trailing 's or s' from words
IntNormalizer Turn a string into an integer
StringIntNormalizer Turn an integer into a zero-padded string, 12 characters long
StoplistNormalizer Remove words that match a stopword list
PhraseStemNormalizer Use a Snowball stemmer to stem multiple words in a phrase (e.g. from PosPhraseNormalizer)
StemNormalizer Use a Snowball stemmer to stem the terms
DateStringNormalizer Turns a Date object into ISO8601 format
RangeNormalizer Should normalise ranges...
KeywordNormalizer Given a string, keyword it with proximity.
ExactExpansionNormalizer  
WordExpansionNormalizer  
DiacriticNormalizer Slow implementation of Unicode 4.0 character decomposition.
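
To make a few of the summaries above concrete, here is a minimal sketch in plain Python of the transformations described for CaseNormalizer, SpaceNormalizer, ArticleNormalizer, ReverseNormalizer and StringIntNormalizer. It does not use the Cheshire3 classes or their API: the function names and the chain helper are hypothetical, and the exact behaviour of the real normalizers (for example, how articles are matched) may differ.

```python
import re

# Hypothetical stand-ins for the behaviours summarised in the class list.
# These are NOT the Cheshire3 classes, only illustrations of each summary.


def case_normalize(text):
    """Reduce text to lower case."""
    return text.lower()


def space_normalize(text):
    """Reduce each run of whitespace to a single space character."""
    return re.sub(r"\s+", " ", text)


def article_normalize(text):
    """Remove a leading English article (the, a, an).

    Assumption: the article must be followed by whitespace, so a word
    such as "Another" is left untouched.
    """
    return re.sub(r"^(the|an|a)\s+", "", text, flags=re.IGNORECASE)


def reverse_normalize(text):
    """Reverse the string, e.g. so left truncation becomes a prefix match."""
    return text[::-1]


def string_int_normalize(number):
    """Turn an integer into a zero-padded string, 12 characters long."""
    return "%012d" % number


def chain(text, *steps):
    """Apply each normalization step in order (hypothetical helper)."""
    for step in steps:
        text = step(text)
    return text


if __name__ == "__main__":
    print(chain("The   Quick  Fox", space_normalize, article_normalize, case_normalize))
    print(string_int_normalize(42))
    print(reverse_normalize("normalizer"))
```

Zero padding matters because it makes string ordering agree with numeric ordering, and reversing terms lets a left-truncated search be answered as an ordinary prefix lookup on the reversed form.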