Cheshire3: normalizer Module
| Class | Description |
|---|---|
| SimpleNormalizer | Base normalizer. |
| DataExistsNormalizer | Return '1' if any data exists, otherwise '0'. |
| TermExistsNormalizer | Un-stoplist anonymizing normalizer. |
| CaseNormalizer | Reduce text to lower case. |
| ReverseNormalizer | Reverse string (e.g. for left truncation). |
| SpaceNormalizer | Reduce multiple whitespace to a single space character. |
| ArticleNormalizer | Remove leading English articles (the, a, an). |
| NumericEntityNormalizer | Replace characters matching a regular expression with the equivalent numeric character entity. |
| RegexpNormalizer | Strip, replace, or keep data that matches a given regular expression. |
| PossessiveNormalizer | Remove trailing 's or s' from words. |
| IntNormalizer | Turn a string into an integer. |
| StringIntNormalizer | Turn an integer into a zero-padded string, 12 characters long. |
| StoplistNormalizer | Remove words that match a stopword list. |
| PhraseStemNormalizer | Use a Snowball stemmer to stem multiple words in a phrase (e.g. from PosPhraseNormalizer). |
| StemNormalizer | Use a Snowball stemmer to stem the terms. |
| DateStringNormalizer | Turn a Date object into ISO 8601 format. |
| RangeNormalizer | Should normalise ranges... |
| KeywordNormalizer | Given a string, keyword it with proximity. |
| ExactExpansionNormalizer | |
| WordExpansionNormalizer | |
| DiacriticNormalizer | Slow implementation of Unicode 4.0 character decomposition. |
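
The one-line summaries above can be made concrete with a small sketch. The functions below are hypothetical stand-alone helpers written only to illustrate the behaviour a few of the rows describe (CaseNormalizer, SpaceNormalizer, ArticleNormalizer, ReverseNormalizer, StringIntNormalizer, DataExistsNormalizer). They are not the Cheshire3 classes and do not use the Cheshire3 API; all function names, and the `normalize_title` chaining helper, are assumptions made for this example.

```python
import re


def data_exists(data):
    """DataExistsNormalizer row: '1' if any data exists, otherwise '0'."""
    return "1" if data else "0"


def case_normalize(text):
    """CaseNormalizer row: reduce text to lower case."""
    return text.lower()


def reverse_normalize(text):
    """ReverseNormalizer row: reverse the string (useful for left truncation)."""
    return text[::-1]


def space_normalize(text):
    """SpaceNormalizer row: collapse runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", text)


def article_normalize(text):
    """ArticleNormalizer row: remove one leading English article (the, a, an)."""
    return re.sub(r"^(the|an|a)\s+", "", text, flags=re.IGNORECASE)


def string_int_normalize(value):
    """StringIntNormalizer row: zero-pad an integer to 12 characters."""
    return format(value, "012d")


def normalize_title(text):
    """Hypothetical chain: trim, collapse spaces, lower-case, drop the article."""
    return article_normalize(case_normalize(space_normalize(text.strip())))


if __name__ == "__main__":
    print(normalize_title("  The   Lord of the Rings "))
    print(string_int_normalize(42))
    print(reverse_normalize("normalizer"))
```

Chaining small single-purpose steps, as `normalize_title` does, mirrors the way the table presents each normalizer as one narrow transformation. The order matters in this sketch: whitespace is collapsed before the article pattern is applied so that a title with irregular spacing still matches.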
Generated by Epydoc 3.0alpha2 on Wed Aug 9 18:09:56 2006 | http://epydoc.sf.net |