Class TokenFilterName
java.lang.Object
com.azure.core.util.ExpandableStringEnum<TokenFilterName>
com.azure.search.documents.indexes.models.TokenFilterName
- All Implemented Interfaces:
com.azure.core.util.ExpandableEnum<String>
public final class TokenFilterName
extends com.azure.core.util.ExpandableStringEnum<TokenFilterName>
Defines the names of all token filters supported by the search engine.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final TokenFilterName | APOSTROPHE | Strips all characters after an apostrophe (including the apostrophe itself).
static final TokenFilterName | ARABIC_NORMALIZATION | A token filter that applies the Arabic normalizer to normalize the orthography.
static final TokenFilterName | ASCII_FOLDING | Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist.
static final TokenFilterName | CJK_BIGRAM | Forms bigrams of CJK terms that are generated from the standard tokenizer.
static final TokenFilterName | CJK_WIDTH | Normalizes CJK width differences.
static final TokenFilterName | CLASSIC | Removes English possessives, and dots from acronyms.
static final TokenFilterName | COMMON_GRAM | Construct bigrams for frequently occurring terms while indexing.
static final TokenFilterName | EDGE_NGRAM | Generates n-grams of the given size(s) starting from the front or the back of an input token.
static final TokenFilterName | ELISION | Removes elisions.
static final TokenFilterName | GERMAN_NORMALIZATION | Normalizes German characters according to the heuristics of the German2 snowball algorithm.
static final TokenFilterName | HINDI_NORMALIZATION | Normalizes text in Hindi to remove some differences in spelling variations.
static final TokenFilterName | INDIC_NORMALIZATION | Normalizes the Unicode representation of text in Indian languages.
static final TokenFilterName | KEYWORD_REPEAT | Emits each incoming token twice, once as keyword and once as non-keyword.
static final TokenFilterName | KSTEM | A high-performance kstem filter for English.
static final TokenFilterName | LENGTH | Removes words that are too long or too short.
static final TokenFilterName | LIMIT | Limits the number of tokens while indexing.
static final TokenFilterName | LOWERCASE | Normalizes token text to lower case.
static final TokenFilterName | NGRAM | Generates n-grams of the given size(s).
static final TokenFilterName | PERSIAN_NORMALIZATION | Applies normalization for Persian.
static final TokenFilterName | PHONETIC | Create tokens for phonetic matches.
static final TokenFilterName | PORTER_STEM | Uses the Porter stemming algorithm to transform the token stream.
static final TokenFilterName | REVERSE | Reverses the token string.
static final TokenFilterName | SCANDINAVIAN_FOLDING_NORMALIZATION | Folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o.
static final TokenFilterName | SCANDINAVIAN_NORMALIZATION | Normalizes use of the interchangeable Scandinavian characters.
static final TokenFilterName | SHINGLE | Creates combinations of tokens as a single token.
static final TokenFilterName | SNOWBALL | A filter that stems words using a Snowball-generated stemmer.
static final TokenFilterName | SORANI_NORMALIZATION | Normalizes the Unicode representation of Sorani text.
static final TokenFilterName | STEMMER | Language specific stemming filter.
static final TokenFilterName | STOPWORDS | Removes stop words from a token stream.
static final TokenFilterName | TRIM | Trims leading and trailing whitespace from tokens.
static final TokenFilterName | TRUNCATE | Truncates the terms to a specific length.
static final TokenFilterName | UNIQUE | Filters out tokens with same text as the previous token.
static final TokenFilterName | UPPERCASE | Normalizes token text to upper case.
static final TokenFilterName | WORD_DELIMITER | Splits words into subwords and performs optional transformations on subword groups.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static TokenFilterName | fromString(String name) | Creates or finds a TokenFilterName from its string representation.
static Collection<TokenFilterName> | values() | Gets known TokenFilterName values.
Methods inherited from class com.azure.core.util.ExpandableStringEnum
equals, fromString, getValue, hashCode, toString, values
-
Field Details
-
ARABIC_NORMALIZATION
A token filter that applies the Arabic normalizer to normalize the orthography. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ar/ArabicNormalizationFilter.html. -
APOSTROPHE
Strips all characters after an apostrophe (including the apostrophe itself). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html. -
ASCII_FOLDING
Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html. -
CJK_BIGRAM
Forms bigrams of CJK terms that are generated from the standard tokenizer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html. -
CJK_WIDTH
Normalizes CJK width differences. Folds fullwidth ASCII variants into the equivalent basic Latin, and half-width Katakana variants into the equivalent Kana. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html. -
CLASSIC
Removes English possessives, and dots from acronyms. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicFilter.html. -
COMMON_GRAM
Construct bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html. -
EDGE_NGRAM
Generates n-grams of the given size(s) starting from the front or the back of an input token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.html. -
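As a rough illustration of what "n-grams starting from the front of a token" means, the sketch below emits the prefixes of a token between a minimum and maximum length. The class `EdgeNGramSketch` and the method `frontEdgeNGrams` are hypothetical names chosen for this example; this is not the Lucene or Azure implementation, it covers only the front side, and it works on Java `char` units rather than Unicode code points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Lucene EdgeNGramTokenFilter.
final class EdgeNGramSketch {

    // Returns the prefixes of `token` with lengths from minGram to maxGram
    // (inclusive), capped at the length of the token.
    static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int size = minGram; size <= upper; size++) {
            grams.add(token.substring(0, size));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(frontEdgeNGrams("search", 1, 3)); // prints [s, se, sea]
    }
}
```

Prefix grams of this kind are commonly used for search-as-you-type matching, since a partial query such as "sea" then matches an indexed term directly.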
ELISION
Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/util/ElisionFilter.html. -
GERMAN_NORMALIZATION
Normalizes German characters according to the heuristics of the German2 snowball algorithm. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html. -
HINDI_NORMALIZATION
Normalizes text in Hindi to remove some differences in spelling variations. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizationFilter.html. -
INDIC_NORMALIZATION
Normalizes the Unicode representation of text in Indian languages. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/in/IndicNormalizationFilter.html. -
KEYWORD_REPEAT
Emits each incoming token twice, once as keyword and once as non-keyword. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordRepeatFilter.html. -
KSTEM
A high-performance kstem filter for English. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/en/KStemFilter.html. -
LENGTH
Removes words that are too long or too short. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LengthFilter.html. -
LIMIT
Limits the number of tokens while indexing. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilter.html. -
LOWERCASE
Normalizes token text to lower case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html. -
NGRAM
Generates n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenFilter.html. -
PERSIAN_NORMALIZATION
Applies normalization for Persian. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/fa/PersianNormalizationFilter.html. -
PHONETIC
Create tokens for phonetic matches. See https://lucene.apache.org/core/4_10_3/analyzers-phonetic/org/apache/lucene/analysis/phonetic/package-tree.html. -
PORTER_STEM
Uses the Porter stemming algorithm to transform the token stream. See http://tartarus.org/~martin/PorterStemmer. -
REVERSE
Reverses the token string. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/reverse/ReverseStringFilter.html. -
SCANDINAVIAN_NORMALIZATION
Normalizes use of the interchangeable Scandinavian characters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.html. -
SCANDINAVIAN_FOLDING_NORMALIZATION
Folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. It also discriminates against use of double vowels aa, ae, ao, oe and oo, leaving just the first one. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html. -
SHINGLE
Creates combinations of tokens as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html. -
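To make "combinations of tokens as a single token" concrete, the sketch below joins each run of adjacent tokens into one space-separated string. `ShingleSketch` and `shingles` are hypothetical names for this example, and the fixed shingle size and space separator are assumptions; the real filter has further options (for instance, it can also emit the original single tokens) that are not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Lucene ShingleFilter.
final class ShingleSketch {

    // Joins every window of `size` adjacent tokens into a single
    // space-separated token.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("please", "divide", "this");
        System.out.println(shingles(tokens, 2)); // prints [please divide, divide this]
    }
}
```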
SNOWBALL
A filter that stems words using a Snowball-generated stemmer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/snowball/SnowballFilter.html. -
SORANI_NORMALIZATION
Normalizes the Unicode representation of Sorani text. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ckb/SoraniNormalizationFilter.html. -
STEMMER
Language specific stemming filter. See https://learn.microsoft.com/rest/api/searchservice/Custom-analyzers-in-Azure-Search#TokenFilters. -
STOPWORDS
Removes stop words from a token stream. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/StopFilter.html. -
TRIM
Trims leading and trailing whitespace from tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TrimFilter.html. -
TRUNCATE
Truncates the terms to a specific length. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TruncateTokenFilter.html. -
UNIQUE
Filters out tokens with same text as the previous token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/RemoveDuplicatesTokenFilter.html. -
UPPERCASE
Normalizes token text to upper case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/UpperCaseFilter.html. -
WORD_DELIMITER
Splits words into subwords and performs optional transformations on subword groups.
-
-
Constructor Details
-
TokenFilterName
Deprecated. Use the fromString(String) factory method.
Creates a new instance of TokenFilterName value.
-
-
Method Details
-
fromString
Creates or finds a TokenFilterName from its string representation.
Parameters:
name - a name to look for.
Returns:
the corresponding TokenFilterName.
-
values
Gets known TokenFilterName values.
Returns:
known TokenFilterName values.
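The "creates or finds" behavior of fromString and the "known values" returned by values() can be pictured with a small registry. The sketch below is a standalone stand-in written for this page, assuming a map keyed by name; `FilterNameSketch` is a hypothetical class, the string "lowercase" is an assumed name, and none of this is the actual code of TokenFilterName or ExpandableStringEnum.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the expandable string enum pattern.
final class FilterNameSketch {
    // Registry of every name seen so far, predefined or custom.
    private static final Map<String, FilterNameSketch> VALUES = new ConcurrentHashMap<>();

    // A predefined constant; the string "lowercase" is an assumption.
    static final FilterNameSketch LOWERCASE = fromString("lowercase");

    private final String name;

    private FilterNameSketch(String name) {
        this.name = name;
    }

    // Finds the instance registered under `name`, or creates and registers one.
    static FilterNameSketch fromString(String name) {
        return VALUES.computeIfAbsent(name, FilterNameSketch::new);
    }

    // Returns a snapshot of all instances known at the time of the call.
    static Collection<FilterNameSketch> values() {
        return new ArrayList<>(VALUES.values());
    }

    @Override
    public String toString() {
        return name;
    }

    public static void main(String[] args) {
        FilterNameSketch custom = fromString("my_custom_filter");
        System.out.println(fromString("lowercase") == LOWERCASE);     // prints true
        System.out.println(fromString("my_custom_filter") == custom); // prints true
        System.out.println(values().size());                          // prints 2
    }
}
```

Unlike a Java `enum`, this pattern accepts names that were not known at compile time, which is why a factory method is offered in place of the deprecated constructor.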
-