Class TokenFilterName

java.lang.Object
com.azure.core.util.ExpandableStringEnum<TokenFilterName>
com.azure.search.documents.indexes.models.TokenFilterName
All Implemented Interfaces:
com.azure.core.util.ExpandableEnum<String>

public final class TokenFilterName extends com.azure.core.util.ExpandableStringEnum<TokenFilterName>
Defines the names of all token filters supported by the search engine.
  • Field Details

    • ARABIC_NORMALIZATION

      public static final TokenFilterName ARABIC_NORMALIZATION
      A token filter that applies the Arabic normalizer to normalize the orthography. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ar/ArabicNormalizationFilter.html.
    • APOSTROPHE

      public static final TokenFilterName APOSTROPHE
      Strips all characters after an apostrophe (including the apostrophe itself). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html.
    • ASCII_FOLDING

      public static final TokenFilterName ASCII_FOLDING
      Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html.
    • CJK_BIGRAM

      public static final TokenFilterName CJK_BIGRAM
      Forms bigrams of CJK terms that are generated from the standard tokenizer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html.
    • CJK_WIDTH

      public static final TokenFilterName CJK_WIDTH
      Normalizes CJK width differences. Folds fullwidth ASCII variants into the equivalent basic Latin, and half-width Katakana variants into the equivalent Kana. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html.
    • CLASSIC

      public static final TokenFilterName CLASSIC
      Removes English possessives, and dots from acronyms. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicFilter.html.
    • COMMON_GRAM

      public static final TokenFilterName COMMON_GRAM
      Constructs bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html.
    • EDGE_NGRAM

      public static final TokenFilterName EDGE_NGRAM
      Generates n-grams of the given size(s) starting from the front or the back of an input token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.html.
    • ELISION

      public static final TokenFilterName ELISION
      Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/util/ElisionFilter.html.
    • GERMAN_NORMALIZATION

      public static final TokenFilterName GERMAN_NORMALIZATION
      Normalizes German characters according to the heuristics of the German2 snowball algorithm. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html.
    • HINDI_NORMALIZATION

      public static final TokenFilterName HINDI_NORMALIZATION
      Normalizes text in Hindi to remove some differences in spelling variations. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizationFilter.html.
    • INDIC_NORMALIZATION

      public static final TokenFilterName INDIC_NORMALIZATION
      Normalizes the Unicode representation of text in Indian languages. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/in/IndicNormalizationFilter.html.
    • KEYWORD_REPEAT

      public static final TokenFilterName KEYWORD_REPEAT
      Emits each incoming token twice, once as keyword and once as non-keyword. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordRepeatFilter.html.
    • KSTEM

      public static final TokenFilterName KSTEM
      A high-performance kstem filter for English. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/en/KStemFilter.html.
    • LENGTH

      public static final TokenFilterName LENGTH
      Removes words that are too long or too short. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LengthFilter.html.
    • LIMIT

      public static final TokenFilterName LIMIT
      Limits the number of tokens while indexing. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilter.html.
    • LOWERCASE

      public static final TokenFilterName LOWERCASE
      Normalizes token text to lower case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html.
    • NGRAM

      public static final TokenFilterName NGRAM
      Generates n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenFilter.html.
    • PERSIAN_NORMALIZATION

      public static final TokenFilterName PERSIAN_NORMALIZATION
      Applies normalization for Persian. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/fa/PersianNormalizationFilter.html.
    • PHONETIC

      public static final TokenFilterName PHONETIC
      Creates tokens for phonetic matches. See https://lucene.apache.org/core/4_10_3/analyzers-phonetic/org/apache/lucene/analysis/phonetic/package-tree.html.
    • PORTER_STEM

      public static final TokenFilterName PORTER_STEM
      Uses the Porter stemming algorithm to transform the token stream. See http://tartarus.org/~martin/PorterStemmer.
    • REVERSE

      public static final TokenFilterName REVERSE
      Reverses the token string. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/reverse/ReverseStringFilter.html.
    • SCANDINAVIAN_NORMALIZATION

      public static final TokenFilterName SCANDINAVIAN_NORMALIZATION
      Normalizes use of the interchangeable Scandinavian characters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.html.
    • SCANDINAVIAN_FOLDING_NORMALIZATION

      public static final TokenFilterName SCANDINAVIAN_FOLDING_NORMALIZATION
      Folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. It also discriminates against use of double vowels aa, ae, ao, oe and oo, leaving just the first one. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html.
    • SHINGLE

      public static final TokenFilterName SHINGLE
      Creates combinations of tokens as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html.
    • SNOWBALL

      public static final TokenFilterName SNOWBALL
      A filter that stems words using a Snowball-generated stemmer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/snowball/SnowballFilter.html.
    • SORANI_NORMALIZATION

      public static final TokenFilterName SORANI_NORMALIZATION
      Normalizes the Unicode representation of Sorani text. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ckb/SoraniNormalizationFilter.html.
    • STEMMER

      public static final TokenFilterName STEMMER
      Language-specific stemming filter. See https://learn.microsoft.com/rest/api/searchservice/Custom-analyzers-in-Azure-Search#TokenFilters.
    • STOPWORDS

      public static final TokenFilterName STOPWORDS
      Removes stop words from a token stream. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/StopFilter.html.
    • TRIM

      public static final TokenFilterName TRIM
      Trims leading and trailing whitespace from tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TrimFilter.html.
    • TRUNCATE

      public static final TokenFilterName TRUNCATE
      Truncates the terms to a specific length. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TruncateTokenFilter.html.
    • UNIQUE

      public static final TokenFilterName UNIQUE
      Filters out tokens with the same text as the previous token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/RemoveDuplicatesTokenFilter.html.
    • UPPERCASE

      public static final TokenFilterName UPPERCASE
      Normalizes token text to upper case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/UpperCaseFilter.html.
    • WORD_DELIMITER

      public static final TokenFilterName WORD_DELIMITER
      Splits words into subwords and performs optional transformations on subword groups.
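
The one-line descriptions above can be easier to read next to concrete input and output. The following is a minimal plain-Java sketch that approximates three of the described behaviors (EDGE_NGRAM from the front of a token, APOSTROPHE, and REVERSE). The class TokenFilterSketch and its methods are hypothetical names used only for illustration; this is not the Lucene or Azure implementation, and the real filters may differ in options and edge cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Azure SDK or Lucene.
final class TokenFilterSketch {

    // Approximates EDGE_NGRAM from the front of a token:
    // emits prefixes of length minGram..maxGram (capped at the token length).
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int n = minGram; n <= upper; n++) {
            grams.add(token.substring(0, n));
        }
        return grams;
    }

    // Approximates APOSTROPHE: drops the apostrophe and everything after it.
    static String stripAfterApostrophe(String token) {
        int index = token.indexOf('\'');
        return index < 0 ? token : token.substring(0, index);
    }

    // Approximates REVERSE: reverses the characters of the token.
    static String reverse(String token) {
        return new StringBuilder(token).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("search", 1, 3));        // [s, se, sea]
        System.out.println(stripAfterApostrophe("Ankara'da")); // Ankara
        System.out.println(reverse("token"));                  // nekot
    }
}
```

The minGram and maxGram parameters here are assumptions of the sketch. In the service, such options would be set on the corresponding token filter definition rather than on the TokenFilterName constant.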
  • Constructor Details

    • TokenFilterName

      @Deprecated public TokenFilterName()
      Deprecated.
      Use the fromString(String) factory method.
      Creates a new instance of TokenFilterName.
  • Method Details

    • fromString

      public static TokenFilterName fromString(String name)
      Creates or finds a TokenFilterName from its string representation.
      Parameters:
      name - a name to look for.
      Returns:
      the corresponding TokenFilterName.
    • values

      public static Collection<TokenFilterName> values()
      Gets known TokenFilterName values.
      Returns:
      known TokenFilterName values.
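
To illustrate what "creates or finds" means for fromString(String), and how values() relates to it, here is a minimal self-contained sketch of an expandable string enum. FilterNameSketch is a hypothetical stand-in written for this example; it is not the ExpandableStringEnum implementation from azure-core, and the string values and null handling shown are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the pattern; not the azure-core implementation.
final class FilterNameSketch {
    // Registry of every name seen so far; declared first so it is
    // initialized before the constants below call fromString.
    private static final Map<String, FilterNameSketch> VALUES = new ConcurrentHashMap<>();

    // Predefined constants register themselves through the factory method.
    // The string values are placeholders chosen for this sketch.
    static final FilterNameSketch LOWERCASE = fromString("lowercase");
    static final FilterNameSketch TRIM = fromString("trim");

    private final String name;

    private FilterNameSketch(String name) {
        this.name = name;
    }

    // Finds the instance registered under this name, or creates and registers one.
    // Returning null for a null name is an assumption of this sketch.
    static FilterNameSketch fromString(String name) {
        if (name == null) {
            return null;
        }
        return VALUES.computeIfAbsent(name, FilterNameSketch::new);
    }

    // Returns a snapshot of all names known so far, including ones added at runtime.
    static Collection<FilterNameSketch> values() {
        return new ArrayList<>(VALUES.values());
    }

    @Override
    public String toString() {
        return name;
    }

    public static void main(String[] args) {
        // Looking up an existing name returns the same instance as the constant.
        System.out.println(fromString("lowercase") == LOWERCASE); // true
        // An unknown name is created on demand and then appears in values().
        FilterNameSketch custom = fromString("my_custom_filter");
        System.out.println(values().contains(custom)); // true
    }
}
```

In a design like this, the factory method is the single entry point for both predefined and user-defined names, which is consistent with the deprecation note on the public constructor above.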