Elasticsearch自定义分词和分词器

上一篇 << 下一篇 >>>正向索引和倒排索引区别


1.自定义分词

①在/usr/local/elasticsearch-6.4.3/plugins/ik/config目录下新建custom目录
vi new_word.dic
老铁
王者荣耀
洪荒之力
共有产权房
一带一路
迦叶

②启用定时器
vi IKAnalyzer.cfg.xml




    IK Analyzer 扩展配置
    
    custom/new_word.dic
     
    
    
    
    
    

2.自定义分词器

自定义分词器 支持拼音和中文分词

PUT /goods
{
   "settings": {
        "analysis": {
            "analyzer": {
                "ik_smart_pinyin": {
                    "type": "custom",
                    "tokenizer": "ik_smart",
                    "filter": ["my_pinyin", "word_delimiter"]
                },
                "ik_max_word_pinyin": {
                    "type": "custom",
                    "tokenizer": "ik_max_word",
                    "filter": ["my_pinyin", "word_delimiter"]
                }
            },
            "filter": {
                "my_pinyin": {
                    "type" : "pinyin",
                    "keep_separate_first_letter" : true,
                    "keep_full_pinyin" : true,
                    "keep_original" : true,
                    "limit_first_letter_length" : 16,
                    "lowercase" : true,
                    "remove_duplicated_term" : true 
                }
            }
        }
  }  
}

重新指定文档类型映射拼音分词类型

POST /goods/_mapping/goods
{
   
  
      "goods": {
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "@version": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "attribute_list": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "category_id": {
            "type": "long"
          },
          "created_time": {
            "type": "date"
          },
          "detail": {
            "type": "text",
             "analyzer":"ik_smart_pinyin",
            "search_analyzer":"ik_smart_pinyin"

          },
          "id": {
            "type": "long"
          },
          "main_image": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "name": {
            "type": "text",
            "analyzer":"ik_smart_pinyin",
            "search_analyzer":"ik_smart_pinyin"

          },
          "revision": {
            "type": "long"
          },
          "status": {
            "type": "long"
          },
          "sub_images": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "subtitle": {
            "type": "text",
          "analyzer":"ik_smart",
         "search_analyzer":"ik_smart"

          },
          "updated_time": {
            "type": "date"
          }
        }
      }
}

推荐阅读:
<< << << << << << << <<<正向索引和倒排索引区别
<< << << << << << << << <<

你可能感兴趣的:(Elasticsearch自定义分词和分词器)