Elasticsearch 7 Study Notes: SearchTemplate, IndexAlias, and SuggestAPI

Contents

Introduction
SearchTemplate
IndexAlias
SuggestAPI
Term Suggester
Phrase Suggester
Completion Suggester
Context Suggester

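Introduction

A SearchTemplate decouples the query definition from the code that calls it. An IndexAlias hides the physical index name behind a stable name, which gives both encapsulation and decoupling. The SuggestAPI splits the input text into terms, looks up similar terms in an indexed field, and returns them.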

SearchTemplate

The following example stores a template that runs a match_phrase query on the title field, with q as the parameter:

POST /_scripts/movies
{
    "script": {
        "lang": "mustache", 
        "source": {
            "_source": [
                "title"
            ], 
            "size": 20, 
            "query": {
                "bool": {
                    "must": {
                        "match_phrase": {
                            "title": "{{q}}"
                        }
                    }
                }
            }
        }
    }
}

To use it, pass the parameters to the stored template, which is referenced by its id:

POST movies/_search/template
{
  "id": "movies",
  "params": {
    "q": "Safe Passage"
  }
}
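
To make the parameter substitution concrete, here is a minimal Python sketch of what rendering the template amounts to. It is only an illustration: the real rendering is done by Elasticsearch's Mustache engine on the server, and the render function below is a hypothetical helper that handles nothing but plain {{name}} placeholders.

```python
import json

# Simplified stand-in for the stored "movies" template shown above.
TEMPLATE = {
    "_source": ["title"],
    "size": 20,
    "query": {"bool": {"must": {"match_phrase": {"title": "{{q}}"}}}},
}


def render(template: dict, params: dict) -> dict:
    """Replace each {{name}} placeholder with its JSON-escaped parameter value."""
    text = json.dumps(template)
    for name, value in params.items():
        # json.dumps adds surrounding quotes; strip them because the
        # placeholder already sits inside a JSON string.
        escaped = json.dumps(str(value))[1:-1]
        text = text.replace("{{" + name + "}}", escaped)
    return json.loads(text)


rendered = render(TEMPLATE, {"q": "Safe Passage"})
print(json.dumps(rendered["query"], indent=2))
```

On a real cluster, the render template API (_render/template) can be used to preview the rendered query without running the search.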
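
IndexAlias

The following example gives an index the alias movies-today and attaches a filter, so that only documents whose rating field is ≥ 10 are visible through the alias: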

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movies-2020-07-13",
        "alias": "movies-today",
        "filter": {
          "range": {
            "rating": {
              "gte": 10
            }
          }
        }
      }
    }
  ]
}

The index must already exist when the alias is added, so index some data into movies-2020-07-13 beforehand:

POST movies-2020-07-13/_doc/1
{
  "name": "n1",
  "rating":11
}


POST movies-2020-07-13/_doc/2
{
  "name": "n2",
  "rating":9
}

Then simply query the alias. Only the document with rating 11 passes the alias filter:

POST movies-today/_search
{
  "query": {"match_all": {}}
}
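
The effect of the filtered alias can be sketched in plain Python. This is only a local simulation under simplifying assumptions: the helper names (matches_range, search_alias) are hypothetical, and only the range operators gte, gt, lte, and lt are handled.

```python
# The two documents indexed into movies-2020-07-13 above.
DOCS = [
    {"_id": "1", "name": "n1", "rating": 11},
    {"_id": "2", "name": "n2", "rating": 9},
]

# The range filter attached to the movies-today alias.
ALIAS_FILTER = {"range": {"rating": {"gte": 10}}}


def matches_range(doc: dict, range_filter: dict) -> bool:
    """Check a document against a range filter (gte/gt/lte/lt only)."""
    checks = {
        "gte": lambda value, bound: value >= bound,
        "gt": lambda value, bound: value > bound,
        "lte": lambda value, bound: value <= bound,
        "lt": lambda value, bound: value < bound,
    }
    for field, bounds in range_filter["range"].items():
        if field not in doc:
            return False
        if not all(checks[op](doc[field], bound) for op, bound in bounds.items()):
            return False
    return True


def search_alias(docs: list, alias_filter: dict) -> list:
    """match_all through the alias: every hit must also pass the alias filter."""
    return [doc for doc in docs if matches_range(doc, alias_filter)]


visible = search_alias(DOCS, ALIAS_FILTER)
print([doc["_id"] for doc in visible])  # ['1']
```

Querying movies-2020-07-13 directly would still return both documents; the filter applies only to requests that go through the alias.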

SuggestAPI

ES7 provides four kinds of suggesters: Term, Phrase, Completion, and Context.

Term Suggester

First, index some test data:

POST _bulk
{"index": {"_index": "article", "_id": 1}}
{"body": "lucene is very cool"}
{"index": {"_index": "article", "_id": 2}}
{"body": "ElasticSearch is built on top of lucene"}
{"index": {"_index": "article", "_id": 3}}
{"body": "ElasticSearch rocks"}
{"index": {"_index": "article", "_id": 4}}
{"body": "Elastic is the corporation of ELK stack"}
{"index": {"_index": "article", "_id": 5}}
{"body": "ELK stack rocks"}
{"index": {"_index": "article", "_id": 6}}
{"body": "Elastic is rock solid"}

Then write the query body and add a suggester. Here a term suggester in missing mode is applied to the text luece rock:

POST article/_search
{
  "size": 20,
  "query": {"match": {
    "body": "luece rock"
  }},
  "suggest": {
    "term-suggestion": {
      "text": "luece rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}

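There are three suggest modes: missing (suggest only for input terms that are not in the index), popular (suggest only terms that occur in more documents than the input term), and always (suggest regardless of whether the input term is in the index). Since rock already exists in the index, the suggest section of the response for the example above looks like this: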

"suggest" : {
    "term-suggestion" : [
      {
        "text" : "luece",
        "offset" : 0,
        "length" : 5,
        "options" : [
          {
            "text" : "lucene",
            "score" : 0.6,
            "freq" : 4
          }
        ]
      },
      {
        "text" : "rock",
        "offset" : 6,
        "length" : 4,
        "options" : [ ]
      }
    ]
  }
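
The three suggest modes can be summarized as a small decision function. The sketch below is an illustration rather than Elasticsearch's implementation: DOC_FREQ is counted by hand from the six test documents (assuming the standard analyzer), and should_suggest is a hypothetical helper.

```python
# Document frequency of a few terms in the article index, counted by hand
# from the test data above (assumes the standard analyzer).
DOC_FREQ = {"lucene": 2, "rock": 1, "rocks": 2, "elastic": 2, "built": 1}


def should_suggest(term: str, candidate: str, mode: str) -> bool:
    """Decide whether `candidate` may be offered for `term` under a suggest mode."""
    term_freq = DOC_FREQ.get(term, 0)
    candidate_freq = DOC_FREQ.get(candidate, 0)
    if candidate_freq == 0:
        # Suggestions always come from terms that exist in the index.
        return False
    if mode == "missing":
        return term_freq == 0
    if mode == "popular":
        return candidate_freq > term_freq
    if mode == "always":
        return True
    raise ValueError(f"unknown suggest mode: {mode}")


for mode in ("missing", "popular", "always"):
    print(mode, should_suggest("luece", "lucene", mode), should_suggest("rock", "rocks", mode))
```

In missing mode the indexed term rock gets no suggestion, which matches the empty options list in the response above.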

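If rock is changed to hock, however, no suggestion is returned for it either. The reason is that prefix_length defaults to 1, so a candidate must share its first character with the input term. Setting prefix_length to 0 fixes this: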

POST article/_search
{
  "size": 20,
  "query": {"match": {
    "body": "luece builf hock"
  }},
  "suggest": {
    "term-suggestion": {
      "text": "luece builf hock",
      "term": {
        "suggest_mode": "missing",
        "field": "body",
        "prefix_length": 0
      }
    }
  }
}

The suggest section of the response is:

"suggest" : {
    "term-suggestion" : [
      {
        "text" : "luece",
        "offset" : 0,
        "length" : 5,
        "options" : [
          {
            "text" : "lucene",
            "score" : 0.6,
            "freq" : 2
          }
        ]
      },
      {
        "text" : "builf",
        "offset" : 6,
        "length" : 5,
        "options" : [
          {
            "text" : "built",
            "score" : 0.8,
            "freq" : 1
          }
        ]
      },
      {
        "text" : "hock",
        "offset" : 12,
        "length" : 4,
        "options" : [
          {
            "text" : "rock",
            "score" : 0.75,
            "freq" : 1
          }
        ]
      }
    ]
  }
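
Phrase Suggester

The phrase suggester adds logic on top of the term suggester. For example, max_errors limits how many terms of the input may be treated as misspellings in a returned suggestion, and confidence sets the score threshold a candidate must exceed to be returned (the higher the threshold, the fewer results). Highlighting can also be enabled by specifying highlight tags: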

POST article/_search
{
  "suggest": {
    "my_suggestion": {
      "text": "lucne and elasticsear rodk very well",
      "phrase": {
        "field": "body",
        "max_errors": 3,
        "confidence": 1,
        "direct_generator": [
          {"field": "body", "suggest_mode": "missing"}
        ],
        "highlight": {
          "pre_tag": "",
          "post_tag": ""
        }
      }
    }
  }
}

The suggest section of the response is:

"suggest" : {
    "my_suggestion" : [
      {
        "text" : "lucne and elasticsear rodk very well",
        "offset" : 0,
        "length" : 36,
        "options" : [
          {
            "text" : "lucene and elasticsearch rock very well",
            "highlighted" : "lucene and elasticsearch rock very well",
            "score" : 1.6991E-4
          },
          {
            "text" : "lucene and elasticsearch rocks very well",
            "highlighted" : "lucene and elasticsearch rocks very well",
            "score" : 1.6991E-4
          },
          {
            "text" : "lucene and elasticsearch rodk very well",
            "highlighted" : "lucene and elasticsearch rodk very well",
            "score" : 1.393378E-4
          }
        ]
      }
    ]
  }
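
Completion Suggester

The completion suggester provides autocomplete.
Before using it, define a mapping that declares which field is used for completion. Because article was created above with body as a text field and the type of an existing field cannot be changed, delete that index first (DELETE article) or use a new index name: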

PUT article
{
  "mappings": {
    "properties": {
      "body": {
        "type": "completion"
      }
    }
  }
}

Then index the data and run a completion query, specifying the prefix and the completion field:

POST article/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "prefix": "e",
      "completion": {
        "field": "body"
      }
    }
  }
}
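
Conceptually, a completion query is a prefix match against the whole indexed input. The Python sketch below approximates this with a case-insensitive startswith check, assuming the six test sentences from the Term Suggester section were re-indexed into the completion-mapped article index. Elasticsearch itself answers such queries from a prebuilt in-memory structure rather than by scanning, and complete is a hypothetical helper.

```python
# Assumed content of the completion field: the six test sentences used earlier.
INPUTS = [
    "lucene is very cool",
    "ElasticSearch is built on top of lucene",
    "ElasticSearch rocks",
    "Elastic is the corporation of ELK stack",
    "ELK stack rocks",
    "Elastic is rock solid",
]


def complete(prefix: str, inputs: list) -> list:
    """Return the inputs that start with `prefix`, ignoring case."""
    lowered = prefix.lower()
    return [text for text in inputs if text.lower().startswith(lowered)]


for text in complete("e", INPUTS):
    print(text)
```

Only the beginning of each input is matched, so "lucene is very cool" is not returned for the prefix e even though it contains that letter.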
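
Context Suggester

This is an extension of the completion suggester that lets a search carry additional context. ES supports two kinds of context: Category (an arbitrary string) and Geo (geographic location).
Implementing a context suggester takes three steps: customize the mapping; index the data together with its context; run suggestion queries that include the context.
An example follows. First set the mapping so that a field has the completion type and declares its context: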

PUT comments

PUT comments/_mapping
{
  "properties": {
    "comment_autocomplete": {
      "type": "completion",
      "contexts": [
        {
          "type": "category",
          "name": "comment_category"
        }
      ]
    }
  }
}

Then index the data, providing the completion input and the matching context for each document:

POST comments/_doc
{
  "comment": "I love the star war movie",
  "comment_autocomplete": {
    "input": ["star wars"],
    "contexts": {
      "comment_category": "movies"
    }
  }
}

POST comments/_doc
{
  "comment": "Where can I find a Starbucks",
  "comment_autocomplete": {
    "input": ["starbucks"],
    "contexts": {
      "comment_category": "coffee"
    }
  }
}

Finally run the query, giving the prefix to complete, the completion field, and the context:

POST comments/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "prefix": "sta",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": {
          "comment_category": "movies"
        }
      }
    }
  }
}

The suggest section of the response is shown below. ES returns only the entry that matches both the prefix and the context:

  "suggest" : {
    "YOUR_SUGGESTION" : [
      {
        "text" : "sta",
        "offset" : 0,
        "length" : 3,
        "options" : [
          {
            "text" : "star wars",
            "_index" : "comments",
            "_type" : "_doc",
            "_id" : "JHZvRnMBVFEAERRHgcsw",
            "_score" : 1.0,
            "_source" : {
              "comment" : "I love the star war movie",
              "comment_autocomplete" : {
                "input" : [
                  "star wars"
                ],
                "contexts" : {
                  "comment_category" : "movies"
                }
              }
            },
            "contexts" : {
              "comment_category" : [
                "movies"
              ]
            }
          }
        ]
      }
    ]
  }
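
The context filtering can be sketched as a prefix match restricted to one category. This is a simplified local model using the two comments indexed above; ENTRIES and suggest are hypothetical names, not part of any Elasticsearch client.

```python
# Completion inputs and their category context, taken from the two comments above.
ENTRIES = [
    {"input": "star wars", "category": "movies"},
    {"input": "starbucks", "category": "coffee"},
]


def suggest(prefix: str, category: str, entries: list) -> list:
    """Return inputs that start with `prefix` and belong to `category`."""
    return [
        entry["input"]
        for entry in entries
        if entry["category"] == category and entry["input"].startswith(prefix)
    ]


print(suggest("sta", "movies", ENTRIES))  # ['star wars']
print(suggest("sta", "coffee", ENTRIES))  # ['starbucks']
```

Both inputs share the prefix sta, so the context is what separates the two results.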

Comparison of the completion, phrase, and term suggesters in precision, recall, and performance:
Precision: Completion > Phrase > Term
Recall: Term > Phrase > Completion
Performance: Completion > Phrase > Term

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: https://www.outofmemory.cn/zaji/5715580.html