Adding a new document to Elasticsearch through elasticsearch.index: body structure and mapping

Problem description

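I'm building a blog-like application with Flask (based on Miguel Grinberg's Mega-Tutorial), and I'm trying to set up an ES index that supports autocompletion. I'm struggling to set up the index correctly.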

I started with a simple (working) indexing mechanism:

from flask import current_app

def add_to_index(index, model):
    if not current_app.elasticsearch:
        return
    payload = {}
    for field in model.__searchable__:
        payload[field] = getattr(model, field)
    current_app.elasticsearch.index(index=index, id=model.id, body=payload)

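After some fun with Google, I found that my body might look like this (probably with fewer analyzers, but I'm copying it the way I found it somewhere, where the author claims it works):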

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {},
        "analyzer": {
          "keyword_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding",
              "trim"
            ],
            "char_filter": [],
            "type": "custom",
            "tokenizer": "keyword"
          },
          "edge_ngram_analyzer": {
            "filter": [
              "lowercase"
            ],
            "tokenizer": "edge_ngram_tokenizer"
          },
          "edge_ngram_search_analyzer": {
            "tokenizer": "lowercase"
          }
        },
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": [
              "letter"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    field: {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keywordstring": {
              "type": "text",
              "analyzer": "keyword_analyzer"
            },
            "edgengram": {
              "type": "text",
              "analyzer": "edge_ngram_analyzer",
              "search_analyzer": "edge_ngram_search_analyzer"
            },
            "completion": {
              "type": "completion"
            }
          },
          "analyzer": "standard"
        }
      }
    }
  }
}

I figured out that I could modify the original mechanism to:

fields = {}
for field in model.__searchable__:
    temp = getattr(model, field)
    fields[field] = {"properties": {
      "type": "text",
      "fields": {
        "keywordstring": {
          "type": "text",
          "analyzer": "keyword_analyzer"
        },
        "edgengram": {
          "type": "text",
          "analyzer": "edge_ngram_analyzer",
          "search_analyzer": "edge_ngram_search_analyzer"
        },
        "completion": {
          "type": "completion"
        }
      },
      "analyzer": "standard"
    }}
payload = {
    "settings": {
        "index": {
          "analysis": {
            "filter": {},
            "analyzer": {
              "keyword_analyzer": {
                "filter": [
                  "lowercase",
                  "asciifolding",
                  "trim"
                ],
                "char_filter": [],
                "type": "custom",
                "tokenizer": "keyword"
              },
              "edge_ngram_analyzer": {
                "filter": [
                  "lowercase"
                ],
                "tokenizer": "edge_ngram_tokenizer"
              },
              "edge_ngram_search_analyzer": {
                "tokenizer": "lowercase"
              }
            },
            "tokenizer": {
              "edge_ngram_tokenizer": {
                "type": "edge_ngram",
                "min_gram": 2,
                "max_gram": 5,
                "token_chars": [
                  "letter"
                ]
              }
            }
          }
        }
    },
    "mappings": fields
}

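But this is where I'm lost. Where in this document should I put the actual content (temp = getattr(model, field)) so that the whole thing works? I couldn't find any example or part of the documentation that covers updating an index with a slightly more complex mapping. Is this even correct or feasible? Every guide I've seen covers bulk indexing, and somehow I can't make the connection.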

Tags: elasticsearch, elasticsearch-py

Solution


I think you are a bit confused, so let me try to explain. What you want is to add a document to Elasticsearch:

current_app.elasticsearch.index(index=index, id=model.id, body=payload)

This uses the index() method defined in the elasticsearch-py library. Check the example here: https://elasticsearch-py.readthedocs.io/en/master/index.html#example-usage. The body must be your document, a simple dict, as shown in the example from the docs.
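As a minimal sketch of what "a simple dict" means here: the body carries only the field values of the model, with no settings or mappings. The Post class and the build_document helper below are hypothetical stand-ins for illustration, not part of the original code.

```python
class Post:
    """Hypothetical stand-in for the application's model class."""
    __searchable__ = ["name"]

    def __init__(self, id, name):
        self.id = id
        self.name = name


def build_document(model):
    """Return a plain dict holding only the searchable field values."""
    return {field: getattr(model, field) for field in model.__searchable__}


post = Post(1, "Hello world")
print(build_document(post))  # → {'name': 'Hello world'}
```

This dict is what would be passed as body to index(), exactly as the original add_to_index already does.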

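What you are setting up is something different: the index settings. To use a database analogy, you are defining a table's schema inside a document.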
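To make the "schema" side of the analogy concrete, here is a rough sketch that builds the mappings from the list of searchable field names alone, without touching any field values. It assumes Elasticsearch 7 or later, where mappings are no longer nested under a type name; build_mappings and FIELD_MAPPING are hypothetical names, and the per-field mapping is copied from the question.

```python
import copy

# Per-field mapping taken from the question; the analyzers it names
# must be defined in the index settings.
FIELD_MAPPING = {
    "type": "text",
    "analyzer": "standard",
    "fields": {
        "keywordstring": {"type": "text", "analyzer": "keyword_analyzer"},
        "edgengram": {
            "type": "text",
            "analyzer": "edge_ngram_analyzer",
            "search_analyzer": "edge_ngram_search_analyzer",
        },
        "completion": {"type": "completion"},
    },
}


def build_mappings(searchable_fields):
    """Describe how each field is indexed; no document content goes here."""
    return {
        "properties": {
            name: copy.deepcopy(FIELD_MAPPING) for name in searchable_fields
        }
    }


mappings = build_mappings(["name", "body"])
print(sorted(mappings["properties"]))  # → ['body', 'name']
```

Note that "properties" sits one level above the field names, not inside each field as in the attempt from the question.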

If you want to apply the given settings, you need to use put_settings, as defined here: https://elasticsearch-py.readthedocs.io/en/master/api.html?highlight=settings#elasticsearch.client.ClusterClient.put_settings
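One possible way to wire this together, sketched under a few assumptions: the link above points at the cluster-level client, while index-level calls in elasticsearch-py live under es.indices (create, put_settings, put_mapping). As far as I know, analysis settings such as custom analyzers are normally supplied when the index is created, so the sketch uses indices.create once and then indexes plain documents. It uses the 7.x-style body= argument, as in the question; ensure_index is a hypothetical helper name, and es is expected to be an Elasticsearch client such as current_app.elasticsearch.

```python
def ensure_index(es, index, settings, mappings):
    """Create the index with its settings and mappings, but only once."""
    if not es.indices.exists(index=index):
        es.indices.create(
            index=index,
            body={"settings": settings, "mappings": mappings},
        )


def add_to_index(es, index, model):
    """Index one document; the body holds only the field values."""
    document = {field: getattr(model, field) for field in model.__searchable__}
    es.index(index=index, id=model.id, body=document)


# Usage sketch (requires a running cluster, so it is left commented out):
# ensure_index(current_app.elasticsearch, "post", settings, mappings)
# add_to_index(current_app.elasticsearch, "post", post)
```

Keeping the two steps separate mirrors the database analogy: ensure_index defines the table once, and add_to_index inserts rows.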

I hope this helps.

