ElasticSearch _search: retrieving nested objects from an array of objects

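Problem description

I am storing survey answers collected from a large sample in ES. I need to write a query that retrieves the surveys submitted within a specific date range, but I do not want the entire survey object returned in _source.

Here is my query: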

GET _search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "surveyTimestamp": {
              "gte": 1609439400000,
              "lte": 1635704999000
            }
          }
        }
      ]
    }
  }
}

This returns a response in which each object in the hits array looks like this:

{
    "_index": "students",
    "_type": "_doc",
    "_id": "611e236fbd5d690008a36f02",
    "_score": 0,
    "_source": {
        "studentId": "d957de1a-1064-4ec3-9d24-2b0964fb0a9f",
        "surveyAnswers": [
            ....
        ],
        "owner": "090c0989-87b1-4b08-a408-5a4d8f16b2bd",
        "addedTimestamp": 1629365103,
        "updatedTimestamp": 1629365103,
        "user": {
            "id": "64573ba7-4538-4a7c-9cfe-e9870b79483d",
            "phoneNumber": "55555555",
            "name": "John I"
        },
        "surveyTimestamp": 1629363139554
    }
}

The surveyAnswers array contains many different objects, but I am only interested in objects like the following:

{
    "sectionId": "aea20aa6-1d9b-43c8-8f20-a6c4828f6c7f",
    "instanceId": 1,
    "tags": [
        "household_details"
    ],
    "answers": [
        {
            "questionId": "289db4c7-df63-42e2-8d22-fd0000312cc4",
            "answerValues": "Spouse",
            "tags": [
                "dependent_name"
            ]
        },
        {
            "questionId": "01f5298b-e589-4adf-8c50-6463c0c19b59",
            "answerValues": "Spouse: Wife/ Husband",
            "tags": [
                "relationship_to_primary"
            ]
        },
        {
            "questionId": "031a66ed-46e9-4c2e-86e6-230c70bc9cb6",
            "answerValues": "28",
            "tags": [
                "dependent_age"
            ]
        },
        {
            "questionId": "d1c34cbe-f195-405f-8afc-dd89a3f1ec6c",
            "answerValues": "Female",
            "tags": [
                "dependent_gender"
            ]
        },
        {
            "questionId": "37fa60f2-1096-48bd-bf0d-d79fd3f0566b",
            "answerValues": "Graduation or above",
            "tags": [
                "spouse_education_details"
            ]
        },
        {
            "questionId": "59376597-f3c3-4e2e-9c30-51cd484b5e5f",
            "answerValues": "No",
            "tags": [
                "dependent_disability"
            ]
        }
    ]
}

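This object represents one dependent of the person who answered the survey, and the surveyAnswers array may contain several such dependent objects.

I want my query response to include the gender of every dependent.

The ES query is invoked from a Lambda function on AWS, and I expect a response within 30 seconds. Initially, I fetched the entire documents and parsed the results to find the objects tagged "household_details", then extracted the relevant gender from each of them.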
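The client-side parsing described above might look roughly like the following Python sketch. The function name `dependent_genders`, the trimmed `sample_source`, and the `some_other_section` tag are assumptions made for illustration; this is not the actual Lambda code.

```python
from typing import Any, Dict, List


def dependent_genders(source: Dict[str, Any]) -> List[str]:
    """Collect the dependent_gender answer from every household_details section."""
    genders: List[str] = []
    for section in source.get("surveyAnswers", []):
        # Skip sections that do not describe a dependent.
        if "household_details" not in section.get("tags", []):
            continue
        for answer in section.get("answers", []):
            if "dependent_gender" in answer.get("tags", []):
                genders.append(answer["answerValues"])
    return genders


# Trimmed-down _source modelled on the sample document shown earlier.
# "some_other_section" is a hypothetical tag used only for this example.
sample_source = {
    "studentId": "d957de1a-1064-4ec3-9d24-2b0964fb0a9f",
    "surveyAnswers": [
        {
            "tags": ["household_details"],
            "answers": [
                {"answerValues": "28", "tags": ["dependent_age"]},
                {"answerValues": "Female", "tags": ["dependent_gender"]},
            ],
        },
        {"tags": ["some_other_section"], "answers": []},
    ],
}

print(dependent_genders(sample_source))
```

This works on one hit's `_source` at a time, so it still requires ES to ship every full document to the caller.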

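However, ES itself takes a long time to return the result set (close to 1 or 2 minutes), so I need to perform some kind of projection on the ES side to trim the results and get a faster response.

I went through the documentation on retrieving nested fields, https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html, but I could not work out how to apply it to the structure I have.

So far, the only thing I have managed is to get studentId and surveyTimestamp by doing: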

GET _search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "surveyTimestamp": {
              "gte": 1609439400000,
              "lte": 1635704999000
            }
          }
        }
      ]
    }
  },
  "fields": [
    "studentId",
    "user.id",
    "surveyTimestamp"
  ],
  "_source": false
}

How can I include the dependent_gender value of every household_details survey answer in my response?

Tags: elasticsearch

Solution
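One possible direction, offered as a sketch and not as a confirmed answer, is to keep `_source` enabled but restrict it with source filtering so that only the paths needed for the gender lookup come back. The helper name `build_search_body` is hypothetical, and the field paths are taken from the sample document in the question; whether this is fast enough depends on the index and mapping, which the question does not show.

```python
import json


def build_search_body(gte_ms: int, lte_ms: int) -> dict:
    """Build a _search body that filters by date and trims _source."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"surveyTimestamp": {"gte": gte_ms, "lte": lte_ms}}}
                ]
            }
        },
        # Source filtering: return only the listed paths from each document.
        "_source": {
            "includes": [
                "studentId",
                "surveyTimestamp",
                "surveyAnswers.tags",
                "surveyAnswers.answers.tags",
                "surveyAnswers.answers.answerValues",
            ]
        },
    }


# Same date range as the query in the question.
body = build_search_body(1609439400000, 1635704999000)
print(json.dumps(body, indent=2))
```

Source filtering shrinks each hit but still returns every section of surveyAnswers, so the caller has to pick out the sections tagged household_details. Returning only the matching array elements would likely require surveyAnswers to be mapped as a nested field and queried with a nested query plus inner_hits, which is an assumption about the mapping.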

