首页 > 解决方案 > 对集合的子集执行 $lookup

问题描述

我有这个数据

[
{"_id":0,"a":1,"b":1,"source":1},
{"_id":1,"a":1,"c":4,"source":1},
{"_id":2,"a":2,"d":6,"source":1},
{"_id":3,"a":2,"e":6,"source":1},
{"_id":4,"a":2,"f":6,"source":1},
{"_id":5,"a":3,"d":6,"source":1},
{"_id":6,"a":3,"b":1,"source":1},
{"_id":7,"a":3,"f":6,"source":1},
{"_id":8,"a":3,"qq":3,"source":2},
{"_id":9,"a":3,"fl":6,"source":2}
]

我想返回其a字段等于具有a字段的文档的字段的所有文档b。此外,所有内容都必须来自源 1。

最终结果应该是这样的:

[
{"_id":0,a":1,"b":1,"source":1},
{"_id":1,"a":1,"c":4,"source":1},
{"_id":5,"a":3,"d":6,"source":1},
{"_id":6,"a":3,"b":1,"source":1},
{"_id":7,"a":3,"f":6,"source":1}
]

以下查询为我提供了我想要的结果:

myCollection.aggregate([{"$match":{"b":{"$exists":true},"source":1}},
                {"$group":{"_id":null, "a":{"$addToSet":"$a"}}},
                {"$unwind":{"path":"$a"}},
                {"$project":{"_id":false}},
                {"$lookup":
                    {"from": "myCollection",
                     "localField":"a",
                     "foreignField":"a",
                     "as":"results"}},
                {"$project":{"a":false}},
                {"$unwind":{"path":"$results"}},
                {"$replaceRoot":{"newRoot":"$results"}},
                {"$match":{"source":1}}
                ])

但是,必须添加最后一条{"$match":{"source":1}}语句让我想到,对于大型数据集,该$lookup语句会产生很多不需要的结果,然后会被我的最后一条$match语句过滤掉。有没有办法通过限制 $lookup 来自 myCollection 的 source 等于 1 的文档来防止它们的生成?

即替换

{"$lookup":
    {"from": "myCollection"

有类似的东西

{"$lookup":
    {"from": myCollection.match({"source":1})

或者,我可以使用更高效的管道吗?

标签: mongodb

解决方案


$lookup您可以过滤阶段管道中的少量文档。这将有助于获得一些性能并避免不必要的结果。你可以像下面这样使用它:

{
  "$lookup": {
    "from": "collection",
    "let": {
      a_: "$a"
    },
    "pipeline": [
      {
        "$match": {
          $expr: {
            $and: [
              {
                $eq: [
                  "$source",
                  1
                ]
              },
              {
                $eq: [
                  "$a",
                  "$$a_"
                ]
              }
            ]
          }
        }
      }
    ],
    "as": "results"
  }
}

你的$project舞台,

{"$project":{"a":false}}

其实没用,可以省略。


推荐阅读