MongoDB Shell aggregation pipeline not grouping the way I want

Problem description

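Design and implement a MongoDB aggregation pipeline to show the total number of employees by state for all companies that have offices in the USA.

My original solution was this: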

db.research.aggregate([
   { $match: { "offices.country_code": "USA" } },
   { $group: { _id: "$offices.state_code", total: { $sum: "$number_of_employees" } } }
])

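The problem is that each document can have multiple offices, so many of the returned results are combinations of different state codes. For example, here are some of the results: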

{ "_id" : [ null, "IL" ], "total" : 61 }
{ "_id" : [ "NY", "CA", null ], "total" : 20 }
{ "_id" : [ "UT", "CA", "NY", null ], "total" : 0 }
{ "_id" : [ null, "GA", null, null ], "total" : 0 }
{ "_id" : [ "WA", "WA", "NY", "CA", "MA", null ], "total" : 1650 }
{ "_id" : [ "TX", "IL" ], "total" : 22 }
{ "_id" : [ "MA", "NV" ], "total" : 0 }
{ "_id" : [ "NY", "CO", "CA" ], "total" : 0 }
{ "_id" : [ "MA", "MA" ], "total" : 25 }
{ "_id" : [ "VA", "NY", "CA", null, null ], "total" : 5 }
{ "_id" : [ "CA", "CA", "NY", "IL", "CA", null, null ], "total" : 0 }
{ "_id" : [ "CA", null, "NY", "CA", null ], "total" : 50 }
{ "_id" : [ "CA", "ME" ], "total" : 2 }
{ "_id" : [ "MA" ], "total" : 255864 }
{ "_id" : [ "DE" ], "total" : 409 }
{ "_id" : [ "DC", "CA", "VA" ], "total" : 7000 }
{ "_id" : [ "ND" ], "total" : 18 }
{ "_id" : [ null, null, "VA", null, null, null, "VT", null, null ], "total" : 0 }
{ "_id" : [ "NC", null, null, null ], "total" : 290 }
{ "_id" : [ "SC", "NY" ], "total" : 0 }

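Instead, I want each "state_code" listed separately with the number of employees for that state. If a document has multiple state codes, it should count toward each of them. For example, if a document has 1000 employees and the state codes "NY" and "CA", then 1000 should be added to both totals. I don't want it to list the unique combinations of state codes as it does now. Any ideas?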

Tags: mongodb, aggregation-framework

Solution


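Because offices is an array, "$offices.state_code" resolves to an array holding the state code of every office, so $group uses that whole array as the key. Deconstruct the offices array before grouping:

  • $unwind deconstructs the offices array, so each office becomes its own document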
db.research.aggregate([
  { $match: { "offices.country_code": "USA" } },
  { $unwind: "$offices" },
  {
    $group: {
      _id: "$offices.state_code",
      total: { $sum: "$number_of_employees" }
    }
  }
])
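
One thing worth checking against your data: $unwind also emits the non-USA offices of every matched company, and a company with several offices in the same state is summed once per office. If that is not what you want, a possible refinement is sketched below. The two extra stages are my assumption about the desired result, not part of the pipeline above, and the `pipeline`, `company`, `state` and `employees` names are chosen only for illustration.

```javascript
// Sketch only: collection and field names come from the question;
// the post-$unwind filter and the per-company grouping are assumptions.
const pipeline = [
  // Keep companies that have at least one office in the USA.
  { $match: { "offices.country_code": "USA" } },
  // Emit one document per office.
  { $unwind: "$offices" },
  // Assumption: drop non-USA offices and offices without a state code
  // that were unwound from the same companies.
  {
    $match: {
      "offices.country_code": "USA",
      "offices.state_code": { $ne: null }
    }
  },
  // Assumption: collapse repeated offices in the same state so that
  // each company is counted once per state.
  {
    $group: {
      _id: { company: "$_id", state: "$offices.state_code" },
      employees: { $first: "$number_of_employees" }
    }
  },
  // Sum the head count per state.
  { $group: { _id: "$_id.state", total: { $sum: "$employees" } } }
];

// `db` is only defined inside the MongoDB shell, so guard the call.
if (typeof db !== "undefined") {
  db.research.aggregate(pipeline).forEach(printjson);
}
```

If you do want a company with two offices in the same state to count twice, leave out the first $group stage and sum "$number_of_employees" directly in the last one.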
