Calculate sum, count & avg of a field in nested array - MongoDB

Problem description

I need to get docs from a collection that satisfy this condition:

last_updated >= ISODate("2020-02-26T22:10:55.364Z")

Input Collection name : intensity_log

Sample Docs :

[
  {
    junction_id:"J1",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  },
  {
    junction_id:"J2",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  }
]

Expected Output :

[
    {
        junction_id: "J1",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 2,
                count: 2,
                avg: 1
            }
        ]
    },
    {
        junction_id: "J2",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 2,
                count: 2,
                avg: 1
            }
        ]
    }
]

Tags: mongodb, mongodb-query, aggregation-framework

Solution


You can try the query below:

db.intensity_log.aggregate([
    /** match only docs where at least one last_updated is >= the given time, which reduces the number of docs passed to later stages */
    { $match: { 'intensities.data.last_updated': { $gte: ISODate("2020-02-26T22:10:55.364Z") } } },
    /** unwinding array to access objects in it */
    { $unwind: '$intensities' },
    /** filtering objects in data array which matches required criteria */
    { $addFields: { 'intensities.data': { $filter: { input: '$intensities.data', cond: { $gte: ['$$this.last_updated', ISODate("2020-02-26T22:10:55.364Z")] } } } } },
    /** adding required fields into an object named data */
    {
        $addFields: {
            'data.count': { $size: '$intensities.data' },
            'data.sum': {
                $reduce: {
                    input: '$intensities.data',
                    initialValue: 0,
                    in: {
                        $add: ["$$value", "$$this.intensity"]
                    }
                }
            }
        }
    },
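    /** adding avg field (null when a lane has no matching entries, since $divide fails on a zero divisor) & extracting lane_id from intensities to data */
    { $addFields: { 'data.avg': { $cond: [{ $eq: ['$data.count', 0] }, null, { $divide: ['$data.sum', '$data.count'] }] }, 'data.lane_id': '$intensities.lane_id' } },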
    /** Grouping on junction_id & pushing data field created on above stages */
    { $group: { _id: '$junction_id', data: { $push: '$data' } } },
    /** converting _id field name to junction_id & removing _id field from output */
    { $project: { _id: 0, junction_id: '$_id', data: 1 } }
])
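To sanity-check what the stages compute without a running server, the same logic can be sketched in plain JavaScript over in-memory documents. This is only an illustration under assumptions: `summarizeSince` and `docs` are hypothetical names, `new Date(...)` stands in for `ISODate(...)`, only junction J1 from the sample is included, and returning `null` for the average of an empty lane is a choice made in this sketch.

```javascript
// Plain JavaScript sketch of what the pipeline computes; no MongoDB required.
// `summarizeSince` is a hypothetical helper, not part of any MongoDB API.
function summarizeSince(docs, since) {
  return docs
    // like $match: keep docs that have at least one recent entry in any lane
    .filter(doc => doc.intensities.some(lane =>
      lane.data.some(d => d.last_updated >= since)))
    .map(doc => ({
      junction_id: doc.junction_id,
      // like $unwind + $filter + $addFields, regrouped per junction
      data: doc.intensities.map(lane => {
        const recent = lane.data.filter(d => d.last_updated >= since);
        const sum = recent.reduce((acc, d) => acc + d.intensity, 0);
        const count = recent.length;
        // assumption of this sketch: avg is null when nothing matched
        return { lane_id: lane.lane_id, sum, count, avg: count ? sum / count : null };
      })
    }));
}

// Junction J1 from the sample documents above.
const docs = [
  {
    junction_id: "J1",
    intensities: [
      { lane_id: "L1", data: [
        { intensity: 1, last_updated: new Date("2020-02-26T22:15:55.364Z") },
        { intensity: 1, last_updated: new Date("2020-02-26T22:10:55.364Z") },
        { intensity: 0.9, last_updated: new Date("2020-02-26T22:05:55.364Z") }
      ] },
      { lane_id: "L2", data: [
        { intensity: 1, last_updated: new Date("2020-02-26T22:15:55.364Z") },
        { intensity: 2.1, last_updated: new Date("2020-02-26T22:10:55.364Z") },
        { intensity: 1.1, last_updated: new Date("2020-02-26T22:05:55.364Z") }
      ] }
    ]
  }
];

const result = summarizeSince(docs, new Date("2020-02-26T22:10:55.364Z"));
console.log(JSON.stringify(result, null, 2));
```

With these sample values, lane L1 keeps two entries (sum 2, count 2, avg 1) and lane L2 keeps the entries with intensity 1 and 2.1 (count 2, sum about 3.1, avg about 1.55).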

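Note: You can get the same result by unwinding both array fields (a double $unwind), but that multiplies the number of documents flowing through the pipeline and can be a problem on large datasets. The query above unwinds only intensities and filters the inner data array in place, so the document count grows only by the number of lanes per junction, not by the number of data entries.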
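For comparison, the double-$unwind alternative might look roughly like the pipeline below. It is a sketch rather than a tested replacement: the variable names `since` and `doubleUnwindPipeline` are illustrative, and `new Date(...)` is used in place of `ISODate(...)` so the snippet also parses outside the Mongo shell.

```javascript
// Sketch of the double-$unwind variant; it yields one document per data entry
// before grouping, which is why it can be costly on large collections.
const since = new Date("2020-02-26T22:10:55.364Z");

const doubleUnwindPipeline = [
  // same early filter as the original query
  { $match: { "intensities.data.last_updated": { $gte: since } } },
  // one document per lane
  { $unwind: "$intensities" },
  // one document per data entry within each lane
  { $unwind: "$intensities.data" },
  // keep only entries at or after the cutoff
  { $match: { "intensities.data.last_updated": { $gte: since } } },
  // aggregate per junction and lane
  {
    $group: {
      _id: { junction_id: "$junction_id", lane_id: "$intensities.lane_id" },
      sum: { $sum: "$intensities.data.intensity" },
      count: { $sum: 1 },
      avg: { $avg: "$intensities.data.intensity" }
    }
  },
  // collect the lanes back under their junction
  {
    $group: {
      _id: "$_id.junction_id",
      data: { $push: { lane_id: "$_id.lane_id", sum: "$sum", count: "$count", avg: "$avg" } }
    }
  },
  { $project: { _id: 0, junction_id: "$_id", data: 1 } }
];

// In the Mongo shell this would be run as:
// db.intensity_log.aggregate(doubleUnwindPipeline)
```

One behavioural difference to keep in mind: in this variant a lane with no entries at or after the cutoff drops out of the result entirely, because none of its unwound documents survive the second $match.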

Test: MongoDB Playground

