How to group by date in Mongo

Problem description

I have the following collection:

{
    _id: ObjectId("6192513e0fd6ec5ad80376a7"),
    user: "mark.1239",
    connection: ISODate("2021-11-09T23:55:40.342Z"),
    disconnection: ISODate("2021-11-10T01:10:40.342Z")
},
{
    _id: ObjectId("6192513e0fd6ec5ad80376a9"),
    user: "john.9874",
    connection: ISODate("2021-11-02T07:15:42.318Z"),
    disconnection: ISODate("2021-11-02T08:40:42.318Z")
},
...

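I want to get every connected user for each 15-minute interval. So in this case, john was connected at 7:30 and 8:15.

I can group by connection or by disconnection in 15-minute intervals, but that does not give me the users who are connected during a 15-minute interval that contains neither their connection nor their disconnection timestamp.

This is what I have so far, but I cannot figure out how to modify it to get what I want: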

collection.aggregate([
   { 
      "$group": {
           "_id": {
              "$toDate": {
                  "$subtract": [
                       { "$toLong": "$connection" },
                       { "$mod": [ { "$toLong": "$connection" }, 1000 * 60 * 15 ] }
                   ]
               }
           },
        "count": { "$sum": 1 },
        "user": {"$addToSet":"$user" }
      }
   },
   { $sort: {_id: 1} }
])
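
For reference, the `_id` expression above floors each connection timestamp to the start of its 15-minute slot. A minimal sketch of the same arithmetic in plain JavaScript follows; the name `floorToSlot` is made up for illustration and is not part of MongoDB.

```javascript
// Hypothetical helper mirroring the $subtract / $mod expression in the $group key.
const SLOT_MS = 1000 * 60 * 15; // 15 minutes in milliseconds

function floorToSlot(date) {
  const ms = date.getTime();            // plays the role of { $toLong: "$connection" }
  return new Date(ms - (ms % SLOT_MS)); // $subtract of the $mod remainder, then $toDate
}

const connection = new Date("2021-11-02T07:15:42.318Z");
console.log(floorToSlot(connection).toISOString());
// 2021-11-02T07:15:00.000Z
```

So john lands only in the 07:15 group, although he stays connected until 08:40. That is the gap the question is about.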

Is it possible to achieve this using only Mongo, without JavaScript?

Tags: mongodb, aggregation-framework

Solution


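Your conditions are not entirely clear. Have a look at this outline, where each row shows the start and end (x) of a 15-minute interval relative to one connection: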

        connection      disconnection
————————————|————————————————|——————————————> t
                                        

 start + end
  x     x   |                |             | NO
        x   |           x    |             | YES/NO?
            |     x     x    |             | YES
            |     x          |    x        | YES/NO?
        x   |                |    x        | YES/NO?
            |                |    x    x   | NO
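
The six rows of the outline can be sketched as a small classifier. This is only an illustration: the names `classify` and `at` are hypothetical, and treating an interval that merely touches the connection as NO is an assumption, since the outline does not say how boundaries should be handled.

```javascript
// Hypothetical helper: classify one interval [start, end] against one
// connection [connection, disconnection]. All arguments are Date objects.
function classify(start, end, connection, disconnection) {
  // Rows 1 and 6: the interval lies entirely before or after the connection.
  if (end <= connection || start >= disconnection) return "NO";
  // Row 3: the interval lies entirely inside the connection.
  if (start >= connection && end <= disconnection) return "YES";
  // Rows 2, 4 and 5: partial overlap, which the question leaves open.
  return "YES/NO?";
}

// Shorthand for a time on 2021-11-02 (UTC), used only in this example.
const at = (hhmm) => new Date(`2021-11-02T${hhmm}:00.000Z`);

// john.9874 from the question
const connection = new Date("2021-11-02T07:15:42.318Z");
const disconnection = new Date("2021-11-02T08:40:42.318Z");

console.log(classify(at("07:00"), at("07:15"), connection, disconnection)); // NO
console.log(classify(at("07:15"), at("07:30"), connection, disconnection)); // YES/NO?
console.log(classify(at("07:30"), at("07:45"), connection, disconnection)); // YES
console.log(classify(at("08:30"), at("08:45"), connection, disconnection)); // YES/NO?
```

If the YES/NO? rows should count as connected, the test reduces to the usual overlap check: connection before the interval end and disconnection after the interval start.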

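One approach is to generate the 15-minute intervals and then filter the connections against them. It could look like this ($dateTrunc, $dateDiff and $dateAdd require MongoDB 5.0 or later):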

db.collection.aggregate([
   {
      $group: {
         _id: null,
         data: { $push: "$$ROOT" },
         // Determine total min. and max. time. Might be replaced by static values
         min: { $min: "$connection" },
         max: { $max: "$disconnection" }
      }
   },
   {
      $set: {
         // Round min/max time to 15 Minute interval
         min: { $dateTrunc: { date: "$min", unit: "minute", binSize: 15 } },
         max: { $dateTrunc: { date: "$max", unit: "minute", binSize: 15 } }
      }
   },
   {
      $set: {
         // Total minutes between min and max; $range below walks them in steps of 15
         steps: {
            $dateDiff: {
               startDate: "$min",
               endDate: "$max",
               unit: "minute"
            }
         }
      }
   },
   {
      $set: {
         // Generate the 15-Minute intervals
         intervals: {
            $map: {
               input: { $range: [0, "$steps", 15] },
               as: "t",
               in: {
                  start: {
                     $dateAdd: {
                        startDate: "$min",
                        unit: "minute",
                        amount: "$$t"
                     }
                  },
                  end: {
                     $dateAdd: {
                        startDate: "$min",
                        unit: "minute",
                        amount: { $add: ["$$t", 15] }
                     }
                  }
               }
            }
         }
      }
   },
   // Transpose array to documents
   { $unwind: "$intervals" },
   // Cosmetic: lift the interval bounds to top-level fields
   { $project: { data: 1, start: "$intervals.start", end: "$intervals.end" } },
   {
      $set: {
         data: {
            // Keep connections that span the whole interval (the YES row in the outline above)
            $filter: {
               input: "$data",
               cond: { $and: [{ $lte: ["$$this.connection", "$start"] }, { $gte: ["$$this.disconnection", "$end"] }] }
            }
         }
      }
   },
   // Skip all 15-Minute intervals without any connection
   { $match: { data: { $ne: [] } } },
   // Count the matching connections and collect the distinct users
   {
      $project: {
         start:1,
         end: 1,
         count: { $size: "$data" },
         user: { $setUnion: "$data.user" }
      }
   }
])
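
To check the logic without a database, the pipeline can be rehearsed in plain JavaScript. This is a rough sketch on the two sample documents from the question, not a replacement for the aggregation. The name `usersPerSlot` is made up, and flooring epoch milliseconds is assumed to match $dateTrunc with binSize 15 for UTC timestamps.

```javascript
// Plain-JavaScript rehearsal of the pipeline steps; names are illustrative only.
const SLOT = 15 * 60 * 1000; // 15 minutes in milliseconds

const sessions = [
  { user: "mark.1239",
    connection: new Date("2021-11-09T23:55:40.342Z"),
    disconnection: new Date("2021-11-10T01:10:40.342Z") },
  { user: "john.9874",
    connection: new Date("2021-11-02T07:15:42.318Z"),
    disconnection: new Date("2021-11-02T08:40:42.318Z") },
];

function usersPerSlot(docs) {
  const floor = (ms) => ms - (ms % SLOT); // stands in for $dateTrunc
  // $group: overall min and max, then rounded down to a slot boundary
  const min = floor(Math.min(...docs.map((d) => d.connection.getTime())));
  const max = floor(Math.max(...docs.map((d) => d.disconnection.getTime())));

  const result = [];
  // $range + $map + $unwind: one iteration per generated interval
  for (let start = min; start < max; start += SLOT) {
    const end = start + SLOT;
    // $filter: the connection must span the whole interval
    const data = docs.filter(
      (d) => d.connection.getTime() <= start && d.disconnection.getTime() >= end
    );
    // $match: skip intervals without any connection
    if (data.length === 0) continue;
    result.push({
      start: new Date(start),
      end: new Date(end),
      count: data.length,
      user: [...new Set(data.map((d) => d.user))],
    });
  }
  return result;
}

for (const row of usersPerSlot(sessions)) {
  console.log(row.start.toISOString(), row.count, row.user.join(","));
}
// The first line printed is: 2021-11-02T07:30:00.000Z 1 john.9874
```

With this filter john appears in the intervals starting at 07:30, 07:45, 08:00 and 08:15, and mark in the four intervals from 00:00 to 00:45 on 2021-11-10.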

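Another approach might be $bucket, but that would be difficult because it works only on numeric values, i.e. you would have to convert the Date values.

Tip: create a helper function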

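function between(ts, start, end) {
   let ret = {};
   if (typeof ts === "string") {
      // ts is a field name: compare field against field, which needs $expr
      ret["$expr"] = { $and: [{ $lte: ["$" + start, "$" + ts] }, { $gte: ["$" + end, "$" + ts] }] };
   } else {
      // ts is a value such as a Date: plain query operators are enough
      ret[start] = { $lte: ts };
      ret[end] = { $gte: ts };
   }
   return ret;
}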

It corresponds to the SQL BETWEEN operator, i.e. WHERE ts BETWEEN start AND end.

Then you can compose conditions like this:

{ $match: between("start", "connection", "disconnection") }
{ $match: between(new Date(), "connection", "disconnection") }

This makes writing queries much easier.
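
To see what the helper actually produces, the two calls can be printed in Node or mongosh. The sketch below repeats the function so that it runs on its own, and it uses a fixed date instead of new Date() so the output is reproducible; that date is an arbitrary choice for illustration.

```javascript
// Copy of the helper so that this snippet is self-contained.
function between(ts, start, end) {
  const ret = {};
  if (typeof ts === "string") {
    ret["$expr"] = { $and: [{ $lte: ["$" + start, "$" + ts] }, { $gte: ["$" + end, "$" + ts] }] };
  } else {
    ret[start] = { $lte: ts };
    ret[end] = { $gte: ts };
  }
  return ret;
}

// Field against fields: the result is a $expr condition.
console.log(JSON.stringify(between("start", "connection", "disconnection")));
// {"$expr":{"$and":[{"$lte":["$connection","$start"]},{"$gte":["$disconnection","$start"]}]}}

// Value against fields: the result is a plain query document.
const ts = new Date("2021-11-02T08:00:00.000Z"); // arbitrary fixed date for the example
console.log(JSON.stringify(between(ts, "connection", "disconnection")));
// {"connection":{"$lte":"2021-11-02T08:00:00.000Z"},"disconnection":{"$gte":"2021-11-02T08:00:00.000Z"}}
```

JSON.stringify renders the Date as an ISO string only for display; the object passed to $match still holds a real Date.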

