mongodb - MongoDB Aggregator - Group unique items that occur within 30 seconds
Problem description
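The problem I'm trying to solve: I have a set of documents in MongoDB that represent the moments when a user enters and exits a page. My goal is to group these into "sessions".
Definition of a session: a session is any block of unique documents that occur within 30 seconds of each other. Documents count as unique if they share the same uid, documentId, and clientType.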
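To make the rule concrete, here is a minimal sketch in plain JavaScript rather than an aggregation pipeline. It assumes that "within 30 seconds" means the gap between consecutive events with the same uid, documentId, and clientType is at most 30 seconds. The names splitIntoSessions and SESSION_GAP_MS are hypothetical and not part of MongoDB.

```javascript
// Assumption: a new session starts when the gap to the previous event
// with the same (uid, documentId, clientType) exceeds 30 seconds.
const SESSION_GAP_MS = 30 * 1000;

function splitIntoSessions(events) {
  // Sort a copy by timestamp so the caller's array is left untouched.
  const sorted = [...events].sort(
    (a, b) => Date.parse(a.occurredAt) - Date.parse(b.occurredAt)
  );
  const lastSessionByKey = new Map(); // key -> most recent session for that key
  const sessions = [];
  for (const event of sorted) {
    const key = [event.uid, event.documentId, event.clientType].join('|');
    const current = lastSessionByKey.get(key);
    const previous = current ? current[current.length - 1] : undefined;
    const gap = previous
      ? Date.parse(event.occurredAt) - Date.parse(previous.occurredAt)
      : Infinity;
    if (gap > SESSION_GAP_MS) {
      // First event for this key, or the gap is too large: open a new session.
      const session = [event];
      sessions.push(session);
      lastSessionByKey.set(key, session);
    } else {
      current.push(event);
    }
  }
  return sessions;
}

// Three events: two 10 seconds apart, then one more than three minutes later.
const sample = [
  { _id: '1', uid: 'u0', documentId: 'd0', clientType: 'web', occurredAt: '2020-06-12T17:00:22.000Z' },
  { _id: '2', uid: 'u0', documentId: 'd0', clientType: 'web', occurredAt: '2020-06-12T17:00:32.000Z' },
  { _id: '5', uid: 'u0', documentId: 'd0', clientType: 'web', occurredAt: '2020-06-12T17:03:42.000Z' },
];
const sessionIds = splitIntoSessions(sample).map((s) => s.map((e) => e._id));
console.log(sessionIds); // [ [ '1', '2' ], [ '5' ] ]
```

Under this reading, a long run of events that are each less than 30 seconds apart stays in a single session, however long the run lasts overall.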
The goal is to turn this:
[
{
"_id": "1",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:00:22.000Z"
},
{
"_id": "2",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:00:32.000Z"
},
{
"_id": "3",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0/a",
"occurredAt": "2020-06-12T17:00:42.000Z"
},
{
"_id": "4",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0/a",
"occurredAt": "2020-06-12T17:00:52.000Z"
},
{
"_id": "5",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:03:42.000Z"
},
{
"_id": "6",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:03:52.000Z"
}
]
Into this:
[
{
"_id": "d0-u0-web-2020-06-12T17:00:42.000Z",
"uid": "u0",
"documentId": "d0",
"lastViewedAt": "2020-06-12T17:00:42.000Z",
"totalDurationMilli": 20000,
"history": [
{
"routePath": "/d0",
"clientType": "web",
"totalDurationMilli": 10000
},
{
"routePath": "/d0/a",
"clientType": "web",
"totalDurationMilli": 10000
}
]
},
{
"_id": "d0-u0-web-2020-06-12T17:03:42.000Z",
"uid": "u0",
"documentId": "d0",
"lastViewedAt": "2020-06-12T17:03:42.000Z",
"totalDurationMilli": 10000,
"history": [
{
"routePath": "/d0",
"clientType": "web",
"totalDurationMilli": 10000
}
]
}
]
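Note that the two "session" documents have the same documentId but different history groups. That is because, as mentioned above, I want to split the data so that the sessions are at least 30 seconds apart.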
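The shape of one session document can be sketched in plain JavaScript. This is only an illustration under assumptions read off the expected output: each pageEnter is paired with the next pageExit on the same routePath, lastViewedAt is the occurredAt of the last pageEnter in the session, and repeated visits to the same route are not merged. The helper name summarizeSession is hypothetical.

```javascript
// Hypothetical helper: summarize ONE session whose events are already
// sorted by occurredAt and share the same uid, documentId, and clientType.
function summarizeSession(events) {
  const history = [];
  let openEnter = null; // most recent pageEnter without a matching pageExit
  let lastViewedAt = null;
  for (const event of events) {
    if (event.interactionType === 'pageEnter') {
      openEnter = event;
      lastViewedAt = event.occurredAt;
    } else if (
      event.interactionType === 'pageExit' &&
      openEnter &&
      openEnter.routePath === event.routePath
    ) {
      // Duration of this page view in milliseconds.
      history.push({
        routePath: event.routePath,
        clientType: event.clientType,
        totalDurationMilli:
          Date.parse(event.occurredAt) - Date.parse(openEnter.occurredAt),
      });
      openEnter = null;
    }
  }
  const { uid, documentId, clientType } = events[0];
  return {
    _id: [documentId, uid, clientType, lastViewedAt].join('-'),
    uid,
    documentId,
    lastViewedAt,
    totalDurationMilli: history.reduce((sum, h) => sum + h.totalDurationMilli, 0),
    history,
  };
}

// The first four sample events form the first session.
const base = { uid: 'u0', documentId: 'd0', clientType: 'web' };
const firstSession = [
  { ...base, interactionType: 'pageEnter', routePath: '/d0', occurredAt: '2020-06-12T17:00:22.000Z' },
  { ...base, interactionType: 'pageExit', routePath: '/d0', occurredAt: '2020-06-12T17:00:32.000Z' },
  { ...base, interactionType: 'pageEnter', routePath: '/d0/a', occurredAt: '2020-06-12T17:00:42.000Z' },
  { ...base, interactionType: 'pageExit', routePath: '/d0/a', occurredAt: '2020-06-12T17:00:52.000Z' },
];
const summary = summarizeSession(firstSession);
console.log(summary._id); // d0-u0-web-2020-06-12T17:00:42.000Z
console.log(summary.totalDurationMilli); // 20000
```

A pageExit with no matching pageEnter is simply skipped in this sketch; whether that is the right behavior depends on how incomplete pairs should be counted.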
So far, my aggregator looks like this:
[
// Filter by pageEnter and pageExit
{ $match: { interactionType: { $in: ['pageEnter', 'pageExit'] } } },
// Sort by occurredAt
{ $sort: { occurredAt: 1 } },
// Group by special id and compose history.
{
$group: {
_id: {
uid: '$uid',
documentId: '$documentId',
clientType: '$clientType',
},
history: { $push: '$$ROOT' },
},
},
// Project fields for final document.
{
$project: {
_id: { $concat: ['$_id.documentId', '-', '$_id.uid', '-', '$_id.clientType', '-', { $arrayElemAt: ['$history.occurredAt', 0] }] },
uid: { $arrayElemAt: ['$history.uid', 0] },
documentId: { $arrayElemAt: ['$history.documentId', 0] },
lastViewedAt: { $arrayElemAt: ['$history.occurredAt', 0] },
totalDurationMilli: 'unknown',
history: 1,
},
},
]
Which spits out this (mongodb playground):
{
_id: "d0-u0-web-2020-06-12T17:00:22.000Z",
uid: "u0",
documentId: "d0",
lastViewedAt: "2020-06-12T17:00:22.000Z",
totalDurationMilli: 'unknown',
history: [
{
"_id": "1",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:00:22.000Z"
},
{
"_id": "2",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:00:32.000Z"
},
{
"_id": "3",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0/a",
"occurredAt": "2020-06-12T17:00:42.000Z"
},
{
"_id": "4",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0/a",
"occurredAt": "2020-06-12T17:00:52.000Z"
},
{
"_id": "5",
"interactionType": "pageEnter",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:03:42.000Z"
},
{
"_id": "6",
"interactionType": "pageExit",
"uid": "u0",
"documentId": "d0",
"clientType": "web",
"routePath": "/d0",
"occurredAt": "2020-06-12T17:03:52.000Z"
}
]
}
My biggest problem is that I can't figure out how to group these items correctly. Is there any particular helper I could use to solve this?
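One direction that might help, assuming MongoDB 5.0 or later, is the $setWindowFields stage with the $shift operator: look up the previous event's timestamp within each (uid, documentId, clientType) partition, flag events whose gap exceeds 30 seconds, and turn a running sum of those flags into a session index to group on. The following is an untested sketch, not a verified solution. The field names occurredAtDate, previousAt, startsSession, and sessionIndex are hypothetical, and $toDate is used because occurredAt is stored as a string in the sample data.

```javascript
const SESSION_GAP_MS = 30 * 1000;
const partition = { uid: '$uid', documentId: '$documentId', clientType: '$clientType' };

const pipeline = [
  { $match: { interactionType: { $in: ['pageEnter', 'pageExit'] } } },
  // Convert the ISO string to a Date so that dates can be subtracted.
  { $set: { occurredAtDate: { $toDate: '$occurredAt' } } },
  // Attach the previous event's timestamp within the same partition.
  {
    $setWindowFields: {
      partitionBy: partition,
      sortBy: { occurredAtDate: 1 },
      output: {
        previousAt: { $shift: { output: '$occurredAtDate', by: -1, default: null } },
      },
    },
  },
  // Flag the first event of each session (no previous event, or gap > 30 s).
  {
    $set: {
      startsSession: {
        $cond: [
          {
            $or: [
              { $eq: ['$previousAt', null] },
              { $gt: [{ $subtract: ['$occurredAtDate', '$previousAt'] }, SESSION_GAP_MS] },
            ],
          },
          1,
          0,
        ],
      },
    },
  },
  // A running sum of the flags gives every event its session number.
  {
    $setWindowFields: {
      partitionBy: partition,
      sortBy: { occurredAtDate: 1 },
      output: {
        sessionIndex: {
          $sum: '$startsSession',
          window: { documents: ['unbounded', 'current'] },
        },
      },
    },
  },
  { $sort: { occurredAtDate: 1 } },
  // Group per session instead of per (uid, documentId, clientType) only.
  {
    $group: {
      _id: { ...partition, sessionIndex: '$sessionIndex' },
      history: { $push: '$$ROOT' },
    },
  },
];
```

The array would be passed to the collection's aggregate() method, and a $project stage similar to the existing one could follow to shape the final documents. On servers older than 5.0 these window stages are not available, so the grouping would need a different approach.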