mongodb - Aggregation query to count the latest element from nested subdocuments
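Problem description
I have a mongodb database with many users, and one of the subdocuments I track is file uploads and their statuses as they move through a review process. Every file upload will eventually have an attachment status. I want to pull some metrics that give the total number of uploaded files in each current status. I started building an aggregation query that pulls the latest attachment status subdocument from each uploaded file and counts them.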
The data structure is as follows:
{
    "userName": "johnDoe",
    "email": "johnDoe@gmail.com",
    "uploads" : [
        {
            "_id" : ObjectId("adh12451e0012ce9da0"),
            "fileName" : "TestDoc.txt",
            "fileType" : "text/plain",
            "created" : ISODate("2021-01-06T15:26:14.166Z"),
            "attachmentStatus" : [ ]
        },
        {
            "_id" : ObjectId("5ff5d6c066cacc0012ed655a"),
            "fileName" : "testerABC.txt",
            "fileType" : "text/plain",
            "created" : ISODate("2021-01-06T15:26:56.027Z"),
            "attachmentStatus" : [
                {
                    "_id" : ObjectId("60884f733f88bd00129b9ad4"),
                    "status" : "Uploaded",
                    "date" : ISODate("2021-04-22T02:23:00Z")
                },
                {
                    "_id" : ObjectId("60884f733f88bd00129b9ad5"),
                    "status" : "Processing",
                    "date" : ISODate("2021-04-26T04:54:00Z")
                }
            ]
        },
        {
            "_id" : ObjectId("6075c82a19fdcc0012f81907"),
            "fileName" : "Test file.docx",
            "fileType" : "application/word",
            "created" : ISODate("2021-04-13T16:34:50.955Z"),
            "attachmentStatus" : [
                {
                    "_id" : ObjectId("72844f733f88bd11479b9ad7"),
                    "status" : "Uploaded",
                    "date" : ISODate("2021-04-23T03:42:00Z")
                },
                {
                    "_id" : ObjectId("724986d73f88bd00147c9wt8"),
                    "status" : "Completed",
                    "date" : ISODate("2021-04-24T01:37:00Z")
                }
            ]
        }
    ]
}
{
    "userName": "janeDoe",
    "email": "janeDoe@gmail.com",
    "uploads" : [
        {
            "_id" : ObjectId("ej9784652h0012ce9da0"),
            "fileName" : "myResume.txt",
            "fileType" : "text/plain",
            "created" : ISODate("2021-02-13T12:36:14.166Z"),
            "attachmentStatus" : [
                {
                    "_id" : ObjectId("15dhdf6f88bd00147c9wt8"),
                    "status" : "Completed",
                    "date" : ISODate("2021-04-24T01:37:00Z")
                }
            ]
        },
    ]
}
How can I pull the latest attachment status for each uploaded file and then summarize the statuses?
I want something like this:
{ "status" : "Completed", "Count" : 2 }
{ "status" : "Processing", "Count" : 1 }
...
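I am very close with this aggregation query, but it grabs every status instead of only the single latest status of each file (one current status per file).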
db.myDB.aggregate([
    {
        "$match" : {
            "uploads.attachmentStatus": {
                "$elemMatch": { "status": { "$exists": true } }
            }
        }
    },
    { $unwind: "$uploads" },
    { $unwind: "$uploads.attachmentStatus" },
    {
        $sortByCount: "$uploads.attachmentStatus.status"
    },
    {
        $project: {
            _id: 0,
            status: "$_id",
            Count: "$count"
        }
    }
]).pretty();
Any suggestions?
Solution
Demo - https://mongoplayground.net/p/zzOR9qhqny0
Two stages are added after the $unwind stages:
{ $sort: { "uploads.attachmentStatus.date": -1 } },
Sorts the unwound records by status date in descending order, so the latest status of each upload comes first.
{ $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } },
Groups the records by uploads._id and takes the first status in each group, which is the latest status after sorting by date.
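Complete query
db.collection.aggregate([
    // Keep only users that have at least one upload with a recorded status.
    { $match: { "uploads.attachmentStatus": { "$elemMatch": { "status": { "$exists": true } } } } },
    // One document per uploaded file.
    { $unwind: "$uploads" },
    // One document per status entry; uploads with an empty attachmentStatus array are dropped.
    { $unwind: "$uploads.attachmentStatus" },
    // Newest status entries first.
    { $sort: { "uploads.attachmentStatus.date": -1 } },
    // For each upload, keep the first (newest) status.
    { $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } },
    // Count uploads per latest status, highest count first.
    { $sortByCount: "$status" },
    // Rename the fields to match the desired output.
    { $project: { _id: 0, status: "$_id", Count: "$count" } }
])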
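To check the logic without a database, the same idea can be sketched in plain JavaScript against an in-memory copy of the sample data. This is only an illustration and not part of the original answer: the `users` array, the shortened `_id` strings and the `countLatestStatuses` helper are hypothetical, and the sample is trimmed to the fields the pipeline reads.

```javascript
// Hypothetical in-memory copy of the sample data (only the fields the pipeline uses).
const users = [
  {
    userName: "johnDoe",
    uploads: [
      { _id: "u1", fileName: "TestDoc.txt", attachmentStatus: [] },
      {
        _id: "u2",
        fileName: "testerABC.txt",
        attachmentStatus: [
          { status: "Uploaded", date: new Date("2021-04-22T02:23:00Z") },
          { status: "Processing", date: new Date("2021-04-26T04:54:00Z") },
        ],
      },
      {
        _id: "u3",
        fileName: "Test file.docx",
        attachmentStatus: [
          { status: "Uploaded", date: new Date("2021-04-23T03:42:00Z") },
          { status: "Completed", date: new Date("2021-04-24T01:37:00Z") },
        ],
      },
    ],
  },
  {
    userName: "janeDoe",
    uploads: [
      {
        _id: "u4",
        fileName: "myResume.txt",
        attachmentStatus: [
          { status: "Completed", date: new Date("2021-04-24T01:37:00Z") },
        ],
      },
    ],
  },
];

// Hypothetical helper: count uploads by their most recent status.
function countLatestStatuses(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const upload of doc.uploads ?? []) {
      const statuses = upload.attachmentStatus ?? [];
      // Like $unwind, skip uploads that have no status entries.
      if (statuses.length === 0) continue;
      // Like $sort by date descending plus $group with $first: keep the newest entry.
      const latest = statuses.reduce((a, b) => (b.date > a.date ? b : a));
      counts.set(latest.status, (counts.get(latest.status) ?? 0) + 1);
    }
  }
  // Like $sortByCount plus $project: shape the rows and order by count, descending.
  return [...counts.entries()]
    .map(([status, Count]) => ({ status, Count }))
    .sort((a, b) => b.Count - a.Count);
}

const latestStatusCounts = countLatestStatuses(users);
console.log(latestStatusCounts);
```

With this sample, the upload that has an empty attachmentStatus array contributes nothing, and each of the other three uploads is counted once under its newest status.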