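mongodb - Mongo aggregation using $max
Problem description
I have a collection that stores history, i.e. a new document is created every time the data is changed. I need to extract fields based on the maximum value of a date field, but my query keeps either returning all dates or requiring me to push the fields into an array, which makes the data hard for end users to analyze.
Expected output in CSV format: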
MAX(DATE), docID, url, type
1579719200216, 12371, www.foodnetwork.com, food
1579719200216, 12371, www.cnn.com, news
1579719200216, 12371, www.wikipedia.com, info
Sample document:
{
  "document": {
    "revenueGroup": "fn",
    "metaDescription": "",
    "metaData": {
      "audit": {
        "lastModified": 1312414124,
        "clientId": ""
      },
      "entities": [],
      "docId": 1313943,
      "url": ""
    },
    "rootUrl": "",
    "taggedImages": {
      "totalSize": 1,
      "list": [
        {
          "image": {
            "objectId": "woman-reaching-for-basket",
            "caption": "",
            "url": "",
            "height": 3840,
            "width": 5760,
            "owner": "Facebook",
            "alt": "Woman reaching for basket"
          },
          "tags": {
            "totalSize": 4,
            "list": []
          }
        }
      ]
    },
    "title": "The 8 Best Food Items of 2020",
    "socialTitle": "The 8 Best Food Items of 2020",
    "primaryImage": {
      "objectId": "woman-reaching-for-basket.jpg",
      "caption": "",
      "url": "",
      "height": 3840,
      "width": 5760,
      "owner": "Hero Images / Getty Images",
      "alt": "Woman reaching for basket in laundry room"
    },
    "subheading": "Reduce your footprint with these top-performing diets",
    "citations": {
      "list": []
    },
    "docId": 1313943,
    "revisionId": "1313943_1579719200216",
    "templateType": "LIST",
    "documentState": {
      "activeDate": 579719200166,
      "state": "ACTIVE"
    }
  },
  "url": "",
  "items": {
    "totalSize": "",
    "list": [
      {
        "type": "recipe",
        "data": {
          "comInfo": {
            "list": [
              {
                "type": "food",
                "id": "https://www.foodnetwork.com"
              }
            ]
          },
          "type": ""
        },
        "id": 4,
        "uuid": "1313ida-qdad3-42c3-b41d-223q2eq2j"
      },
      {
        "type": "recipe",
        "data": {
          "comInfo": {
            "list": [
              {
                "type": "news",
                "id": "https://www.cnn.com"
              },
              {
                "type": "info",
                "id": "https://www.wikipedia.com"
              }
            ]
          },
          "type": "PRODUCT"
        },
        "id": 11,
        "uuid": "318231jc-da12-4475-8994-283u130d32"
      }
    ]
  },
  "vertical": "food"
}
This is the query I tried:
db.collection.aggregate([
  {
    $match: {
      vertical: "food",
      "document.documentState.state": "ACTIVE",
      "document.templateType": "LIST"
    }
  },
  {
    $unwind: "$document.items"
  },
  {
    $unwind: "$document.items.list"
  },
  {
    $unwind: "$document.items.list.contents"
  },
  {
    $unwind: "$document.items.list.contents.list"
  },
  {
    $match: {
      "document.items.list.contents.list.type": "recipe",
      "document.revenueGroup": "fn"
    }
  },
  {
    $sort: {
      "document.revisionId": -1
    }
  },
  {
    $group: {
      _id: {
        _id: {
          docId: "$document.docId",
          date: {$max: "$document.revisionId"}
        },
        url: "$document.items.list.contents.list.data.comInfo.list.id",
        type: "$document.items.list.contents.list.data.comInfo.list.type"
      }
    }
  },
  {
    $project: {
      _id: 1
    }
  },
  {
    $sort: {
      "document.items.list.contents.list.id": 1, "document.revisionId": -1
    }
  }
], {
  allowDiskUse: true
})
Solution
First, take a look at the documentation for the $group aggregation stage.
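As a rough illustration of why the `_id` of `$group` matters, here is a plain JavaScript sketch. It only imitates grouping in memory and is not MongoDB code; the `groupMaxRevision` helper and the sample rows are made up for this example. Whatever goes into `_id` is evaluated per document and becomes the grouping key, so as far as I can tell a `$max` placed inside `_id` just yields each document's own `revisionId`, and every revision ends up in its own group. A maximum across documents is only computed when `$max` is used as an accumulator field next to `_id`.

```javascript
// Hypothetical in-memory rows standing in for documents after $unwind.
const rows = [
  { docId: 1, revisionId: "1_100", url: "a.com" },
  { docId: 1, revisionId: "1_200", url: "a.com" },
  { docId: 1, revisionId: "1_300", url: "a.com" },
];

// Minimal stand-in for $group: bucket rows by a key and keep the
// largest revisionId seen in each bucket (string comparison).
function groupMaxRevision(rows, keyOf) {
  const groups = new Map();
  for (const row of rows) {
    const key = JSON.stringify(keyOf(row));
    const current = groups.get(key);
    if (current === undefined || row.revisionId > current) {
      groups.set(key, row.revisionId);
    }
  }
  return groups;
}

// The key contains the revision itself (like $max inside _id):
// every revision forms its own group.
const keyedByRevision = groupMaxRevision(rows, r => ({ docId: r.docId, date: r.revisionId }));

// The key is only docId: one group, and the max is taken across it.
const keyedByDocId = groupMaxRevision(rows, r => ({ docId: r.docId }));

console.log(keyedByRevision.size);       // 3
console.log(keyedByDocId.size);          // 1
console.log([...keyedByDocId.values()]); // [ '1_300' ]
```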
You should do it like this:
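{
  $group: {
    // one group per docId
    "_id": "$document.docId",
    // accumulator: highest revisionId within the group
    "date": {
      $max: "$document.revisionId"
    },
    // $first takes the value from the first document of each group,
    // so the result depends on the preceding $sort
    "url": {
      $first: "$document.items.list.contents.list.data.comInfo.list.id"
    },
    "type": {
      $first: "$document.items.list.contents.list.data.comInfo.list.type"
    }
  }
}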
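This should give you the output you need: one document per docId carrying the maximum revisionId, with url and type taken from the first document in each group.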
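If the goal is one row per url/type pair, taken only from the latest revision of each docId (as in the expected CSV), one possible sketch is to pick the latest revision first and unwind afterwards. The field paths below are assumptions read off the sample document (`items` at the top level, with `data.comInfo.list` inside each item) rather than the paths used in the question's query, and `latest` is a made-up field name. Because `revisionId` is a string, the sort is lexicographic, which assumes the timestamp suffix has the same number of digits for every revision of a given docId.

```javascript
// A sketch only: paths are assumed from the sample document.
const pipeline = [
  {
    $match: {
      vertical: "food",
      "document.documentState.state": "ACTIVE",
      "document.templateType": "LIST",
      "document.revenueGroup": "fn"
    }
  },
  // Newest revision first within each docId.
  { $sort: { "document.docId": 1, "document.revisionId": -1 } },
  // Keep the whole newest document per docId ("latest" is a hypothetical name).
  { $group: { _id: "$document.docId", latest: { $first: "$$ROOT" } } },
  // Only now flatten the nested arrays of that single revision.
  { $unwind: "$latest.items.list" },
  { $match: { "latest.items.list.type": "recipe" } },
  { $unwind: "$latest.items.list.data.comInfo.list" },
  {
    $project: {
      _id: 0,
      date: "$latest.document.revisionId", // full revisionId string, not a parsed number
      docId: "$_id",
      url: "$latest.items.list.data.comInfo.list.id",
      type: "$latest.items.list.data.comInfo.list.type"
    }
  },
  { $sort: { docId: 1, url: 1 } }
];
```

In the mongo shell this would be run as `db.collection.aggregate(pipeline, { allowDiskUse: true })`. Each output document is flat (date, docId, url, type), so it can be exported as one CSV row without pushing values into an array.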