mongodb - 如何为 mongodb 编写相关的 $lookup 查询?
问题描述
有一些推文数据,例如:
{
"id": 1432568863334539264,
"created_at": "2021-08-31 05:00:07",
"text": "Most of New Orleans went dark on Sunday after Hurricane Ida took out eight transmission lines. A new natural gas po… ",
"user_id": 807095
},
{
"id": 1432578972844531714,
"created_at": "2021-08-31 05:40:17",
"text": "RT @nytimes: Most of New Orleans went dark on Sunday after Hurricane Ida took out eight transmission lines. A new natural gas power plant,…",
"user_id": 1414848022849245186,
"retweet_id": 1432568863334539264,
"retweet_user_id": 807095,
"user_mentions": [
{
"id": 807095,
"indices": [
3,
11
]
}
]
},
{
"id": 1432578589107625990,
"created_at": "2021-08-31 05:38:46",
"text": "RT @nytimes: Most of New Orleans went dark on Sunday after Hurricane Ida took out eight transmission lines. A new natural gas power plant,…",
"user_id": 1345847262010875915,
"retweet_id": 1432568863334539264,
"retweet_user_id": 807095,
"user_mentions": [
{
"id": 807095,
"indices": [
3,
11
]
}
]
}
我想使用单个查找或聚合来解决:
查找在发布后的第一个小时内收到最多转发的一般推文。打印出推文 ID 和它在第一个小时内收到的转发次数。
我没有工作的脚本:
db.tweets_hurricane.aggregate([
{$match: {retweet_id:{$exists:true}}},
{$project: {_id: 0, id: '$id', retweet_id: '$retweet_id', format_date: { $toDate: '$created_at'}, format_date_end: { $add: [{$toDate: '$created_at'}, 1*60*60*1000]}}},
{$group: {_id: '$retweet_id', count: {$sum: {$switch: {branches: [{case: {$lt: ['$format_date', db.tweets_hurricane.findOne({id: '$retweet_id'}, {format_date_end:1})]}, then: 1}], default: 0}}}}},
{$sort: {count: -1}}
])
解决方案
示例数据使解决方案难以理解,但它应该可以工作:
db.tweets_hurricane.aggregate([
// Join tweets and re-tweets
{ $match: { retweet_id: { $exists: false } } },
{
$lookup:
{
from: "tweets_hurricane",
localField: "id",
foreignField: "retweet_id",
as: "retweets"
}
},
// filter retweets created in the first hour after it is published
{
$set: {
retweets: {
$filter: {
input: "$retweets",
as: "tweet",
cond: { $lt: ["$$tweet.created_at", { $add: ["$created_at", 1000 * 60 * 60] }] }
}
}
}
},
// count number of retweets
{ $set: { retweet_count: { $size: "$retweets" } } },
// limit to most retweeted tweet
{ $sort: { retweet_count: -1 } },
{ $limit: 1 },
// finalize output
{
$project: {
id: 1,
retweet_count: 1
}
}
])
推荐阅读
- r - 通过人口统计包正确创建“流行”对象
- amazon-web-services - Terraform AWS:错误:未声明变量的值
- php - 如何在 PHP 中正确应用 gzipped 解码字符串
- python - Python matplotlib 动画输出与显示的图不同
- c - 将数组传递给函数与将变量传递给函数
- c# - 奇怪的句柄必须是对话框中的有效异常
- angular - 使用 Jest 和 Testbed 测试 Angular 9 服务
- flutter - 访问子小部件状态属性
- c# - 如何在我的聊天机器人中添加 AD 身份验证 - MS 机器人框架 v4
- maven - Maven Mojo 在通过阶段调用时不使用反应器。当被称为独立目标时它工作正常