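mongodb - Find/count duplicate values in an array in MongoDB
Problem description
I am new to MongoDB and am using Robo3t.
I need to find the duplicate values in an array, based on channel_id.
From my research, I understand that I need an aggregation that groups the values and returns the corresponding counts.
I wrote the following query, but the result is not what I expected.
Sample documents: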
{
"_id" : ObjectId("59b674d141b47e5401897d31"),
"subscribed_channels" : [
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1002",
"channel_name" : "StarGold",
"channelPrice":"75"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1003",
"channel_name" : "SetMax",
"channelPrice":"80"
}
],
"viewer_account_id" : "59b6745b41b47e5401143b3d",
"public_id_type" : "PHONE_NUMBER",
"viewer_id" : "+919322264403",
"role" : "CONSUMER",
"active" : true,
"date_time_created" : NumberLong(1505129681330),
"date_time_modified" : NumberLong(1569320824387)
}
{
"_id" : ObjectId("59b674d141b47e5401897d31"),
"subscribed_channels" : [
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1002",
"channel_name" : "StarGold",
"channelPrice":"75"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
}
],
"viewer_account_id" : "59b6745b41b47e5401143c56",
"public_id_type" : "PHONE_NUMBER",
"viewer_id" : "+919322264404",
"role" : "CONSUMER",
"active" : true,
"date_time_created" : NumberLong(1505129681330),
"date_time_modified" : NumberLong(1569320824387)
}
The above are just 2 records from the viewers collection.
Query:
db.getCollection('viewers').aggregate([
{
"$group" :
{_id:{
//viewer_id:"$consumer_id",
enterprise_id:"$subscribed_channels.channel_id",
},
"viewer_id": {
$first: "$viewer_id"
},
count:{$sum:1}
}},
{
"$match": {"count": { "$gt": 1 }}
}
])
Actual output:
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1002",
"1003"
]
},
"consumer_id" : "+919322264403",
"count" : 2.0
}
{
"_id" : {
"enterprise_id" : [
"1001",
"1002",
"1001",
"1001"
]
},
"consumer_id" : "+919322264404",
"count" : 2.0
}
Expected output:
I want to group by subscribed_channels.channel_id and get a count for each group.
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1002",
"1003"
]
},
"consumer_id" : "+919322264403",
"count" : 2.0
}
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1001",
"1002"
]
},
"consumer_id" : "+919322264404",
"count" : 3.0
}
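The query does not group by channel_id, and the count is wrong as well.
The count reflects neither the number of subscribed channel IDs nor the number of duplicated channel IDs.
Please guide me in building a query that gives the correct result.
Solution
Try the query below:
Query: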
db.collection.aggregate([
/** project only needed fields & transform fields as you like */
{
$project: {
customer_id: "$viewer_id",
enterprise_id: "$subscribed_channels.channel_id",
count: {
/** Subtract size of original array & newly formed array which has unique values to get count of duplicates */
$subtract: [
{
$size: "$subscribed_channels.channel_id" // get size of original array
},
{
$size: {
$setUnion: ["$subscribed_channels.channel_id", []] // This will give you an array with unique elements & get size of it
}
}
]
}
}
}
]);
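To make the arithmetic in the `$subtract` stage easier to follow, here is a minimal plain-JavaScript sketch of the same idea, run on the channel_id arrays from the two sample documents. It is not MongoDB code: `countDuplicates` is a hypothetical helper name, and a `Set` stands in for `$setUnion`. Note that this approach counts the extra copies in each array (1 and 2 for the samples), not the number of occurrences of the repeated id (2 and 3) shown in the expected output.

```javascript
// Plain-JavaScript illustration of the $size / $setUnion / $subtract arithmetic.
// `countDuplicates` is a hypothetical helper, not part of any MongoDB API.
function countDuplicates(channelIds) {
  // A Set keeps only distinct values, analogous to $setUnion: [array, []].
  const uniqueCount = new Set(channelIds).size;
  // Original size minus unique size = number of extra (duplicate) entries.
  return channelIds.length - uniqueCount;
}

// channel_id arrays taken from the two sample documents
const firstViewer = ["1001", "1002", "1001", "1003"];
const secondViewer = ["1001", "1002", "1001", "1001"];

console.log(countDuplicates(firstViewer));  // 1
console.log(countDuplicates(secondViewer)); // 2
```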
Test: MongoDB Playground
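If what is needed is the number of occurrences per channel_id, as the expected output in the question suggests (2 and 3 for channel 1001), an alternative is to unwind the array and group on viewer plus channel. The sketch below is one possible pipeline, not the answer's method. The variable name `duplicateChannelsPipeline` is an assumption, as is grouping on `viewer_id` rather than `_id` (both sample documents show the same `_id`).

```javascript
// A possible alternative pipeline: one output group per (viewer, channel) pair.
// `duplicateChannelsPipeline` is an illustrative name, not from the original post.
const duplicateChannelsPipeline = [
  // Emit one document per element of subscribed_channels.
  { $unwind: "$subscribed_channels" },
  // Count how often each channel_id appears for each viewer.
  {
    $group: {
      _id: {
        viewer_id: "$viewer_id",
        channel_id: "$subscribed_channels.channel_id"
      },
      count: { $sum: 1 }
    }
  },
  // Keep only channel ids that occur more than once for a viewer.
  { $match: { count: { $gt: 1 } } }
];

// In the mongo shell or Robo3t (assuming the collection is named 'viewers'):
// db.getCollection('viewers').aggregate(duplicateChannelsPipeline)
```

With the two sample documents, this should leave one group per viewer, each for channel 1001, with counts of 2 and 3 respectively.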