Finding duplicates in an array within each document in MongoDB

Problem description

Suppose I have some documents with this structure:

{
   _id: ObjectId('444455'),
   name: 'test',
   email: 'email',
   points: {
      spendable: 23,
      history: [
          {
             comment: 'Points earned by transaction #1234',
             points: 1
          },
          {
             comment: 'Points earned by transaction #456',
             points: 3
          },
          {
             comment: 'Points earned by transaction #456',
             points: 3
          }
      ]
   }
}

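My problem is that some documents contain duplicate objects in the points.history array.

Is there an easy way to find all of these duplicates with a query?

I have already tried the query from "Find duplicate records in MongoDB", but it reports the total count of each duplicated entry across all documents. I need an overview of the duplicates per document, like this: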

{
    _id: ObjectId('444455'), // _id of the document, not of the array item itself
    duplicates: [
       {
        comment: 'Points earned by transaction #456'
       }
    ]
}, {
    _id: ObjectId('444456'), // _id of the document, not of the array item itself
    duplicates: [
       {
        comment: 'Points earned by transaction #66234'
       },
       {
        comment: 'Points earned by transaction #7989'
       }
    ]
}

How can I achieve this?

Tags: mongodb, aggregation-framework

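Solution

Try the following aggregation pipeline. It unwinds points.history, groups the entries by document _id, comment, and points, and keeps only the groups that occur more than once: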

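collectionName.aggregate([
  // Emit one document per element of points.history
  {
    $unwind: "$points.history"
  },
  // Count identical (document, comment, points) combinations
  {
    $group: {
      _id: {
        id: "$_id",
        comment: "$points.history.comment",
        points: "$points.history.points"
      },
      sum: {
        $sum: 1
      }
    }
  },
  // Keep only the combinations that occur more than once
  {
    $match: {
      sum: {
        $gt: 1
      }
    }
  },
  // Regroup by the original document _id (stored as _id.id by the
  // first $group) and collect that document's duplicates in an array
  {
    $group: {
      _id: "$_id.id",
      duplicates: {
        $push: {
          comment: "$_id.comment"
        }
      }
    }
  }
])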
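
To check the expected result without a database, the same idea can be sketched in plain JavaScript: count each (comment, points) pair inside a single document, keep the pairs seen more than once, and emit one result per document, as the question asks. This is only an illustration under assumptions: findDuplicateHistory and the sample data are hypothetical names made up for this sketch, the _id values are plain strings instead of ObjectId, and none of it is part of MongoDB.

```javascript
// In-memory sketch of the per-document duplicate search (no MongoDB needed).
// findDuplicateHistory is a hypothetical helper, not a MongoDB or driver API.
function findDuplicateHistory(docs) {
  const result = [];
  for (const doc of docs) {
    // Count each (comment, points) pair within this one document
    const counts = new Map();
    for (const entry of doc.points.history) {
      const key = JSON.stringify([entry.comment, entry.points]);
      counts.set(key, (counts.get(key) || 0) + 1);
    }
    // Keep only the pairs that occur more than once
    const duplicates = [];
    for (const [key, count] of counts) {
      if (count > 1) {
        const [comment] = JSON.parse(key);
        duplicates.push({ comment });
      }
    }
    // Emit one result per document that actually has duplicates
    if (duplicates.length > 0) {
      result.push({ _id: doc._id, duplicates });
    }
  }
  return result;
}

// Sample data modeled on the question; string ids stand in for ObjectId
const sample = [
  { _id: '444455', points: { history: [
    { comment: 'Points earned by transaction #1234', points: 1 },
    { comment: 'Points earned by transaction #456', points: 3 },
    { comment: 'Points earned by transaction #456', points: 3 },
  ] } },
  { _id: '444457', points: { history: [
    { comment: 'Points earned by transaction #1', points: 2 },
  ] } },
];

console.log(JSON.stringify(findDuplicateHistory(sample)));
// prints [{"_id":"444455","duplicates":[{"comment":"Points earned by transaction #456"}]}]
```

Because points is part of the key, two entries with the same comment but different point values are not treated as duplicates here. Drop points from the key if the comment alone should decide.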
