MongoDB - Get the latest date from documents in an array

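Problem description

I want to retrieve the array object with the latest date for a specific document. Unfortunately I can't work it out; I always end up with an error.

Date format: 2020-06-10T13:25:25.645+00:00, created with datetime.now()

Sample data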

collection.insert_one(
    {
        "document_name": "My Document",
        "status": [
             {
                "status_time": datetimeobject,  # 2020-01-02T13:25:25.645+00:00
                "status_title": "Sample Title 1"
             },
             {
                "status_time": datetimeobject,  # 2020-06-10T13:25:25.645+00:00
                "status_title": "Sample Title 2"
             }
         ]
    })

What I tried

result = collection.find_one({"document_name": "My Document"}, {"status": 1}).sort({"status.status_time": -1}).limit(1)

result = collection.find_one({"document_name": "My Document"}, {"$max": {"status.status_time": -1})

result = collection_projects.find_one({"document_name": "Document"}, {"status": {"$elemMatch": {"$max": "$´status_time"}}})

result = list(collection.find({"document_name": "Document"}, {"_id": 0, "status": 1}).limit(1))

result = collection_projects.find_one(
    {"document_name": "My Document"},
    {"status.status_time": {"$arrayElemAt": -1}})

The result I'm looking for

{
    "status_time": datetimeobject, # 2020-06-10T13:25:25.645+00:00
    "status_title": "Sample Title 2"  
}

Tags: python, arrays, mongodb, mongodb-query, aggregation-framework

Solution

You need to use aggregation to achieve this:

Query 1:

db.collection.aggregate([
    /** Re-create `status` field with what is needed */
    {
      $addFields: {
        status: {
          $reduce: {
            input: "$status", // Iterate on array
            initialValue: { initialDate: ISODate("1970-06-09T17:56:34.350Z"), doc: {} }, // Create initial values
            in: { // If the current element is newer, it replaces the accumulator; otherwise the accumulator is kept as is
              initialDate: { $cond: [ { $gt: [ "$$this.status_time", "$$value.initialDate" ] }, "$$this.status_time", "$$value.initialDate" ] },
              doc: { $cond: [ { $gt: [ "$$this.status_time", "$$value.initialDate" ] }, "$$this", "$$value.doc" ] }
            }
          }
        }
      }
    },
    /** 
     * Re-create the `status` field from `$status.doc`.
     * Since it will always hold only one object, `status` can be an object rather than an array.
     * If `status` needs to stay an array, use { status: [ "$status.doc" ] } instead.
     */
    {
      $addFields: { status: "$status.doc" }
    }
  ])

Test: mongoplayground

Reference: $reduce, pymongo
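
Since the question uses pymongo, Query 1 can also be written as plain Python dicts. The sketch below is a hedged translation rather than the answerer's exact code: the leading `$match` stage, the function names `latest_status_pipeline` and `latest_status`, the accumulator key `latest`, and the `EPOCH` sentinel are all assumptions added for illustration. `latest_status` mirrors the `$reduce` logic in pure Python so the idea can be checked without a server.

```python
from datetime import datetime, timezone

# Sentinel older than any real status_time (assumption: no status predates 1970).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def latest_status_pipeline(document_name):
    """Build a $reduce-based pipeline as plain dicts for collection.aggregate()."""
    is_newer = {"$gt": ["$$this.status_time", "$$value.latest"]}
    return [
        # Assumption: restrict to one document, as the question's find_one() does.
        {"$match": {"document_name": document_name}},
        {"$addFields": {"status": {"$reduce": {
            "input": "$status",
            "initialValue": {"latest": EPOCH, "doc": {}},
            "in": {
                # Keep the newest timestamp seen so far ...
                "latest": {"$cond": [is_newer, "$$this.status_time", "$$value.latest"]},
                # ... and the array element it belongs to.
                "doc": {"$cond": [is_newer, "$$this", "$$value.doc"]},
            },
        }}}},
        # Replace the accumulator with just the winning element.
        {"$addFields": {"status": "$status.doc"}},
    ]


def latest_status(statuses):
    """Pure-Python mirror of the $reduce step above."""
    acc = {"latest": EPOCH, "doc": {}}
    for item in statuses:
        if item["status_time"] > acc["latest"]:
            acc = {"latest": item["status_time"], "doc": item}
    return acc["doc"]


# Usage with a real pymongo collection (not executed here):
#   doc = next(collection.aggregate(latest_status_pipeline("My Document")), None)
#   newest = doc["status"] if doc else None
```

An empty `status` array yields `{}` in both versions, because the initial accumulator is never replaced.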

Query 2:

db.collection.aggregate([
    /** unwind on `status` array  */
    {
      $unwind: {
        path: "$status",
        preserveNullAndEmptyArrays: true // preserves doc where `status` field is `[]` or null or missing (Optional)
      }
    },
    /** sort on descending order */
    {
      $sort: { "status.status_time": -1 }
    },
    /** group on `_id` & pick first found doc */
    {
      $group: { _id: "$_id", doc: { $first: "$$ROOT" } }
    },
    /** make `doc` field as new root */
    {
      $replaceRoot: { newRoot: "$doc" }
    }
  ])

Test: mongoplayground
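
Query 2 translates to pymongo the same way. This is a rough sketch under stated assumptions: the `$match` stage, the names `unwind_sort_pipeline`, `emulate_unwind_sort_group` and `SAMPLE_DOCS`, and the second sample document are hypothetical additions. The emulation walks through the unwind, sort, and group-first steps in pure Python. Unlike the real stage with `preserveNullAndEmptyArrays`, it simply drops documents whose `status` is empty or missing.

```python
from datetime import datetime

# Hypothetical sample documents, modeled on the question's data.
SAMPLE_DOCS = [
    {"_id": 1, "document_name": "My Document", "status": [
        {"status_time": datetime(2020, 1, 2, 13, 25, 25), "status_title": "Sample Title 1"},
        {"status_time": datetime(2020, 6, 10, 13, 25, 25), "status_title": "Sample Title 2"},
    ]},
    {"_id": 2, "document_name": "Other Document", "status": [
        {"status_time": datetime(2020, 3, 1, 8, 0, 0), "status_title": "Only Status"},
    ]},
]


def unwind_sort_pipeline(document_name):
    """Build the $unwind/$sort/$group pipeline as plain dicts for pymongo."""
    return [
        # Assumption: filter first so only the wanted document is unwound.
        {"$match": {"document_name": document_name}},
        {"$unwind": {"path": "$status", "preserveNullAndEmptyArrays": True}},
        {"$sort": {"status.status_time": -1}},
        {"$group": {"_id": "$_id", "doc": {"$first": "$$ROOT"}}},
        {"$replaceRoot": {"newRoot": "$doc"}},
    ]


def emulate_unwind_sort_group(documents):
    """Pure-Python walk-through of the same three steps."""
    # $unwind: one row per (document, status element).
    rows = [
        {**doc, "status": status}
        for doc in documents
        for status in (doc.get("status") or [])
    ]
    # $sort: newest status first.
    rows.sort(key=lambda row: row["status"]["status_time"], reverse=True)
    # $group with $first: keep the first (newest) row seen for each _id.
    first_per_id = {}
    for row in rows:
        first_per_id.setdefault(row["_id"], row)
    return list(first_per_id.values())
```

With a real collection, `collection.aggregate(unwind_sort_pipeline("My Document"))` returns documents whose `status` field is the single newest element rather than an array.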

Test both queries. I believe that on a huge dataset `$unwind` followed by `$sort` may be somewhat slow, much like iterating over a huge array.
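
When only one document is involved and its `status` array is small, another option is to fetch the array and pick the newest element in Python. This is a different technique from the two aggregation queries above, shown only as a minimal sketch. The helper name `latest_status_client_side` is hypothetical, and `collection` is assumed to be a pymongo `Collection` (any object with a compatible `find_one` works).

```python
from operator import itemgetter


def latest_status_client_side(collection, document_name):
    """Fetch only the `status` array and return its newest element, or None."""
    doc = collection.find_one(
        {"document_name": document_name},
        {"_id": 0, "status": 1},  # projection: return just the array
    )
    # find_one() returns a dict or None, never a cursor, so there is no .sort() to chain.
    statuses = (doc or {}).get("status") or []
    return max(statuses, key=itemgetter("status_time"), default=None)
```

This moves the whole array over the network, so for very large arrays or many documents the server-side pipelines are likely the better fit.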

