Problem description

I have a collection in MongoDB that stores daily clickstream data. Its structure looks like this:

{"utcDate": ISODate(date1), "userToken":"user-id1", ..}
{"utcDate": ISODate(date1), "userToken":"user-id2", ..}
{"utcDate": ISODate(date2), "userToken":"user-id1", ..}
{"utcDate": ISODate(date2), "userToken":"user-id2", ..}

I am trying to get the number of daily active users within a date range. This is my current query:

[
  {
    "$project": {
      "utcDate~~~day": {
        "$let": {
          "vars": {
            "column": "$utcDate"
          },
          "in": {
            "___date": {
              "$dateToString": {
                "format": "%Y-%m-%d",
                "date": "$$column"
              }
            }
          }
        }
      },
      "userToken": "$userToken"
    }
  },
  {
    "$match": {
      "utcDate~~~day": {
        "$gte": {
          "___date": "2019-04-01"
        },
        "$lte": {
          "___date": "2019-04-08"
        }
      }
    }
  },
  {
    "$project": {
      "_id": "$_id",
      "___group": {
        "utcDate~~~day": "$utcDate~~~day"
      },
      "userToken": "$userToken"
    }
  },
  {
    "$group": {
      "_id": "$___group",
      "count": {
        "$addToSet": "$userToken"
      }
    }
  },
  {
    "$sort": {
      "_id": 1
    }
  },
  {
    "$project": {
      "_id": false,
      "utcDate~~~day": "$_id.utcDate~~~day",
      "count": {
        "$size": "$count"
      }
    }
  },
  {
    "$sort": {
      "utcDate~~~day": -1
    }
  }
]

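How can I optimize this query? I currently have separate single-field indexes on utcDate and on userToken. I have read that a compound index would help here; what should my index look like?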

These are my current indexes:

[

    {
        "v" : 2,
        "key" : {
            "userToken" : 1
        },
        "name" : "userToken_1",
        "ns" : "events.user_events"
    },
    {
        "v" : 2,
        "key" : {
            "utcDate" : 1
        },
        "name" : "utcDate_1",
        "ns" : "events.user_events"
    }
]
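
For illustration, the compound index the question asks about might be declared roughly as follows. This is a sketch under assumptions, not a verified recommendation: the key order (utcDate first, then userToken) and the variable names `compoundIndexKeys` and `compoundIndexOptions` are chosen here for the example, and whether the planner actually uses the index should be checked with `explain()` on the real data.

```javascript
// Hypothetical compound index specification (an assumption, not taken from the question).
// utcDate comes first because the query filters on a date range;
// userToken follows so that both fields the query needs live in one index.
const compoundIndexKeys = { utcDate: 1, userToken: 1 };

// Spelling out the name only makes the index easy to recognise in explain() output.
const compoundIndexOptions = { name: "utcDate_1_userToken_1" };

// In mongosh, connected to the "events" database from the listing above:
//   db.user_events.createIndex(compoundIndexKeys, compoundIndexOptions)
```

With such an index in place, the existing single-field index on utcDate would likely become redundant, since the compound index shares the same prefix.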

Tags: mongodb

Solution
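
One possible direction, offered as a sketch rather than a confirmed answer: the query in the question computes a formatted day string in `$project` before it filters, so the `$match` stage operates on a derived field and cannot use the index on utcDate. Filtering on the raw utcDate first, and only then grouping by day, lets the first stage use an index. The function name `buildDailyActiveUsersPipeline`, the output field `day`, and the exclusive upper bound are assumptions made for this example. The bounds below are meant to cover the same days as the question's 2019-04-01 to 2019-04-08 range in UTC.

```javascript
// Builds the pipeline as plain objects so it can be inspected without a server.
// Sketch only; stage layout is one option among several.
function buildDailyActiveUsersPipeline(startInclusive, endExclusive) {
  return [
    // Filter on the stored date first so this stage can use an index on utcDate.
    { $match: { utcDate: { $gte: startInclusive, $lt: endExclusive } } },
    // Collapse events to one document per (day, user) pair.
    {
      $group: {
        _id: {
          day: { $dateToString: { format: "%Y-%m-%d", date: "$utcDate" } },
          userToken: "$userToken"
        }
      }
    },
    // Count the distinct users per day instead of collecting them with $addToSet.
    { $group: { _id: "$_id.day", count: { $sum: 1 } } },
    // Newest day first, matching the final sort in the original query.
    { $sort: { _id: -1 } },
    { $project: { _id: 0, day: "$_id", count: 1 } }
  ];
}

const pipeline = buildDailyActiveUsersPipeline(
  new Date("2019-04-01T00:00:00Z"),
  new Date("2019-04-09T00:00:00Z") // exclusive, so all of 2019-04-08 is included
);

// In mongosh, connected to the "events" database:
//   db.user_events.aggregate(pipeline)
```

Grouping twice avoids building one large array of user tokens per day, which `$addToSet` followed by `$size` does. Running the pipeline with `explain()` would show whether the `$match` stage is served by an index scan in a given deployment.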

