首页 > 解决方案 > MongoDB find and iterate vs count

问题描述

I have a peculiar problem with Mongo.

We have a collection of 800k documents with the following structure.

{
"_id" : ObjectId("5bd844199114bab3b2c19fab"),
"u" : 0,
"c" : 0,
"iden" : "343754856",
"name" : "alan",
"email" : "mkasd@abc.com",
"mobile" : "987654321093456",
"expires" : ISODate("2018-11-29T11:44:25.453Z"),
"created" : ISODate("2018-10-30T11:44:25.453Z")
}

We have indexed iden and name on which we generally query. We tried two types of queries.

  1. db.Collection.find({"iden": "343754856", "name": "alan", "created":
    {"$gt": ....}).count()

    where "created" is an unindexed field.

  2. db.Collection.find({"iden": "343754856", "name": "alan"})

    and iterate over all records to filter based on created.

However, MongoDB seems to be taking enormous amount of time in executing the second query while it was supposed to be an optimization over 1.

Any leads on what is going wrong here? We are using Go library.

标签: mongodbperformancego

解决方案


How could the second version be an optimization over the first?

Your first query retrieves a single number from the MongoDB server: the overall count of the query result. While your second version fetches all matching documents, and you do the counting at the "client" side.

Believe me that the MongoDB can count internally the result documents just as fast as you could in your Go client. Making the MongoDB server send the results, fetching them and unmarshaling them at the client takes orders of magnitude more time (depending on lot of factors).

Please note that if you have a composite index containing "iden" and "name", even if you add more filters (like "created" in your example), the index may still be used, but the MongoDB have to iterate over the partial results to apply the rest of the query. To find out if the index is used, please execute the following command:

db.Collection.find(
    {"iden": "343754856", "name": "alan", "created": {"$gt": ....}
).explain()

推荐阅读