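mongodb - MongoDB queries are very slow on large datasets
Problem description
I am working on a deep learning project with a large dataset (nearly 10 million documents) in a customers collection. I filter on any of the customer columns as required, and almost every filtered column is a string. I can't put an index on every column (35 columns), because that is not a good idea. There are also some complex queries, such as group aggregations. A sample document looks like this: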
{
"_id" : ObjectId("5ca35824a7ad6a17e9c6eeb7"),
"batchId" : 1,
"demographicsState" : "Minnesota",
"demographicsGender" : "Female",
"jobCount" : "0 to 6",
"jobCreated" : "No",
"callResolution" : "No",
"customerEffortScore" : 2,
"phoneAccessibility" : "90 to 100",
"callRepTime" : "Just right",
"hadPriorCallsPastThirtyFiveDays" : "Yes",
"autoDebitFlag" : "No",
"servcoName" : "Monitronics",
"demographicsAge" : "45 to 54",
"checkedWebsiteFirst" : "No",
"alarmRelated" : "12-Sensor",
"reasonPrimary" : "19-Alarm, system or equipment related reason",
"inInitialTerm" : "Yes",
"callDuration" : "10 to 19",
"siteKind" : "Residential",
"customerSiteTenureDays" : "326",
"highRisk" : "No",
"monthsLeftUntilContractRenewal" : "26",
"nielsen" : "Savvy suburbs",
"callReason" : "Customer tech support",
"serviceScheduled" : "-",
"hadPriorCallsPastFiveDays" : "Yes",
"dropped" : "No",
"serviceResolution" : "80 to 89",
"dept" : 190,
"serviceRepresentative" : "90 to 100",
"demographicsIncome" : "50,000 - 74,999",
"aarpMember" : "No",
"rmr" : 44.99,
"satisfactionOverall" : 9,
"dropYes" : 1,
"dropNo" : 0,
"cltv" : 4146.578333333334
}
This is the query I need to fetch the data:
db.customers.aggregate([
  // $match takes a single document; listing several fields is an implicit AND
  {"$match": {
    "demographicsState": "Minnesota",
    "demographicsGender": "Female",
    "jobCount": "0 to 6",
    "jobCreated": "Yes",
    "callResolution": "No",
    "customerEffortScore": {"$gt": 0, "$lt": 8},
    "phoneAccessibility": "50 to 60",
    "hadPriorCallsPastThirtyFiveDays": "No",
    "autoDebitFlag": "Yes",
    "alarmRelated": "10-Sensor",
    "callDuration": "20 to 29",
    "hadPriorCallsPastFiveDays": "Yes",
    "demographicsIncome": "50,000-74,999",
    "aarpMember": "Yes",
    "rmr": {"$gt": 30, "$lt": 50},
    "dropYes": 1
  }},
  // count the matching documents per gender
  {"$group": {"_id": "$demographicsGender", "count": {"$sum": 1}}}
])
I filter and group on every column of the customers schema shown above. Please let me know if anyone has any ideas.
Solution
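One possible direction, offered as a sketch rather than a definitive answer: instead of indexing all 35 columns, create a compound index on the few fields that appear in most queries (equality fields first, range fields last), and check with explain whether the $match stage uses it. The helper name buildPipeline and the choice of indexed fields below are assumptions made for illustration; they are not taken from the question.

```javascript
// Hypothetical helper: turns a flat filter document into a $match + $group pipeline.
function buildPipeline(filters, groupField) {
  return [
    { $match: filters }, // several fields in one document are ANDed implicitly
    { $group: { _id: "$" + groupField, count: { $sum: 1 } } },
  ];
}

const pipeline = buildPipeline(
  {
    demographicsState: "Minnesota",
    demographicsGender: "Female",
    customerEffortScore: { $gt: 0, $lt: 8 },
    rmr: { $gt: 30, $lt: 50 },
  },
  "demographicsGender"
);

// In mongosh (the index fields are an assumed choice; use the fields your
// queries filter on most often, equality fields before range fields):
// db.customers.createIndex({ demographicsState: 1, demographicsGender: 1, rmr: 1 });
// db.customers.explain("executionStats").aggregate(pipeline);
```

In the explain output, an IXSCAN stage with totalDocsExamined close to the number of matching documents suggests the index is being used; a COLLSCAN means the query still reads the whole collection.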