首页 > 解决方案 > Mongoose/Mongodb,更新每个文档查询,很慢

问题描述

我在猫鼬中有这个更新查询。这是 1600 个帖子,运行大约需要 5 分钟。

瓶颈是什么?我使用了错误的方法吗?

export const getAndStoreLatestKPI = async () => {
  console.log("start kpi");
  try {
    const marketCaps = await getKPI();
    const stocks = await mongoose.model("stock").find().exec();

    for (const stock of stocks) {
      const marketCap = marketCaps.find(
        (marketCap) => marketCap.i === stock.insId
      );

      if (marketCap != null) {
        const marketCapAdjustedVal =
          stock.country === "Finland" ? marketCap.n * 10 : marketCap.n;

        const update = {
          marketCap: marketCapAdjustedVal,
        };
        console.log(marketCapAdjustedVal);
        await mongoose
          .model("stock")
          .findOneAndUpdate({ insId: stock.insId }, { update });
      }
    }
    console.log("done");
    return Promise.resolve();
  } catch (err) {
    return Promise.reject(err);
  }
};

export const getKPI = async (kpiId: number) => {
  try {
    const kpiFetch = await Axios.get(someurl);
    return Promise.resolve(kpiFetch.data.values);
  } catch (err) {
    return Promise.reject(err);
  }
};

标签: mongodbmongoose

解决方案


所以主要的瓶颈是你的for循环。对于每个库存项目,您执行几个“昂贵”的操作,例如从外部 API 获取数据 + 一次更新,并且您正在逐个执行它们。

我建议你做的是一次循环几个项目。类似于多线程的想法。有几种不同的解决方案,nodejs例如在nodejs 工作线程中如何做到这一点

但是,我个人使用并推荐使用bluebird,它可以为您提供这种能力以及许多其他直接开箱即用的能力。

一些示例代码:

import Bluebird = require('bluebird');
const stocks = await mongoose.model("stock").find().exec();

await Bluebird.map(stocks, async (stock) => {
     const marketCap = marketCaps.find(
        (marketCap) => marketCap.i === stock.insId
      );

      if (marketCap != null) {
        const marketCapAdjustedVal =
          stock.country === "Finland" ? marketCap.n * 10 : marketCap.n;

        const update = {
          marketCap: marketCapAdjustedVal,
        };
        console.log(marketCapAdjustedVal);
        await mongoose
          .model("stock")
          .findOneAndUpdate({ insId: stock.insId }, { update });
      }
}, {concurrency: 25})
// concurrency details how many concurrent process run parallel. the heavier they are the less you want concurrent for obvious reasons.

推荐阅读