Summing a large array (200,000+) of objects as fast as possible

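Problem description

I support a high-frequency API call upstream. After each call I have a large array of objects, and I need to sum the values of each field across all objects into per-field totals as quickly as possible using Node.js.

I'm sure I could do this with some kind of loop, but I wanted to see whether the smart people here can find a more efficient way.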

[{
samplefield1: 123, 
samplefield2: 345, 
samplefield3: 678, 
samplefield4: 910, 
samplefield5: 111
},
{
samplefield1: 123, 
samplefield2: 345, 
samplefield3: 678, 
samplefield4: 910, 
samplefield5: 111
},
{
samplefield1: 123, 
samplefield2: 345, 
samplefield3: 678, 
samplefield4: 910, 
samplefield5: 111
},
{
samplefield1: 123, 
samplefield2: 345, 
samplefield3: 678, 
samplefield4: 910, 
samplefield5: 111
}.... 
]

Desired output

{
samplefield1total: 1349596065934, 
samplefield2total: 5856960650505, 
samplefield3total: 4344343434343, 
samplefield4total: 44444434342910, 
samplefield5total: 79797696969696
}

As an added bonus, it would be amazing if I could also get a count of all the input records in the array, like this:

Bonus challenge output:

{
samplefield1total: 1349596065934, 
samplefield2total: 5856960650505, 
samplefield3total: 4344343434343, 
samplefield4total: 44444434342910, 
samplefield5total: 79797696969696,
recordcount: 145634
}

But if doing that in the same pass isn't efficient, I can do it separately with a simple array.length.

Thanks for any help you can offer.

Tags: node.js

Solution


The best you can do is avoid array helpers such as reduce or forEach, which invoke a callback for every element just to perform a simple addition. A plain for loop is all you need, and since the loop needs the array length anyway, the record count comes for free.

const data = [{samplefield1: 123, samplefield2: 345, samplefield3: 678, samplefield4: 910, samplefield5: 111},{samplefield1: 123, samplefield2: 345, samplefield3: 678, samplefield4: 910, samplefield5: 111},{samplefield1: 123, samplefield2: 345, samplefield3: 678, samplefield4: 910, samplefield5: 111},{samplefield1: 123, samplefield2: 345, samplefield3: 678, samplefield4: 910, samplefield5: 111}];

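// Copy the first record so the input array is not mutated while summing.
const totals = { ...data[0] };
const recordcount = data.length;
const keys = Object.keys(totals);
const keycount = keys.length;

// Start at 1: the first record's values are already in totals.
for (let i = 1; i < recordcount; i++) {
  const row = data[i];
  for (let j = 0; j < keycount; j++) {
    totals[keys[j]] += row[keys[j]];
  }
}

// The count comes for free from the array length.
totals.recordcount = recordcount;

console.log(totals);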

Unfortunately, you cannot get below O(N*M) time complexity, since the requirement is to read each of the M fields in each of the N objects at least once in order to add them.

As noted in the comments by @Shubh, this will block the event loop. If this is a high frequency API call with each call requiring this much processing, you should offload it to a worker thread. If possible, consider where you're getting the data from (is it a database?) and see if you can make it do the calculation instead.
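
As a rough illustration of the worker-thread suggestion, here is a minimal sketch using Node's built-in worker_threads module. The names sumInWorker and workerSource are made up for this example, the worker script is inlined with eval: true only to keep everything in one file, and the "total" suffix on the keys is an assumption taken from the desired output in the question. Treat it as a starting point, not a tuned implementation.

```javascript
const { Worker } = require('worker_threads');

// Worker script, inlined as a string purely to keep this sketch in one file.
// In a real project this would more likely live in its own .js file.
const workerSource = `
const { parentPort, workerData } = require('worker_threads');
const data = workerData;
const totals = {};
for (let i = 0; i < data.length; i++) {
  const row = data[i];
  for (const key in row) {
    const name = key + 'total';
    totals[name] = (totals[name] || 0) + row[key];
  }
}
totals.recordcount = data.length;
parentPort.postMessage(totals);
`;

// Hypothetical helper: runs the summation off the main thread and
// resolves with the totals object once the worker posts it back.
function sumInWorker(data) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: data });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

Usage would look like sumInWorker(data).then(totals => console.log(totals)). Keep in mind that workerData is copied to the worker, and that copy is itself proportional to the size of the array, so this pays off mostly when the worker can fetch or receive the data directly instead of having it handed over from the main thread.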

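Note that if your sums can grow past Number.MAX_SAFE_INTEGER (2^53 - 1, or 9007199254740991), you may have to use BigInts for the sums instead, because JavaScript's number type is a 64-bit float and cannot represent every integer exactly beyond that point. The totals in your sample output are still well below that limit.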
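
If you do end up needing BigInt, the loop barely changes. The sketch below assumes every field value is an integer (BigInt() throws a RangeError on fractional numbers); sumWithBigInt is a name invented for this example, and the "total" suffix again just mirrors the desired output in the question.

```javascript
// Hypothetical helper: same summation as before, but accumulating in BigInt
// so the totals stay exact beyond Number.MAX_SAFE_INTEGER.
function sumWithBigInt(data) {
  const totals = {};
  for (let i = 0; i < data.length; i++) {
    const row = data[i];
    for (const key of Object.keys(row)) {
      const name = key + 'total';
      // Assumes row[key] is an integer; BigInt(1.5) would throw.
      totals[name] = (totals[name] ?? 0n) + BigInt(row[key]);
    }
  }
  // The record count stays a regular number.
  totals.recordcount = data.length;
  return totals;
}
```

One thing to watch for: JSON.stringify throws a TypeError on BigInt values, so if the totals go back out over an API you would need to convert them to strings first.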

