Running table data cumulation effectively

Problem Description

The algorithm

We have a table (let’s call it status) with status data gathered every second. By status data I mean about five columns with Boolean values.

We have another table (let’s call it status_interval) where we cumulate the data from status: if consecutive rows (ordered by timestamp) have exactly the same status data, we create a single row in status_interval holding a start timestamp, an end timestamp, and that status data. When the status data changes, a new row is inserted into status_interval.
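To make the algorithm concrete, here is roughly what the cumulation step looks like; the column names (ts for the timestamp, flag_a through flag_e for the Boolean columns) are placeholders, since the actual schema isn’t shown here:

    // Sketch of the cumulation step described above. Column names
    // (ts, flag_a..flag_e) are placeholders, not the real schema.
    function statusKey(row) {
      // Serialize the Boolean columns so identical statuses compare equal.
      return [row.flag_a, row.flag_b, row.flag_c, row.flag_d, row.flag_e].join('|');
    }

    // rows: status rows ordered by timestamp; returns interval objects
    // ready to be written to status_interval.
    function cumulate(rows) {
      const intervals = [];
      let current = null;
      for (const row of rows) {
        if (current !== null && statusKey(row) === current.key) {
          current.end = row.ts;      // same status as before: extend the interval
        } else {
          current = { key: statusKey(row), start: row.ts, end: row.ts };
          intervals.push(current);   // status changed: open a new interval
        }
      }
      return intervals;
    }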

We currently have 36 pairs of such tables (and we’ll have 56 in the near future), so we wrote a PHP script that runs serially through all of them, and we set up a cron job to run this script every 5 minutes. This setup worked well enough when we had only 24 pairs of tables, but now it uses too much CPU (one pair of tables takes about 1m40s to process at 80-100% CPU load).

So we decided to improve the algorithm (not part of this question) and port it to Node JS. We are thinking of processing the data for each table pair concurrently, roughly as in the sketch below.
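Here is what we have in mind, as a sketch only: the concurrency limit of 4 is arbitrary, and processPair() is a placeholder standing in for the per-pair cumulation work.

    // Sketch: process all table pairs with a bounded level of concurrency,
    // so we don't kick off all 56 pairs at once.
    async function runAll(tablePairs, limit = 4) {
      const queue = [...tablePairs];
      const worker = async () => {
        while (queue.length > 0) {
          const pair = queue.shift();
          try {
            await processPair(pair);  // placeholder: read status, write status_interval
          } catch (err) {
            // The process is safe to restart, so log and move on.
            console.error('pair failed:', pair, err);
          }
        }
      };
      // Start `limit` workers that pull pairs off the shared queue.
      await Promise.all(Array.from({ length: limit }, () => worker()));
    }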

Note that the cumulation of the status data is not the most important process running on the server; if it fails or exits, we can safely restart it.

TL;DR

My question is: if we run these async functions concurrently, won’t that use too many resources (especially CPU)? And if concurrency is not the best approach in our case, what in Node JS would be?

Tags: node.js, optimization, concurrency, parallel-processing
