node.js - Updating large amount of documents in mongodb
Problem description
I am working on a solution to the following problem:
I have a large collection of entries that I regularly have to update from another database. I am using express.js with MongoDB. The update is scheduled to run every day at 1 AM: I fetch the external data, compare it to what currently exists, and update all entries in our database.
The process looks like this:
1) Fetch internal and external data
2) Combine them (past entries from our database, future entries from the other database)
3) Delete all records in our database
4) insertMany the records the program just combined
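The steps above can be sketched as follows. This is a minimal sketch: the collection name `entries`, the `date` field, and the `fetchExternal` function are assumptions, not details from the original post. The window between steps 3 and 4 is where the risk lies.

```javascript
// Pure combine step (step 2): past entries come from our database,
// future entries come from the external source. Testable without a DB.
function combine(internal, external, now) {
  return [
    ...internal.filter((entry) => entry.date < now),
    ...external.filter((entry) => entry.date >= now),
  ];
}

// Nightly sync as described above (assumed names throughout).
// A crash between deleteMany and insertMany would lose all data.
async function nightlySync(db, fetchExternal) {
  const entries = db.collection('entries'); // assumed collection name

  // 1) Fetch internal and external data
  const internal = await entries.find({}).toArray();
  const external = await fetchExternal();

  // 2) Combine them
  const combined = combine(internal, external, new Date());

  // 3) Delete all records, then 4) insert the combined set
  await entries.deleteMany({});
  await entries.insertMany(combined);
}
```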
As you can see, it's quite a risky task. If any error occurs between deleting and inserting the data, we lose everything in the collection.
My questions are:
1) Is there an effective way to revert data that has just been deleted in MongoDB? Or to keep it on hold and put it back in place if an error occurs?
2) Is there any other effective way to update a few hundred/thousand documents, apart from deleteMany -> insertMany or an updateOne on each document?
Any advice would be appreciated.
Solution
An alternative to your set of operations could be:
- fetch the external data
- insert the external data into a new database/collection
- merge the data into the main collection with an aggregation query using $merge (https://docs.mongodb.com/manual/reference/operator/aggregation/merge/)
That way you wouldn't have to delete all your data, so you avoid the possibility of data loss.
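A sketch of the $merge-based approach, assuming the external data has been loaded into a staging collection named `entries_staging` and the main collection is named `entries` (both names are assumptions). The pipeline writes matched documents over their existing counterparts and inserts documents that don't exist yet, so the main collection is never emptied:

```javascript
// Build a $merge pipeline that upserts from a staging collection
// into the target collection (names are assumptions for illustration).
function buildMergePipeline(targetCollection) {
  return [
    {
      $merge: {
        into: targetCollection,   // main collection to update
        on: '_id',                // match documents by _id (the default)
        whenMatched: 'replace',   // overwrite existing entries
        whenNotMatched: 'insert', // add entries that are new
      },
    },
  ];
}

// Usage with the Node.js driver, run against the staging collection:
// await db.collection('entries_staging')
//   .aggregate(buildMergePipeline('entries'))
//   .toArray();
```

Because $merge runs server-side, the old data stays in place until each document is matched and replaced, which sidesteps the delete-then-insert window entirely.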