Adding new documents to a production Elasticsearch cluster

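Problem description

My Elasticsearch cluster is used heavily for search queries. Once a week I receive a batch of new documents that need to be added to the index. If I add them to the live index, search slows down considerably while the cluster indexes, merges segments, or moves shards. What is the best way to avoid this slowdown?

My solution so far: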

1. Spin up an empty single-node Elasticsearch cluster.
2. Restore the index I need to update from a snapshot.
3. Add the new documents to this index.
4. Force merge the shards.
5. Snapshot the resulting index.
6. Restore the updated index on the production cluster.
7. Update the aliases to point to the updated index and delete the old index.
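For illustration, steps 2 to 7 above could be expressed as a plan of Elasticsearch REST calls. This is only a sketch: the repository, index, alias, and snapshot names are hypothetical, it assumes both clusters have the same snapshot repository registered and that the alias currently points at the old index, and it only builds the requests without sending them.

```python
import json


def bulk_body(index, docs):
    """Build the NDJSON payload for POST /_bulk: an action line plus a source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


def weekly_update_plan(repo, old_index, new_index, alias,
                       base_snapshot, new_snapshot, docs):
    """Return steps 2-7 as (cluster, method, path, body) tuples.

    All names are hypothetical. Index names are assumed to contain no
    regex metacharacters, because rename_pattern is a regular expression.
    """
    return [
        # Step 2: restore the current index on the staging node under a new name.
        ("staging", "POST",
         f"/_snapshot/{repo}/{base_snapshot}/_restore?wait_for_completion=true",
         {"indices": old_index, "include_global_state": False,
          "rename_pattern": old_index, "rename_replacement": new_index}),
        # Step 3: add the weekly batch with the bulk API.
        ("staging", "POST", "/_bulk", bulk_body(new_index, docs)),
        # Step 4: force merge down to one segment per shard.
        ("staging", "POST", f"/{new_index}/_forcemerge?max_num_segments=1", None),
        # Step 5: snapshot the updated index.
        ("staging", "PUT",
         f"/_snapshot/{repo}/{new_snapshot}?wait_for_completion=true",
         {"indices": new_index, "include_global_state": False}),
        # Step 6: restore the updated index on the production cluster.
        ("production", "POST",
         f"/_snapshot/{repo}/{new_snapshot}/_restore?wait_for_completion=true",
         {"indices": new_index, "include_global_state": False}),
        # Step 7: switch the alias and drop the old index in one atomic call.
        ("production", "POST", "/_aliases",
         {"actions": [
             {"remove": {"index": old_index, "alias": alias}},
             {"add": {"index": new_index, "alias": alias}},
             {"remove_index": {"index": old_index}},
         ]}),
    ]
```

Putting the alias switch and the old index removal into a single `_aliases` request means searches against the alias never see a moment without a backing index.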

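My thinking is that restoring from a snapshot should not take too many resources. The restored index may need to be warmed up to get better performance.

Is this a normal solution, or is it too complicated?

Perhaps Elasticsearch has a proper way to add documents without downtime or cluster slowdown?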

Tags: elasticsearch

Solution


500GB on one primary shard is something I would clearly fix before doing anything else. You have 10 nodes, so you need to spread the load over all of them. Adding nodes will not help at all while the index has only one primary shard.

The official recommendation is to keep shards between roughly 10GB and 50GB. So in your case I would split that index to have 10 primary shards (+1 replica each), so that each node can handle a part of the job. Otherwise, there is always only one node doing the write job and two nodes doing the read job, which is not optimal.
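As a rough illustration, one possible way to get to 10 primary shards is to create a new index with the desired settings and reindex into it. The answer does not say which mechanism it has in mind, so treat this as an assumption; the index names are hypothetical, mappings for the new index are omitted, and the sketch only builds the requests without sending them.

```python
def shard_size_gb(total_gb, primaries):
    """Average primary shard size if the data is spread evenly."""
    return total_gb / primaries


def reshard_plan(source, target, primaries=10, replicas=1):
    """Return (method, path, body) tuples for re-sharding by reindexing.

    Index names are hypothetical. Reindex does not copy mappings, so a
    real request would also define them when creating the target index.
    """
    return [
        # Create the new index with the desired number of primaries and replicas.
        ("PUT", f"/{target}",
         {"settings": {"index": {"number_of_shards": primaries,
                                 "number_of_replicas": replicas}}}),
        # Copy the documents over as a background task.
        ("POST", "/_reindex?wait_for_completion=false",
         {"source": {"index": source}, "dest": {"index": target}}),
    ]
```

With the 500GB figure mentioned above, `shard_size_gb(500, 10)` gives 50GB per primary shard, which sits at the upper end of the recommended range.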

So before coming up with a way to circumvent the issue, fix the issue as I described above. Your cluster will be much better off, because 10 nodes should definitely handle 5TB easily without having to resort to a complex update procedure like the one you listed.

Try it out...

