ElasticSearch fails with OutOfMemoryError

Problem description

I have a 3-node ES cluster in green status. Everything was working fine, but recently there have been some failures.

[2019-04-22T11:05:37,099][WARN ][o.e.t.OutboundHandler    ] [node_1] send message failed [channel: Netty4TcpChannel{localAddress=/172.0.0.1:9300, remoteAddress=/172.0.0.2:41674}]
java.nio.channels.ClosedChannelException: null
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2019-04-22T11:05:37,096][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node_1] fatal error in thread [elasticsearch[node_1][search][T#2]], exiting
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.ArrayUtil.growExact(ArrayUtil.java:302) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:311) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:715) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.UTF32ToUTF8.all(UTF32ToUTF8.java:247) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.UTF32ToUTF8.end(UTF32ToUTF8.java:231) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.UTF32ToUTF8.build(UTF32ToUTF8.java:194) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.UTF32ToUTF8.convertOneEdge(UTF32ToUTF8.java:137) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.UTF32ToUTF8.convert(UTF32ToUTF8.java:307) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:230) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:104) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:81) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.search.WildcardQuery.<init>(WildcardQuery.java:67) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
        at org.apache.lucene.queryparser.classic.QueryParserBase.newWildcardQuery(QueryParserBase.java:644) ~[lucene-queryparser-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:17:41]
        at org.apache.lucene.queryparser.classic.QueryParserBase.getWildcardQuery(QueryParserBase.java:703) ~[lucene-queryparser-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:17:41]
        at org.elasticsearch.index.search.QueryStringQueryParser.getWildcardQuerySingle(QueryStringQueryParser.java:682) ~[elasticsearch-6.7.1.jar:6.7.1]

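I also have a few questions:

  1. As you can see in the log, the wildcard query was the last thing logged (2 times). Does this mean ES failed while processing the wildcard query, or is it just a coincidence?
  2. If so, what is the cause of the failure? Can a bad search query cause an error like this?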

Tags: elasticsearch

Solution


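As lendrojmp said, wildcard queries can use a lot of memory, especially when the pattern starts with *.

According to the documentation:

In order to prevent extremely slow wildcard queries, a wildcard term should not start with one of the wildcards * or ?. The wildcard query maps to Lucene WildcardQuery.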

https://www.elastic.co/guide/en/elasticsearch/reference/7.0/query-dsl-wildcard-query.html
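
The stack trace in the question goes through QueryStringQueryParser, so the query was probably a query_string query. As a minimal sketch (not the original poster's code), the snippet below shows two guards an application could apply before sending such a query: a client-side check for terms that begin with * or ?, and a request body that sets the query_string option allow_leading_wildcard to false so Elasticsearch itself should reject leading wildcards. The helper names are hypothetical, and the check is deliberately simple: it splits on whitespace and does not understand quoted phrases or escaped characters.

```python
import json


def has_leading_wildcard(query: str) -> bool:
    """Hypothetical helper: True if any term starts with * or ?.

    A lone "*" (as in "*:*" or "field:*") is not counted, since it is
    not a pattern with a leading wildcard followed by more characters.
    """
    for token in query.split():
        # Drop an optional "field:" prefix and leading operators/parentheses.
        term = token.rsplit(":", 1)[-1].lstrip("(+-")
        if len(term) > 1 and term[0] in "*?":
            return True
    return False


def build_query_string_body(query: str) -> dict:
    """Build a search body that asks ES to refuse leading wildcards."""
    return {
        "query": {
            "query_string": {
                "query": query,
                "allow_leading_wildcard": False,
            }
        }
    }


if __name__ == "__main__":
    for q in ["name:john*", "name:*son", "*:*"]:
        print(q, "->", has_leading_wildcard(q))
    print(json.dumps(build_query_string_body("name:john*"), indent=2))
```

The client-side check only saves a round trip. The allow_leading_wildcard option is the part enforced by the server.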

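You can also check the gc.log.xx files in the log directory (/var/log/elasticsearch by default). They may give you more insight. It is also worth looking at the slow query log.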
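
The search slow log only records queries once thresholds are set on the index. The sketch below builds a settings body that could be sent with PUT /<index>/_settings. The setting names are the standard index.search.slowlog.threshold.* keys, while the threshold values and the function name are assumptions chosen for illustration and should be tuned to your own workload.

```python
import json


def slowlog_settings(query_warn: str = "10s",
                     query_info: str = "5s",
                     fetch_warn: str = "1s") -> dict:
    """Hypothetical helper: slow log thresholds for one index.

    The default values here are placeholders, not recommendations.
    """
    return {
        "index.search.slowlog.threshold.query.warn": query_warn,
        "index.search.slowlog.threshold.query.info": query_info,
        "index.search.slowlog.threshold.fetch.warn": fetch_warn,
    }


if __name__ == "__main__":
    # Print the JSON body to send to the index settings endpoint.
    print(json.dumps(slowlog_settings(), indent=2))
```

With thresholds in place, slow wildcard queries should show up in the index search slow log, which makes it easier to see which query strings were running shortly before a crash.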

Also check this article: https://www.elastic.co/blog/found-crash-elasticsearch

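It is possible that you were already running low on resources for some other reason, and the wildcard search consumed the remaining memory and crashed your server.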
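
When triaging logs like the one in the question, it can help to pull out the frames under the OutOfMemoryError and see which query type was being built when the allocation failed. The following is a rough sketch with a hypothetical helper name. The sample is a trimmed excerpt of the log above, with the jar details removed. Note that this only shows where the heap ran out, not what filled it.

```python
import re
from typing import List

# Matches "    at some.package.Class.method(" at the start of a line.
FRAME = re.compile(r"^\s*at\s+([\w.$<>]+)\(")

# Trimmed excerpt of the log in the question (jar details removed).
SAMPLE = """
[2019-04-22T11:05:37,096][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node_1] fatal error in thread [elasticsearch[node_1][search][T#2]], exiting
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.ArrayUtil.growExact(ArrayUtil.java:302)
        at org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:230)
        at org.apache.lucene.search.WildcardQuery.<init>(WildcardQuery.java:67)
        at org.elasticsearch.index.search.QueryStringQueryParser.getWildcardQuerySingle(QueryStringQueryParser.java:682)
"""


def oom_frames(log_text: str) -> List[str]:
    """Return the stack frames that follow the first OutOfMemoryError line."""
    frames: List[str] = []
    in_oom = False
    for line in log_text.splitlines():
        if not in_oom:
            in_oom = "java.lang.OutOfMemoryError" in line
            continue
        match = FRAME.match(line)
        if match is None:
            break  # first non-frame line ends the stack trace
        frames.append(match.group(1))
    return frames


if __name__ == "__main__":
    frames = oom_frames(SAMPLE)
    print("allocation failed in:", frames[0])
    print("wildcard query on stack:", any("WildcardQuery" in f for f in frames))
```

For the excerpt above, the failing allocation is in ArrayUtil.growExact while a WildcardQuery automaton was being compiled, which is consistent with the wildcard query being the point where the remaining heap was exhausted.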

