java - Getting NotLeaderForPartitionException for a very long time
Problem description
I have a 3-node Kafka cluster. One of the nodes suddenly went down, and I started seeing NotLeaderForPartitionException
in my application logs when sending messages to one of the topics. For some other topics, however, I can still post and consume messages.
The problem persisted until all the Kafka servers were restarted; after the restart everything was fine.
My question is: why was a new leader not elected for those topics, so that the same NotLeaderForPartitionException
kept being thrown, and how can I trigger a new leader election for these topics?
Exception Trace:
2020-04-11 22:05:21,747 ERROR [pool-15-thread-297] [KafkaMessageProducer:92] Message send failed:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:94)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:29)
Solution
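Both Produce and Fetch requests are sent to the leader replica of a partition. NotLeaderForPartitionException
is thrown when a request is sent to a broker that is no longer the leader replica for that partition.
The client maintains a cache of which broker leads each partition. The complete cache management process is shown in a figure in the original answer.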
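The caching idea can be illustrated with a small stand-alone model. This is only a sketch and not Kafka's actual client code: the class LeaderCacheSketch, its methods, and the metadataLookup function (standing in for a metadata request to the cluster) are hypothetical names invented for illustration. It shows a cached leader going stale after a leader change and being replaced either when the entry exceeds its maximum age or when a "not leader" error forces a refresh.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a client-side partition-leader cache (not Kafka's real classes).
class LeaderCacheSketch {
    private final Map<String, Integer> cachedLeader = new HashMap<>();
    private final Map<String, Long> fetchedAtMs = new HashMap<>();
    // Stands in for a metadata request: maps a partition name to its current leader id.
    private final Function<String, Integer> metadataLookup;
    private final long maxAgeMs;

    LeaderCacheSketch(Function<String, Integer> metadataLookup, long maxAgeMs) {
        this.metadataLookup = metadataLookup;
        this.maxAgeMs = maxAgeMs;
    }

    // Returns the cached leader, refreshing first if the entry is missing or too old.
    int leaderFor(String partition, long nowMs) {
        Long fetched = fetchedAtMs.get(partition);
        if (fetched == null || nowMs - fetched >= maxAgeMs) {
            refresh(partition, nowMs);
        }
        return cachedLeader.get(partition);
    }

    // Called when a broker replies that it is not the leader: refresh the entry immediately.
    void onNotLeader(String partition, long nowMs) {
        refresh(partition, nowMs);
    }

    private void refresh(String partition, long nowMs) {
        cachedLeader.put(partition, metadataLookup.apply(partition));
        fetchedAtMs.put(partition, nowMs);
    }
}
```

In this model, a request sent between a leader change and the next refresh goes to the old leader, which is the situation in which the exception from the question is raised.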
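The client refreshes this information periodically. The refresh interval is controlled by metadata.max.age.ms
in the producer configuration; its default value is 300000 ms (5 minutes).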
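A minimal sketch of such a producer configuration follows. Only metadata.max.age.ms comes from the answer; the class name, the helper method, the broker address, the 30-second value, and the retries setting are assumptions chosen for illustration. The configuration keys themselves are standard producer settings, and the code builds a plain java.util.Properties object so that it runs without the Kafka client library on the classpath.

```java
import java.util.Properties;

// Hypothetical helper that assembles producer settings; all values are illustrative.
class ProducerConfigSketch {
    static Properties buildProducerProps(String bootstrapServers, long metadataMaxAgeMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Force a metadata refresh at least this often (the default is 300000 ms).
        props.setProperty("metadata.max.age.ms", Long.toString(metadataMaxAgeMs));
        // Assumption: allow retries so a send can be re-attempted after a metadata refresh.
        props.setProperty("retries", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = buildProducerProps("localhost:9092", 30000L);
        props.stringPropertyNames().stream().sorted()
                .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

With the Kafka client library available, these properties would be passed to the KafkaProducer constructor. A shorter refresh interval makes the client notice a leader change sooner, at the cost of more frequent metadata requests.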
For details, see the Apache Kafka documentation.