Apache Curator: no leader is elected intermittently

Problem description

I am using the Apache Curator Leader Election recipe in my application: https://curator.apache.org/curator-recipes/leader-election.html

ZooKeeper version: 3.5.7, Curator version: 4.0.1

Here is the sequence of steps:

1. Whenever my Tomcat server instance starts, I create a CuratorFramework instance (one per Tomcat server) and start it:

CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
client.start();
if(!client.blockUntilConnected(10, TimeUnit.MINUTES)){
    LOGGER.error("Zookeeper connection could not establish!");
    throw new RuntimeException("Zookeeper connection could not establish");
}

2. Create an LSAdapter instance and start it:

LSAdapter adapter = new LSAdapter(client, <some_metadata>);
adapter.start();

Below is my LSAdapter class:

public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {

    //<Class instance variables defined>
    public LSAdapter(CuratorFramework client, <some_metadata>) {
        leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
        leaderSelector.autoRequeue();
    }

    public void start() throws IOException {
        leaderSelector.start();
    }

    @Override
    public void close() throws IOException {
        leaderSelector.close();
    }

    @Override
    public void takeLeadership(CuratorFramework client) throws Exception {
        final int waitSeconds = (int) (5 * Math.random()) + 1;

        LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
        LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
        while (true) {
            try {
                Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
                //do leader tasks
            } catch (InterruptedException e) {
                LOGGER.error(name + " was interrupted.");
                //cleanup
                Thread.currentThread().interrupt();
            } finally {

            }
        }
    }
}

3. When the server instance shuts down, close the LSAdapter instance (the one in use by the application) and then close the CuratorFramework client that was created:

CloseableUtils.closeQuietly(lsAdapter);
curatorFrameworkClient.close();

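The problem I am facing is that sometimes, when the servers are restarted, no leader gets elected. I check this by following the log messages written in takeLeadership(). I have two Tomcat server instances running the code above, both connected to the same ZooKeeper quorum. Most of the time one of the instances becomes the leader, but when this problem occurs both of them remain followers. Please suggest what I am doing wrong.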

Tags: apache-zookeeper, apache-curator, leader-election

Solution


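As I answered on Curator's Jira, you are swallowing the InterruptedException. When you get an InterruptedException, you must exit your takeLeadership(). In your code sample you only reset the interrupt status and continue looping. Incidentally, this results in an infinite loop of InterruptedExceptions, because every subsequent Thread.sleep() throws immediately while the interrupt flag is set. After calling Thread.currentThread().interrupt(); you should exit the while loop.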
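
To illustrate the fix, here is a minimal sketch of the loop shape using only the Java standard library, so it runs without Curator or ZooKeeper. The names LeadershipLoopSketch, leadUntilInterrupted and cycles are hypothetical and are not part of Curator or of the original LSAdapter. It assumes, as the answer describes, that loss of leadership reaches the thread running takeLeadership() as an interrupt. The try/catch wraps the whole loop, so an InterruptedException ends the loop rather than being caught inside it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LeadershipLoopSketch {

    // Hypothetical counter standing in for "do leader tasks".
    static final AtomicInteger cycles = new AtomicInteger();

    // Stand-in for the body of takeLeadership(): it returns as soon as the
    // calling thread is interrupted, instead of looping forever.
    static void leadUntilInterrupted(long sleepMillis) {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(sleepMillis);
                cycles.incrementAndGet(); // do leader tasks
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag for the caller; the loop has
            // already been left because the try block wraps it.
            Thread.currentThread().interrupt();
        }
        // Returning here corresponds to returning from takeLeadership().
    }

    public static void main(String[] args) throws InterruptedException {
        Thread leader = new Thread(() -> leadUntilInterrupted(50), "leader");
        leader.start();

        Thread.sleep(200);   // let the "leader" work for a moment
        leader.interrupt();  // simulate the interrupt described in the answer
        leader.join(TimeUnit.SECONDS.toMillis(2));

        System.out.println("leader thread alive after interrupt: " + leader.isAlive());
    }
}
```

In the LSAdapter from the question, the same shape would apply inside takeLeadership(): move the try/catch outside the while (true) loop, or add a break or return after Thread.currentThread().interrupt().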

