apache-zookeeper - Is it possible to avoid nested RetryLoop.callWithRetry calls so that I get a consistent timeout?
Question
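I have configured a reasonable timeout using BoundedExponentialBackoffRetry, and it generally works as expected when I make a call such as "create.forPath" while ZK is down. However, if ZK is unavailable when I call acquire on an InterProcessReadWriteLock, it takes far longer before it finally times out.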
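My call to acquire is wrapped in "RetryLoop.callWithRetry", and it goes on to call findProtectedNodeInForeground, which is also wrapped in "RetryLoop.callWithRetry". If I have configured BoundedExponentialBackoffRetry to retry 20 times, the inner loop tries 20 times for each of the 20 iterations of the outer loop, so it retries 400 times in total.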
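To make the multiplication concrete, here is a minimal sketch that does not use Curator at all. The `callWithRetry` helper below is a hypothetical stand-in for a retry loop, not Curator's `RetryLoop`, and it counts total attempts rather than retries after a first try, so Curator's real count may differ by one per level. It only shows that nesting one bounded retry loop inside another multiplies the attempt counts.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class NestedRetryDemo {
    /** Hypothetical operation type: anything that may fail with an exception. */
    interface Operation {
        void run() throws Exception;
    }

    /** Runs op up to maxAttempts times and rethrows the last failure. */
    static void callWithRetry(int maxAttempts, Operation op) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                op.run();
                return; // success, stop retrying
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /** Counts how often the innermost operation runs when it always fails. */
    static int countAttempts(int outerAttempts, int innerAttempts) {
        AtomicInteger attempts = new AtomicInteger();
        try {
            callWithRetry(outerAttempts, () ->
                callWithRetry(innerAttempts, () -> {
                    attempts.incrementAndGet();
                    throw new IllegalStateException("simulated connection loss");
                }));
        } catch (Exception expected) {
            // Both loops are exhausted; only the count matters here.
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println(countAttempts(20, 20)); // prints 400
    }
}
```

With a backoff sleep between attempts, the total wait grows by the same factor, which is why the lock call takes so much longer to give up than the plain create call.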
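We really need a consistent timeout after which we fail. Am I doing something wrong here? If not, I suppose I will call the troublesome methods in a new thread that I can kill after my own timeout.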
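The workaround I have in mind would look roughly like the sketch below. It is only an illustration: `DeadlineCall` and `callWithDeadline` are names I made up, nothing here is Curator API, and cancelling only interrupts the worker thread, so it helps only if the blocked call responds to interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class DeadlineCall {
    /**
     * Runs task on a separate daemon thread and waits at most the given time.
     * Throws TimeoutException if the task has not finished by then.
     */
    static <T> T callWithDeadline(Callable<T> task, long timeout, TimeUnit unit)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "deadline-call");
            worker.setDaemon(true); // do not keep the JVM alive for a stuck call
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; it may still ignore this
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so callers see the task's own exception where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

One caveat I am aware of: as far as I understand, Curator's locks are owned by the thread that acquired them, so the acquire, the protected work, and the release would all have to run inside the same task rather than acquiring on the worker thread and releasing on the caller.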
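Here is sample code that reproduces the problem. I set a breakpoint on the line after each comment, shut down ZK, then let the program continue and grab a stack trace while it is retrying.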
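import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessReadWriteLock;
import org.apache.curator.retry.BoundedExponentialBackoffRetry;

public class GoCurator {
    public static void main(String[] args) throws Exception {
        // Base sleep 200 ms, max sleep 10000 ms, at most 20 retries
        CuratorFramework cf = CuratorFrameworkFactory.newClient(
            "localhost:2181",
            new BoundedExponentialBackoffRetry(200, 10000, 20)
        );
        cf.start();
        String root = "/myRoot";
        if (cf.checkExists().forPath(root) == null) {
            // Stacktrace A showing what happens if ZK is down for this call
            cf.create().forPath(root);
        }
        InterProcessReadWriteLock lock = new InterProcessReadWriteLock(cf, "/grant/myLock");
        // See stacktrace B showing the nested re-try if ZK is down for this call
        lock.readLock().acquire();
        lock.readLock().release();
        System.out.println("done");
    }
}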
Stacktrace A (if ZK is down when I call create().forPath). This shows a single retry loop, so it exits after the correct number of attempts:
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1499)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1487)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2617)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:242)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:231)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:228)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:219)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:41)
at com.gebatech.curator.GoCurator.main(GoCurator.java:25)
Stacktrace B (if ZK is down when I call InterProcessReadWriteLock#readLock#acquire). This shows the nested retry loops, so it does not exit until 20*20 attempts have been made:
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:434)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:56)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.CreateBuilderImpl.findProtectedNodeInForeground(CreateBuilderImpl.java:1239)
at org.apache.curator.framework.imps.CreateBuilderImpl.access$1700(CreateBuilderImpl.java:51)
at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1167)
at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:575)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
at org.apache.curator.framework.recipes.locks.StandardLockInternalsDriver.createsTheLock(StandardLockInternalsDriver.java:54)
at org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:225)
at org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237)
at org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:89)
at com.gebatech.curator.GoCurator.main(GoCurator.java:29)
Answer
It turns out this is a real, long-standing problem with how Curator uses retries. I have a fix and a PR ready here: https://github.com/apache/curator/pull/346. I would appreciate more eyes on it.