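java - Keycloak: Infinispan cluster sync timeout for persistent caches when rebuilding from embedded to remote Infinispan
Problem description
I am switching Keycloak (v3.4.0.Final) from the embedded Infinispan to a dedicated remote Infinispan cluster (v8.2.8.Final). I completed the upgrade process in the lower environments, using the Infinispan cluster as a remote store, without problems. In my production setup, I get a timeout exception in InfinispanCacheInitializer: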
ERROR [org.keycloak.models.sessions.infinispan.initializer.InfinispanCacheInitializer] (ServerService Thread Pool -- 54) ExecutionException when computed future. Errors: 13: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
at org.infinispan.distexec.DefaultExecutorService$DistributedTaskPart.get(DefaultExecutorService.java:850)
at org.keycloak.models.sessions.infinispan.initializer.InfinispanCacheInitializer.startLoading(InfinispanCacheInitializer.java:102)
at org.keycloak.models.sessions.infinispan.initializer.DBLockBasedCacheInitializer.startLoading(DBLockBasedCacheInitializer.java:75)
at org.keycloak.models.sessions.infinispan.initializer.CacheInitializer.loadSessions(CacheInitializer.java:41)
at org.keycloak.models.sessions.infinispan.InfinispanUserSessionProviderFactory$2.run(InfinispanUserSessionProviderFactory.java:150)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:227)
at org.keycloak.models.sessions.infinispan.InfinispanUserSessionProviderFactory.loadPersistentSessions(InfinispanUserSessionProviderFactory.java:137)
at org.keycloak.models.sessions.infinispan.InfinispanUserSessionProviderFactory$1.onEvent(InfinispanUserSessionProviderFactory.java:108)
at org.keycloak.services.DefaultKeycloakSessionFactory.publish(DefaultKeycloakSessionFactory.java:68)
at org.keycloak.services.resources.KeycloakApplication$2.run(KeycloakApplication.java:165)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:227)
at org.keycloak.services.resources.KeycloakApplication.<init>(KeycloakApplication.java:158)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.jboss.resteasy.core.ConstructorInjectorImpl.construct(ConstructorInjectorImpl.java:150)
at org.jboss.resteasy.spi.ResteasyProviderFactory.createProviderInstance(ResteasyProviderFactory.java:2298)
at org.jboss.resteasy.spi.ResteasyDeployment.createApplication(ResteasyDeployment.java:340)
at org.jboss.resteasy.spi.ResteasyDeployment.start(ResteasyDeployment.java:253)
at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.init(ServletContainerDispatcher.java:120)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.init(HttpServletDispatcher.java:36)
at io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:117)
at org.wildfly.extension.undertow.security.RunAsLifecycleInterceptor.init(RunAsLifecycleInterceptor.java:78)
at io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:103)
at io.undertow.servlet.core.ManagedServlet$DefaultInstanceStrategy.start(ManagedServlet.java:250)
at io.undertow.servlet.core.ManagedServlet.createServlet(ManagedServlet.java:133)
at io.undertow.servlet.core.DeploymentManagerImpl$2.call(DeploymentManagerImpl.java:565)
at io.undertow.servlet.core.DeploymentManagerImpl$2.call(DeploymentManagerImpl.java:536)
at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:42)
at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1508)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1508)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1508)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1508)
at io.undertow.servlet.core.DeploymentManagerImpl.start(DeploymentManagerImpl.java:578)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentService.startContext(UndertowDeploymentService.java:100)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentService$1.run(UndertowDeploymentService.java:81)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:205)
at org.infinispan.commons.util.concurrent.NotifyingFutureImpl.get(NotifyingFutureImpl.java:88)
at org.infinispan.distexec.DefaultExecutorService$LocalDistributedTaskPart.getResult(DefaultExecutorService.java:1083)
at org.infinispan.distexec.DefaultExecutorService$DistributedTaskPart.innerGet(DefaultExecutorService.java:868)
at org.infinispan.distexec.DefaultExecutorService$DistributedTaskPart.get(DefaultExecutorService.java:848)
... 44 more
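The `Caused by` frames show the `TimeoutException` being raised from `FutureTask.get` and then wrapped in an `ExecutionException` by the Infinispan distributed executor. That is the usual behaviour of a timed `Future.get`: the wait expires before the task finishes. The following is a minimal, self-contained sketch of that mechanism using only `java.util.concurrent`; the class name `TimedGetDemo`, the method `runWithTimeout`, and the sleep durations are hypothetical and are not taken from Keycloak or Infinispan.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetDemo {

    /**
     * Submits a task that sleeps for taskMillis and waits at most waitMillis
     * for its result. Returns "loaded", "timeout", or a failure description.
     */
    static String runWithTimeout(long taskMillis, long waitMillis) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(() -> {
                Thread.sleep(taskMillis); // stands in for a slow unit of work (illustrative only)
                return "loaded";
            });
            try {
                // Timed get: throws TimeoutException if the result is not ready in time.
                return future.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the task that is still running
                return "timeout";
            } catch (ExecutionException e) {
                return "failed: " + e.getCause();
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWithTimeout(50, 2000)); // task finishes before the wait expires
        System.out.println(runWithTimeout(2000, 50)); // wait expires before the task finishes
    }
}
```

The point of the sketch is only that the exception reports an expired wait on the caller's side; it does not say whether the underlying work was slow, blocked, or still making progress.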
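Overview
- Keycloak version: 3.4.0.Final (I know this is an older version; it is used with a custom implementation and is not easy to upgrade)
- Startup script:
ExecStart={{ keycloak_jboss_home }}/bin/standalone.sh -b
using standalone.xml
- Infinispan version: 8.2.8.Final
- Switched from embedded local caches to a remote-store configuration for the following caches:
  - users (distributed)
  - sessions (replicated)
  - authenticationSessions (replicated)
  - offlineSessions (replicated)
  - loginFailures (replicated)
  - authorization (replicated)
- Number of offline user sessions: ~3 million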
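To test a cache sync of this size from the database, I updated the timeout settings in the standalone.xml and standalone.conf files.
keycloak/keycloak-3.4.0.Final/standalone/configuration/standalone.xml: raised the coordinator timeout to 3 hours and commented out the query timeout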
<subsystem xmlns="urn:jboss:domain:transactions:4.0">
<core-environment>
<process-id>
<uuid/>
</process-id>
</core-environment>
<recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/>
<object-store path="tx-object-store" relative-to="jboss.server.data.dir"/>
<coordinator-environment default-timeout="10800"/>
</subsystem>
....
....
..
<!-- <timeout>
<query-timeout>15</query-timeout>
</timeout>-->
/keycloak/keycloak-3.4.0.Final/bin/standalone.conf: added the blocking timeout to JAVA_OPTS
JAVA_OPTS="$JAVA_OPTS -Djboss.modules.system.pkgs=$JBOSS_MODULES_SYSTEM_PKGS -Djava.awt.headless=true -Djboss.as.management.blocking.timeout=10800"
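I want to point out that, after the first few attempts to get this working, when I reverted to the embedded cache on the Keycloak nodes, the data synced fine in about 1.5 hours without any timeout errors.
After starting Keycloak, it takes about 60 minutes before it begins syncing the offline user sessions. Looking at the queries Keycloak is running, I can see that the timeout error occurs about 5-10 minutes after it starts syncing the offline_user_session records into the offlineSessions cache.
The queries running before the timeout are: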
delete from OFFLINE_CLIENT_SESSION where not (exists (select persistent1_.USER_SESSION_ID from OFFLINE_USER_SESSION persistent1_ where persistent1_.USER_SESSION_ID=OFFLINE_CLIENT_SESSION.USER_SESSION_ID))
update OFFLINE_USER_SESSION set LAST_SESSION_REFRESH=$1
DELETE FROM JGROUPSPING WHERE own_addr=$1 AND cluster_name=$2
select count(persistent0_.OFFLINE_FLAG) as col_0_0_ from OFFLINE_USER_SESSION persistent0_ where persistent0_.OFFLINE_FLAG=$1
select userrolema0_.ROLE_ID as col_0_0_ from USER_ROLE_MAPPING userrolema0_ where userrolema0_.USER_ID=$1
select userentity0_.ID as ID1_76_, userentity0_.CREATED_TIMESTAMP as CREATED_2_76_, userentity0_.EMAIL as EMAIL3_76_, userentity0_.EMAIL_CONSTRAINT as EMAIL_CO4_76_, userentity0_.EMAIL_VERIFIED as EMAIL_VE5_76_, userentity0_.ENABLED as ENABLED6_76_, userentity0_.FEDERATION_LINK as FEDERATI7_76_, userentity0_.FIRST_NAME as FIRST_NA8_76_, userentity0_.LAST_NAME as LAST_NAM9_76_, userentity0_.NOT_BEFORE as NOT_BEF10_76_, userentity0_.REALM_ID as REALM_I11_76_, userentity0_.SERVICE_ACCOUNT_CLIENT_LI
select attributes0_.USER_ID as USER_ID4_72_0_, attributes0_.ID as ID1_72_0_, attributes0_.ID as ID1_72_1_, attributes0_.NAME as NAME2_72_1_, attributes0_.USER_ID as USER_ID4_72_1_, attributes0_.VALUE as VALUE3_72_1_ from USER_ATTRIBUTE attributes0_ where attributes0_.USER_ID=$1
select persistent0_.OFFLINE_FLAG as OFFLINE_1_47_, persistent0_.USER_SESSION_ID as USER_SES2_47_, persistent0_.DATA as DATA3_47_, persistent0_.LAST_SESSION_REFRESH as LAST_SES4_47_, persistent0_.REALM_ID as REALM_ID5_47_, persistent0_.USER_ID as USER_ID6_47_ from OFFLINE_USER_SESSION persistent0_ where persistent0_.OFFLINE_FLAG=$1 order by persistent0_.USER_SESSION_ID limit $2 offset $3
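The last query pages through OFFLINE_USER_SESSION with `order by ... limit $2 offset $3`. As a rough way to reason about that access pattern, the sketch below computes how many pages a table of a given size needs and how many rows would be skipped in total if the database had to walk past every offset row on each page. That worst case is an assumption (the real cost depends on the query plan and indexes), and the page size of 100 is a hypothetical value, not one confirmed by the question. The class and method names are illustrative only.

```java
public class OffsetPaging {

    /** Number of pages needed to read totalRows with the given page size. */
    static long pageCount(long totalRows, long pageSize) {
        return (totalRows + pageSize - 1) / pageSize; // ceiling division
    }

    /** Offset value for a zero-based page index. */
    static long offsetFor(long pageIndex, long pageSize) {
        return pageIndex * pageSize;
    }

    /**
     * Sum of the offsets of all pages: the number of rows skipped in total
     * if every page query has to walk past its offset rows (assumed worst case).
     */
    static long totalRowsSkipped(long totalRows, long pageSize) {
        long pages = pageCount(totalRows, pageSize);
        return pageSize * pages * (pages - 1) / 2; // pageSize * (0 + 1 + ... + (pages - 1))
    }

    public static void main(String[] args) {
        long totalRows = 3_000_000L; // approximate offline user session count from the question
        long pageSize = 100L;        // hypothetical page size, not taken from the question
        System.out.println("pages: " + pageCount(totalRows, pageSize));
        System.out.println("offset of last page: " + offsetFor(pageCount(totalRows, pageSize) - 1, pageSize));
        System.out.println("rows skipped in total: " + totalRowsSkipped(totalRows, pageSize));
    }
}
```

Under that assumption the skipped-row total grows with the square of the page count, which is one reason later pages of an offset-based scan can be much slower than early ones.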
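I set up the Infinispan web console UI so I could watch the progress of the cache sync. On every attempt it holds 15k entries (out of roughly 3 million).
I am not sure what the problem is here, since syncing the offline sessions from the database works fine with the embedded version, but with the remote Infinispan setup something seems to be wrong: either the batching of the queries, or a configuration I am missing on the Infinispan side of Keycloak.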
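For scale, here is a back-of-the-envelope extrapolation of the numbers quoted so far. It assumes, and this is only an assumption, that the roughly 15k entries seen in the console were loaded during the 5-10 minutes between the start of the offline session sync and the timeout, and that the rate would stay constant. The class `LoadEstimate` and its method are hypothetical helpers, not part of Keycloak.

```java
import java.time.Duration;

public class LoadEstimate {

    /**
     * Linearly extrapolates how long loading totalEntries would take,
     * given that entriesLoaded were loaded within the elapsed time.
     */
    static Duration estimateFullLoad(long totalEntries, long entriesLoaded, Duration elapsed) {
        if (entriesLoaded <= 0) {
            throw new IllegalArgumentException("entriesLoaded must be positive");
        }
        return elapsed.multipliedBy(totalEntries).dividedBy(entriesLoaded);
    }

    public static void main(String[] args) {
        long total = 3_000_000L; // approximate offline user session count from the question
        long loaded = 15_000L;   // entries seen in the console before the timeout
        // Assumed window: the 5-10 minutes between sync start and the timeout.
        System.out.println("fast case: " + estimateFullLoad(total, loaded, Duration.ofMinutes(5)));
        System.out.println("slow case: " + estimateFullLoad(total, loaded, Duration.ofMinutes(10)));
    }
}
```

With those inputs the estimate comes out between about 16 h 40 min and 33 h 20 min, against the roughly 1.5 hours reported for the embedded cache. If the assumption holds, the remote-store load would be running far slower per entry, in addition to hitting the timeout.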
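Update - further testing
Set up a test environment with a database snapshot containing 3.5 million OUS/OCS (offline user session / offline client session) rows. The RDS instance was provisioned with 5500 IOPS. After upgrading to Keycloak version 5.0 the timeout still occurred, but running VACUUM ANALYZE on the whole database resolved it, and we were able to bring up the remote Infinispan successfully. However, after that successful run, we hit the same timeout in our live environment, and VACUUM ANALYZE did not resolve it there.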