Jenkins JNLP Linux agent - Ping timeout

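Problem description

I have been facing this issue for the past 8 months (since December 2019) and decided to post it here. I am connecting Jenkins agents over inbound TCP (JNLP) across Azure VNets and subscriptions.

Agents in the same VNet/subscription as the master connect without any problems, and no ping timeouts occur. However, Jenkins agents residing in other VNets and subscriptions frequently get disconnected because of a Ping Timeout. These agents are in an AKS cluster, are built on the openjdk:8-jdk-alpine image, and run as pods managed by a Deployment. We use port 50000 as the static port for all JNLP connections.

Logs from an agent in another VNet: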

Jul 25, 2020 12:03:33 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:03:33 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/workspace/remoting
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jmeter
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Jul 25, 2020 12:03:34 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.33
Jul 25, 2020 12:03:34 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkins.internaldomain.com/]
Jul 25, 2020 12:03:34 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins.internaldomain.com
  Agent port:    50000
  Identity:      70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins.internaldomain.com:50000
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:03:35 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Jul 25, 2020 12:27:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595679817121 hasn't completed by 1595680057121
    at hudson.remoting.PingThread.ping(PingThread.java:134)
    at hudson.remoting.PingThread.run(PingThread.java:90)

Jul 25, 2020 12:32:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595680117120 hasn't completed by 1595680357121
    at hudson.remoting.PingThread.ping(PingThread.java:134)
    at hudson.remoting.PingThread.run(PingThread.java:90)

Jul 25, 2020 12:37:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595680417120 hasn't completed by 1595680657122
    at hudson.remoting.PingThread.ping(PingThread.java:134)
    at hudson.remoting.PingThread.run(PingThread.java:90)

Jul 25, 2020 12:39:20 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Jul 25, 2020 12:39:30 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Performing onReconnect operation.
Jul 25, 2020 12:39:30 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@3ac9ecc9
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/workspace/remoting
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jmeter
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Jul 25, 2020 12:39:32 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.33
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkins.internaldomain.com/]
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins.internaldomain.com
  Agent port:    50000
  Identity:      70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins.internaldomain.com:50000
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jul 25, 2020 12:39:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:39:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected

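ICMP ping (echo) is disabled by organization policy, and a blog post states that the Remoting ping is different from ICMP ping, so disabling ICMP should not be a concern (or so I believe).

  1. We tried disabling ping on both the Jenkins master and the agents, but that did not help.

  2. Initially we ran the master on port 80 and it was being blocked, so we suspected the problem was related to the non-secure port. We configured the master with SSL on port 443, but the issue persists.

  3. We found that the Azure idle timeout is 4 minutes while the default Jenkins ping interval is 5 minutes (300 seconds), so I tried setting these configuration properties:

    • hudson.slaves.ChannelPinger.pingIntervalSeconds

      Default: 300, custom: 108

      Description: frequency of pings between the master and the agent, in seconds

    • hudson.slaves.ChannelPinger.pingTimeoutSeconds

      Default: 240, custom: unchanged

      Description: timeout for each ping between the master and the agent, in seconds

    Still no luck.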
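
As a sanity check on these settings, the epoch-millisecond timestamps in the three TimeoutException lines of the agent log can be compared against the documented defaults. The sketch below only does arithmetic on values copied from the log; the helper names are made up for illustration. If the computed values match the defaults (300 and 240) rather than the custom 108-second interval, one possible reading is that the custom interval was not in effect for the process that wrote this log excerpt, although the excerpt may simply predate the change.

```python
# Epoch-millisecond (start, deadline) pairs copied from the three
# "Ping started at ... hasn't completed by ..." lines in the agent log.
pings = [
    (1595679817121, 1595680057121),
    (1595680117120, 1595680357121),
    (1595680417120, 1595680657122),
]


def timeout_seconds(started_ms, deadline_ms):
    """Time allowed for one ping, rounded to whole seconds (hypothetical helper)."""
    return round((deadline_ms - started_ms) / 1000)


def interval_seconds(earlier_start_ms, later_start_ms):
    """Gap between two consecutive ping starts, rounded to whole seconds."""
    return round((later_start_ms - earlier_start_ms) / 1000)


timeouts = [timeout_seconds(start, deadline) for start, deadline in pings]
intervals = [interval_seconds(a[0], b[0]) for a, b in zip(pings, pings[1:])]

print("observed ping timeouts (s): ", timeouts)
print("observed ping intervals (s):", intervals)
```

Both observed values can then be compared with the 4-minute Azure idle timeout mentioned above: an interval longer than 240 seconds leaves room for an idle connection to be dropped between pings.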

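Has anyone faced an issue like this before? All I could find were similar problems occurring on Windows agents.

The Jenkins agent name, identity, master DNS name, and IPs have been changed to look generic.

Tags: jenkins, jnlp, azure-aks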

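Solution


The Jenkins master and slave can maintain a connection within the same VNet but not across VNets, which suggests that a specific port may be blocked between the VNets. You will need to allow network traffic on the ping port across the VNets using a Network Security Group (NSG). You can learn more from the Azure NSG documentation.

Once that is enabled, you can create a VM in the Jenkins master's VNet and try to connect to the Jenkins slave ping port (using telnet or a similar tool). If you are able to connect, the NSG is not blocking the traffic; otherwise, the NSG is blocking it.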
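
If telnet is not available on the test VM, a plain TCP connect can stand in for it. The following is a minimal sketch, assuming Python 3 is installed on the VM; the function name `can_connect` and the 5-second default timeout are illustrative choices, not part of Jenkins or Azure tooling. Note that a successful connect only shows that a new connection is allowed through; it does not show whether an idle connection stays open.

```python
import socket


def can_connect(host, port, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper that mimics a telnet-style reachability check.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("jenkins.internaldomain.com", 50000)` would probe the static JNLP port from the question; substitute whichever host and port you need to verify.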

