Trouble adding a Rancher host and keeping it active

Problem description

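I have a few virtual machines running, and I want a (simplified) setup with one host for Rancher (10.100.10.1) and one host for running containers (10.100.10.4). I installed Rancher server 1.6.25 on the management machine, and Docker CE 18.06.1~ce~3-0~ubuntu on both machines. Both run Ubuntu 18.04 LTS.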

On the management machine I run nginx with the configuration at https://pastebin.com/KgCxQdfH, so it forwards port 80 traffic to port 8080. Rancher is started with:

sudo docker run -d -v <host_vol>:/var/lib/mysql --restart=unless-stopped -p 8080:8080 rancher/server

I also ran sudo ufw allow 500/udp and sudo ufw allow 4500/udp on both machines. In addition, I had to follow https://docs.docker.com/install/linux/linux-postinstall/#specify-dns-servers-for-docker, because I got errors without it.
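
For reference, the two firewall rules above can be written as a small script that is run identically on both machines. This is only a sketch: the DRY_RUN switch and the function names are assumptions made for illustration, and by default the script prints the commands without changing anything.

```shell
#!/bin/sh
# Sketch: the UDP rules mentioned in the post, printed or applied.
# Assumption: DRY_RUN (hypothetical switch) defaults to 1, so nothing is
# changed unless the script is started with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Emit one ufw command per UDP port named in the post (500 and 4500).
ipsec_rules() {
    for port in 500 4500; do
        echo "ufw allow ${port}/udp"
    done
}

# Print each rule, or run it with sudo when DRY_RUN=0.
apply_rules() {
    ipsec_rules | while read -r cmd; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: sudo $cmd"
        else
            # Unquoted on purpose so the command splits into words.
            sudo $cmd
        fi
    done
}

apply_rules
```

Running it with DRY_RUN=0 applies the rules; sudo ufw status afterwards should list both ports.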

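The problem is that when I try to add the host, I cannot register it, and even after it connects successfully, Rancher struggles to keep the connection alive. When I register the agent, it first logs this: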

time="2018-12-17T13:23:28Z" level=info msg="Host not registered yet. Sleeping 1 second and trying again. reportedUuid=a0ca6f30-a804-4227-5532-8c2692673e56 Attempt=12"
time="2018-12-17T13:23:29Z" level=info msg="Host not registered yet. Sleeping 1 second and trying again. reportedUuid=a0ca6f30-a804-4227-5532-8c2692673e56 Attempt=13"
time="2018-12-17T13:23:30Z" level=info msg="Host not registered yet. Sleeping 1 second and trying again. reportedUuid=a0ca6f30-a804-4227-5532-8c2692673e56 Attempt=14"
…
time="2018-12-17T12:28:57Z" level=error msg="Failed to get connection token for host-api startup: Reached max retry attempts for getting token"

Then, after a while, it connects:

time="2018-12-17T13:23:31Z" level=info msg="Connecting to proxy." url="ws://10.100.10.1/v1/connectbackend?token=token"

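This takes longer than I am used to, and a few times it failed completely, meaning I started getting 401 responses from 10.100.10.1 (possibly an expired token?). But even after I manage to connect, the host stays in the Disconnected => Reconnecting state in the UI. The Rancher server log then shows the following: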

2018-12-17 13:24:06,050 ERROR [3a6531c0-b638-4494-bcad-2ee79553901e:3725] [instance:111] [instance.start->(InstanceStart)] [] [ecutorService-4] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance is not     running] for instance [111]
2018-12-17 13:24:07,047 ERROR [7c3e0b91-7037-4df2-96bd-634aba7eca39:3732] [instance:112] [instance.start->(InstanceStart)] [] [ecutorService-3] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance is not     running] for instance [112]
2018-12-17 13:24:07,048 ERROR [c995c17c-6e33-4308-b0e3-f4ded72ca0dc:3736] [instance:113] [instance.start->(InstanceStart)] [] [ecutorService-5] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance is not     running] for instance [113]
2018-12-17 13:24:11,644 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [3]
2018-12-17 13:24:16,645 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [4]
2018-12-17 13:24:21,645 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [5]
2018-12-17 13:24:26,646 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [6]
2018-12-17 13:24:26,648 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for agent [43] host [8] count [6]
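
To see how often this happens per agent, the PingMonitorImpl lines can be summarized with a short filter. This is a minimal sketch: the function name summarize_ping_failures is made up for illustration, and the sample input is just the last three lines of the log above.

```shell
#!/bin/sh
# Sketch: report the highest missed-ping count per agent from Rancher
# server log lines read on stdin. The function name is hypothetical.
summarize_ping_failures() {
    # Keep only "Failed to get ping" lines, reduced to "<agent> <count>".
    sed -n 's/.*Failed to get ping from agent \[\([0-9]*\)\] count \[\([0-9]*\)\].*/\1 \2/p' |
    # Track the largest count seen for each agent id.
    awk '{ if ($2 > max[$1]) max[$1] = $2 }
         END { for (a in max) print "agent " a ": highest missed-ping count " max[a] }'
}

# Sample input copied from the log excerpt above.
summarize_ping_failures <<'EOF'
2018-12-17 13:24:21,645 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [5]
2018-12-17 13:24:26,646 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [43] count [6]
2018-12-17 13:24:26,648 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for agent [43] host [8] count [6]
EOF
```

For the sample this prints "agent 43: highest missed-ping count 6". On the management machine the same function could be fed from sudo docker logs <container> 2>&1, where <container> stands for whatever the Rancher server container is called.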

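So the virtual machine I added keeps disconnecting. This is a simplified account of the problem, and I can provide more information if needed. What exactly is wrong in a setup where I observe the following: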

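A) The host cannot be registered with Rancher. Registration can fail for so long that the agent keeps reporting 401: Failed to get rancher client for host-api startup: Bad response statusCode [401]. Status [401 Unauthorized]. Body: [code=Unauthorized, baseType=error, message=Unauthorized] from [http://10.100.10.1/v1]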

B) Once the host is registered, it cannot be kept active: it stays in the Disconnected/Reconnecting state, with Active showing up only occasionally.

C) When I ping or curl between the hosts, traffic does seem to get through.
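
One caveat about point C: ping and a plain curl only exercise ICMP and ordinary HTTP, while the agent log above shows a ws:// URL on port 80, so that connection passes through nginx as a WebSocket upgrade. The sketch below sends WebSocket handshake headers with curl, once through nginx and once directly to port 8080, so the two results can be compared. It rests on assumptions: the function names are hypothetical, port 8080 must be reachable from the container host, and because the request carries no token a 401 may well come back on both paths. The probe only runs when RUN_PROBE=1 is set.

```shell
#!/bin/sh
# Sketch: compare a WebSocket handshake attempt through nginx (port 80)
# with one sent straight to Rancher (port 8080). Names are hypothetical.

# Translate the HTTP status of the handshake attempt into a short verdict.
classify_ws_status() {
    case "$1" in
        101) echo "upgrade accepted" ;;
        401) echo "reached the server but unauthorized" ;;
        000) echo "no HTTP response" ;;
        *)   echo "no upgrade (HTTP $1)" ;;
    esac
}

# Send handshake headers to the URL in $1 and print only the status code.
ws_probe() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        -H 'Connection: Upgrade' \
        -H 'Upgrade: websocket' \
        -H 'Sec-WebSocket-Version: 13' \
        -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
        "$1"
}

# Assumption: RUN_PROBE is a made-up switch so that nothing touches the
# network unless explicitly requested.
if [ "${RUN_PROBE:-0}" = "1" ]; then
    for url in http://10.100.10.1/v1/connectbackend \
               http://10.100.10.1:8080/v1/connectbackend; do
        echo "$url: $(classify_ws_status "$(ws_probe "$url")")"
    done
fi
```

If the two URLs give different verdicts, the nginx configuration is the first place to look; if they agree, the proxy is probably not what separates the two paths.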

Tags: docker, rancher

Solution

