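kubernetes - Kubernetes master stuck in NotReady after a failed upgrade
Problem description
I have a K8S cluster running version 1.13.2, and I want to upgrade it to 1.17.x (the latest 1.17).
I checked the official instructions at https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/, which say I need to upgrade one minor version at a time: 1.14 first, then 1.15, 1.16, and only then 1.17.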
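The one-minor-at-a-time rule can be sketched as a small shell helper. This is only an illustration: upgrade_path is a hypothetical name, not a kubeadm command, and it assumes both versions share the same major number.

```shell
# upgrade_path FROM TO: print each minor release to pass through, one per line.
# Hypothetical helper; accepts "1.13" or "1.13.2" style versions.
upgrade_path() {
  up_major=${1%%.*}                          # "1.13.2" -> "1"
  up_from=${1#*.}; up_from=${up_from%%.*}    # "1.13.2" -> "13"
  up_to=${2#*.};   up_to=${up_to%%.*}        # "1.17"   -> "17"
  up_next=$((up_from + 1))
  while [ "$up_next" -le "$up_to" ]; do
    echo "${up_major}.${up_next}"
    up_next=$((up_next + 1))
  done
}

# Prints 1.14, 1.15, 1.16 and 1.17, each on its own line.
upgrade_path 1.13.2 1.17
```

Each printed version is one full kubeadm upgrade cycle (packages, plan, apply) before moving on to the next.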
I made all the preparations (disabled swap), ran everything according to the docs, and determined that the latest 1.14 release is 1.14.10.
When I ran:
apt-mark unhold kubeadm kubelet && \
apt-get update && apt-get install -y kubeadm=1.14.10-00 && \
apt-mark hold kubeadm
For some reason, it seems that kubectl v1.18 was downloaded as well.
I went ahead and tried to run sudo kubeadm upgrade plan, but it failed with the following error:
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/health] FATAL: [preflight] Some fatal errors occurred:
[ERROR ControlPlaneNodesReady]: there are NotReady control-planes in the cluster: [<name of master>]
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
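Running kubectl get nodes shows that the master is indeed NotReady with VERSION v1.18.0, while the workers are of course still Ready at v1.13.2 (unchanged).
How can I fix my cluster?
What did I do wrong when I tried to upgrade?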
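Solution
I reproduced your problem in my lab. What happened is that you accidentally upgraded more than you intended. More specifically, you upgraded the kubelet package on the master node (control plane).
So here is my healthy cluster on version 1.13.2: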
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubeadm-lab-0 Ready master 9m25s v1.13.2
kubeadm-lab-1 Ready <none> 6m17s v1.13.2
kubeadm-lab-2 Ready <none> 6m9s v1.13.2
Now I will unhold kubeadm and kubelet, just as you did:
$ sudo apt-mark unhold kubeadm kubelet
Canceled hold on kubeadm.
Canceled hold on kubelet.
Finally, I will upgrade kubeadm to 1.14.10:
$ sudo apt-get install kubeadm=1.14.10-00
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
conntrack kubelet kubernetes-cni
The following NEW packages will be installed:
conntrack
The following packages will be upgraded:
kubeadm kubelet kubernetes-cni
3 upgraded, 1 newly installed, 0 to remove and 8 not upgraded.
Need to get 34.1 MB of archives.
After this operation, 7,766 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:2 http://deb.debian.org/debian stretch/main amd64 conntrack amd64 1:1.4.4+snapshot20161117-5 [32.9 kB]
Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.18.0-00 [19.4 MB]
Get:3 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubeadm amd64 1.14.10-00 [8,155 kB]
Get:4 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubernetes-cni amd64 0.7.5-00 [6,473 kB]
Fetched 34.1 MB in 2s (13.6 MB/s)
Selecting previously unselected package conntrack.
(Reading database ... 97656 files and directories currently installed.)
Preparing to unpack .../conntrack_1%3a1.4.4+snapshot20161117-5_amd64.deb ...
Unpacking conntrack (1:1.4.4+snapshot20161117-5) ...
Preparing to unpack .../kubelet_1.18.0-00_amd64.deb ...
Unpacking kubelet (1.18.0-00) over (1.13.2-00) ...
Preparing to unpack .../kubeadm_1.14.10-00_amd64.deb ...
Unpacking kubeadm (1.14.10-00) over (1.13.2-00) ...
Preparing to unpack .../kubernetes-cni_0.7.5-00_amd64.deb ...
Unpacking kubernetes-cni (0.7.5-00) over (0.6.0-00) ...
Setting up conntrack (1:1.4.4+snapshot20161117-5) ...
Setting up kubernetes-cni (0.7.5-00) ...
Setting up kubelet (1.18.0-00) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up kubeadm (1.14.10-00) ...
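As you can see in this output, kubelet was updated to the latest version (1.18.0) because it is a dependency of kubeadm and no version was pinned for it. Now my master node is NotReady, just like yours: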
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubeadm-lab-0 NotReady master 7m v1.18.0
kubeadm-lab-1 Ready <none> 3m52s v1.13.2
kubeadm-lab-2 Ready <none> 3m44s v1.13.2
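One way to read this state: the Kubernetes version skew policy does not allow kubelet to be newer than kube-apiserver, and here a v1.18.0 kubelet is talking to a v1.13 control plane. The sketch below only compares minor versions; minor_of and kubelet_too_new are hypothetical helper names, and it assumes both versions share the same major number.

```shell
# minor_of VERSION: extract the minor number from strings such as "v1.18.0".
minor_of() {
  mo_v=${1#v}          # drop a leading "v" if present
  mo_v=${mo_v#*.}      # drop the major part
  echo "${mo_v%%.*}"   # drop the patch part
}

# kubelet_too_new KUBELET_VERSION APISERVER_VERSION
# Succeeds (exit 0) when the kubelet minor is newer than the API server minor.
kubelet_too_new() {
  [ "$(minor_of "$1")" -gt "$(minor_of "$2")" ]
}

if kubelet_too_new v1.18.0 v1.13.2; then
  echo "kubelet is newer than the control plane: downgrade it"
fi
```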
How to fix it?
To fix this situation, you have to downgrade the packages that were upgraded by mistake:
$ sudo apt-get install -y \
--allow-downgrades \
--allow-change-held-packages \
kubelet=1.13.2-00 \
kubeadm=1.13.2-00 \
kubectl=1.13.2-00 \
kubernetes-cni=0.6.0-00
After running this command, wait a moment and check your nodes:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubeadm-lab-0 Ready master 9m25s v1.13.2
kubeadm-lab-1 Ready <none> 6m17s v1.13.2
kubeadm-lab-2 Ready <none> 6m9s v1.13.2
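How to upgrade it successfully?
Before running apt-get install, check carefully what it will do and make sure your packages will be upgraded to the desired versions.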
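One possible way to do this check is apt-get's simulation mode (-s), which prints the planned actions without installing anything. The sketch below filters that output; planned_versions and unexpected_kube_versions are hypothetical helper names, and it assumes the usual "Inst name [old] (new ...)" line format of the simulation output.

```shell
# planned_versions: read `apt-get -s install ...` output on stdin and print
# "package new_version" for every package apt would install or upgrade.
planned_versions() {
  awk '$1 == "Inst" {
    for (i = 3; i <= NF; i++) {
      if (substr($i, 1, 1) == "(") {   # the new version follows the "("
        print $2, substr($i, 2)
        break
      }
    }
  }'
}

# unexpected_kube_versions WANTED: from planned_versions output on stdin, print
# the kubeadm/kubelet/kubectl lines whose version differs from WANTED.
unexpected_kube_versions() {
  awk -v want="$1" '($1 == "kubeadm" || $1 == "kubelet" || $1 == "kubectl") && $2 != want'
}

# Usage on a real node (-s only simulates; nothing is installed):
#   apt-get install -s kubeadm=1.14.10-00 | planned_versions | unexpected_kube_versions 1.14.10-00
```

Any line printed by the last stage is a package that would land on a version you did not ask for, which is the signal to pin it explicitly on the install command.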
In my cluster, I upgraded the master node with the following command:
$ sudo apt-mark unhold kubeadm kubelet && \
sudo apt-get update && \
sudo apt-get install -y kubeadm=1.14.10-00 kubelet=1.14.10-00 && \
sudo apt-mark hold kubeadm kubelet
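Since the command ends by putting the packages back on hold, it can be worth confirming that the hold took effect. A minimal sketch, assuming the one-package-per-line output of apt-mark showhold; missing_holds is a hypothetical helper name.

```shell
# missing_holds PKG...: read `apt-mark showhold` output on stdin and print each
# named package that is NOT on hold.
missing_holds() {
  mh_held=$(cat)
  for mh_pkg in "$@"; do
    # -x: whole-line match, -F: literal string (package names may contain ".")
    printf '%s\n' "$mh_held" | grep -qxF -- "$mh_pkg" || echo "$mh_pkg"
  done
}

# Usage on a real node:
#   apt-mark showhold | missing_holds kubeadm kubelet kubectl
```

An empty result means every listed package is held, so a later apt-get upgrade should not move it.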
My master node was upgraded to the desired version:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubeadm-lab-0 Ready master 58m v1.14.10
kubeadm-lab-1 Ready <none> 55m v1.13.2
kubeadm-lab-2 Ready <none> 55m v1.13.2
Now, if you run sudo kubeadm upgrade plan, you get the following output:
$ sudo kubeadm upgrade plan
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.13.12
[upgrade/versions] kubeadm version: v1.14.10
I0326 10:08:44.926849 21406 version.go:240] remote version is much newer: v1.18.0; falling back to: stable-1.14
[upgrade/versions] Latest stable version: v1.14.10
[upgrade/versions] Latest version in the v1.13 series: v1.13.12
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 2 x v1.13.2 v1.14.10
1 x v1.14.10 v1.14.10
Upgrade to the latest stable version:
COMPONENT CURRENT AVAILABLE
API Server v1.13.12 v1.14.10
Controller Manager v1.13.12 v1.14.10
Scheduler v1.13.12 v1.14.10
Kube Proxy v1.13.12 v1.14.10
CoreDNS 1.2.6 1.3.1
Etcd 3.2.24 3.3.10
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.14.10
_____________________________________________________________________
As the message shows, kubelet needs to be upgraded on all nodes, so I ran the following command on the other 2 nodes:
$ sudo apt-mark unhold kubeadm kubelet kubernetes-cni && \
sudo apt-get update && \
sudo apt-get install -y kubeadm=1.14.10-00 kubelet=1.14.10-00 && \
sudo apt-mark hold kubeadm kubelet kubernetes-cni
Finally, I proceeded with:
$ sudo kubeadm upgrade apply v1.14.10
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.14.10". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
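After the upgrade, a quick way to confirm that no node was left behind is to filter the kubectl get nodes table by its VERSION column. A minimal sketch, assuming the default column layout shown earlier (VERSION last); nodes_not_at is a hypothetical helper name.

```shell
# nodes_not_at VERSION: read `kubectl get nodes` output on stdin and print the
# name and version of every node whose VERSION column differs from VERSION.
nodes_not_at() {
  awk -v want="$1" 'NR > 1 && $NF != want { print $1, $NF }'   # NR > 1 skips the header
}

# Usage on a real cluster:
#   kubectl get nodes | nodes_not_at v1.14.10
```

No output means every node reports the target version and the next minor upgrade can start.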