Kubernetes master stuck in NotReady after a failed upgrade

Problem description

I have a K8S cluster at version 1.13.2, and I want to upgrade it to 1.17.x (the latest 1.17).

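I checked the official instructions at https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/, which state that I need to upgrade one minor version at a time: first 1.14, then 1.15, then 1.16, and only then 1.17.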

I made all the preparations (disabled swap), ran everything according to the docs, and determined that the latest 1.14 release is 1.14.10.

When I ran:

apt-mark unhold kubeadm kubelet && \
apt-get update && apt-get install -y kubeadm=1.14.10-00 && \
apt-mark hold kubeadm

for some reason, it seems that kubectl v1.18 was downloaded as well.

I went ahead and tried to run sudo kubeadm upgrade plan, but it failed with the following error:

[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/health] FATAL: [preflight] Some fatal errors occurred:
    [ERROR ControlPlaneNodesReady]: there are NotReady control-planes in the cluster: [<name of master>]
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

When I run kubectl get nodes, the VERSION column shows that the master is indeed NotReady and at version 1.18.0, while the workers are of course still at v1.13.2 and Ready (unchanged).

How can I fix my cluster?

And what did I do wrong when I tried to upgrade?

Tags: kubernetes

Solution


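I reproduced your problem in my lab. What happened is that you accidentally upgraded more than you intended. More specifically, you upgraded the kubelet package on the master node (control plane).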

Here is my healthy cluster at version 1.13.2:

$ kubectl get nodes
NAME            STATUS   ROLES    AGE     VERSION
kubeadm-lab-0   Ready    master   9m25s   v1.13.2
kubeadm-lab-1   Ready    <none>   6m17s   v1.13.2
kubeadm-lab-2   Ready    <none>   6m9s    v1.13.2

Now I will unhold kubeadm and kubelet, as you did:

$ sudo apt-mark unhold kubeadm kubelet
Canceled hold on kubeadm.
Canceled hold on kubelet.

Finally, I will upgrade kubeadm to 1.14.10:

$ sudo apt-get install kubeadm=1.14.10-00
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  conntrack kubelet kubernetes-cni
The following NEW packages will be installed:
  conntrack
The following packages will be upgraded:
  kubeadm kubelet kubernetes-cni
3 upgraded, 1 newly installed, 0 to remove and 8 not upgraded.
Need to get 34.1 MB of archives.
After this operation, 7,766 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:2 http://deb.debian.org/debian stretch/main amd64 conntrack amd64 1:1.4.4+snapshot20161117-5 [32.9 kB]
Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.18.0-00 [19.4 MB]
Get:3 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubeadm amd64 1.14.10-00 [8,155 kB]
Get:4 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubernetes-cni amd64 0.7.5-00 [6,473 kB]
Fetched 34.1 MB in 2s (13.6 MB/s)         
Selecting previously unselected package conntrack.
(Reading database ... 97656 files and directories currently installed.)
Preparing to unpack .../conntrack_1%3a1.4.4+snapshot20161117-5_amd64.deb ...
Unpacking conntrack (1:1.4.4+snapshot20161117-5) ...
Preparing to unpack .../kubelet_1.18.0-00_amd64.deb ...
Unpacking kubelet (1.18.0-00) over (1.13.2-00) ...
Preparing to unpack .../kubeadm_1.14.10-00_amd64.deb ...
Unpacking kubeadm (1.14.10-00) over (1.13.2-00) ...
Preparing to unpack .../kubernetes-cni_0.7.5-00_amd64.deb ...
Unpacking kubernetes-cni (0.7.5-00) over (0.6.0-00) ...
Setting up conntrack (1:1.4.4+snapshot20161117-5) ...
Setting up kubernetes-cni (0.7.5-00) ...
Setting up kubelet (1.18.0-00) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up kubeadm (1.14.10-00) ...

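As you can see in this output, kubelet was updated to the latest version (1.18.0) because it is a dependency of kubeadm and was no longer held. Now my master node is NotReady, just like yours: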

$ kubectl get nodes
NAME            STATUS     ROLES    AGE     VERSION
kubeadm-lab-0   NotReady   master   7m      v1.18.0
kubeadm-lab-1   Ready      <none>   3m52s   v1.13.2
kubeadm-lab-2   Ready      <none>   3m44s   v1.13.2

How to fix it? To recover from this situation, you have to downgrade the packages that were upgraded by mistake:

$ sudo apt-get install -y \
--allow-downgrades \
--allow-change-held-packages \
kubelet=1.13.2-00 \
kubeadm=1.13.2-00 \
kubectl=1.13.2-00 \
kubernetes-cni=0.6.0-00

After running this command, wait a moment and check your nodes:

$ kubectl get nodes
NAME            STATUS   ROLES    AGE     VERSION
kubeadm-lab-0   Ready    master   9m25s   v1.13.2
kubeadm-lab-1   Ready    <none>   6m17s   v1.13.2
kubeadm-lab-2   Ready    <none>   6m9s    v1.13.2
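
If you would rather not eyeball the table, the check can be scripted. This is only a minimal sketch: `not_ready_nodes` is a hypothetical helper name, and it assumes the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION), as in the outputs above.

```shell
# List every node whose STATUS is not exactly "Ready", with its kubelet version.
# Reads `kubectl get nodes --no-headers` output on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is reported as well.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1, $5 }'
}

# Usage:
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports Ready.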

How do you upgrade successfully?

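You have to check carefully what apt-get install is going to do before you run it, and make sure your packages will be upgraded to the versions you want.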
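
One way to do that check is a simulated install. The sketch below assumes the output format of apt-get -s (simulate), where each package to be installed appears on a line starting with Inst, followed by the target version in parentheses. `planned_versions` and `check_minor` are hypothetical helper names for illustration, not part of apt.

```shell
# Print "package target-version" for every package a simulated install would touch.
# Reads `apt-get -s install ...` output on stdin; only "Inst" lines are used.
planned_versions() {
  awk '$1 == "Inst" {
    # The target version is the first token that opens with "(".
    for (i = 3; i <= NF; i++) {
      if (substr($i, 1, 1) == "(") { print $2, substr($i, 2); break }
    }
  }'
}

# Exit non-zero if kubeadm, kubelet or kubectl would land outside the wanted minor.
check_minor() {
  want="$1"   # for example 1.14
  planned_versions | awk -v want="$want" '
    $1 ~ /^(kubeadm|kubelet|kubectl)$/ && index($2, want ".") != 1 {
      print "unexpected: " $1, $2
      bad = 1
    }
    END { exit bad }'
}

# Usage (simulation only, nothing gets installed):
#   apt-get -s install kubeadm=1.14.10-00 | check_minor 1.14
```

Run against the install that caused the problem, this would have flagged kubelet 1.18.0-00 before anything was changed on the node.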

In my cluster, I upgraded the master node with the following command:

$ sudo apt-mark unhold kubeadm kubelet && \
sudo apt-get update && \
sudo apt-get install -y kubeadm=1.14.10-00 kubelet=1.14.10-00 && \
sudo apt-mark hold kubeadm kubelet

My master node was upgraded to the desired version:

$ kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
kubeadm-lab-0   Ready    master   58m   v1.14.10
kubeadm-lab-1   Ready    <none>   55m   v1.13.2
kubeadm-lab-2   Ready    <none>   55m   v1.13.2

Now, if we run sudo kubeadm upgrade plan, we get the following output:

$ sudo kubeadm upgrade plan
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.13.12
[upgrade/versions] kubeadm version: v1.14.10
I0326 10:08:44.926849   21406 version.go:240] remote version is much newer: v1.18.0; falling back to: stable-1.14
[upgrade/versions] Latest stable version: v1.14.10
[upgrade/versions] Latest version in the v1.13 series: v1.13.12

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT        AVAILABLE
Kubelet     2 x v1.13.2    v1.14.10
            1 x v1.14.10   v1.14.10

Upgrade to the latest stable version:

COMPONENT            CURRENT    AVAILABLE
API Server           v1.13.12   v1.14.10
Controller Manager   v1.13.12   v1.14.10
Scheduler            v1.13.12   v1.14.10
Kube Proxy           v1.13.12   v1.14.10
CoreDNS              1.2.6      1.3.1
Etcd                 3.2.24     3.3.10

You can now apply the upgrade by executing the following command:

    kubeadm upgrade apply v1.14.10

_____________________________________________________________________

As the message says, kubelet has to be upgraded on all nodes, so I ran the following command on the other 2 nodes:

$ sudo apt-mark unhold kubeadm kubelet kubernetes-cni && \
sudo apt-get update && \
sudo apt-get install -y kubeadm=1.14.10-00 kubelet=1.14.10-00 && \
sudo apt-mark hold kubeadm kubelet kubernetes-cni

Finally, I proceeded with:

$ sudo kubeadm upgrade apply v1.14.10
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.14.10". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
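
Since your goal is 1.17 and the kubeadm documentation you linked requires moving one minor version at a time, the same procedure has to be repeated for each step. As a rough sketch of the sequence (`upgrade_path` is a hypothetical helper that only handles 1.x versions written as major.minor):

```shell
# Print each minor release to pass through, one per line.
# Arguments are "major.minor" strings such as 1.13 and 1.17.
upgrade_path() {
  from_minor="${1#*.}"   # "1.13" -> "13"
  to_minor="${2#*.}"     # "1.17" -> "17"
  m=$((from_minor + 1))
  while [ "$m" -le "$to_minor" ]; do
    echo "1.$m"
    m=$((m + 1))
  done
}

upgrade_path 1.13 1.17
# prints:
# 1.14
# 1.15
# 1.16
# 1.17
```

For each minor in that list you would pick its latest patch release, as you did with 1.14.10, and pin both kubeadm and kubelet to it.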
