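dns - Kubernetes DNS fails
Question
I'm trying to learn Kubernetes, and I have successfully set up a cluster (1 node) on bare metal, deployed a service, and exposed it through an ingress.
I then tried to set up traefik with Let's Encrypt certificates, but I couldn't get it to work, and while debugging I noticed that my DNS service wasn't working properly (39 restarts).
I figured I would start over, since I had been experimenting a lot, and this time try flannel instead of calico.
I now have a cluster created with the following commands: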
kubeadm init --pod-network-cidr=192.168.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Testing DNS with the following commands fails:
kubectl create -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/docs/tasks/administer-cluster/busybox.yaml
kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1
Looking at the status of the pods, I can see that the DNS pod is restarting:
kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox 1/1 Running 0 11m
kube-system etcd-kubernetes-master 1/1 Running 0 15m
kube-system kube-apiserver-kubernetes-master 1/1 Running 0 15m
kube-system kube-controller-manager-kubernetes-master 1/1 Running 0 15m
kube-system kube-dns-86f4d74b45-8tgff 3/3 Running 2 25m
kube-system kube-flannel-ds-6cd9h 1/1 Running 0 15m
kube-system kube-flannel-ds-h78ld 1/1 Running 0 13m
kube-system kube-proxy-95kkd 1/1 Running 0 13m
kube-system kube-proxy-lq7hx 1/1 Running 0 25m
kube-system kube-scheduler-kubernetes-master 1/1 Running 0 15m
The DNS pod logs show the following:
kubectl logs kube-dns-86f4d74b45-8tgff dnsmasq -n kube-system
I0621 08:41:51.414587 1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0621 08:41:51.414709 1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0621 08:41:52.255053 1 nanny.go:119]
W0621 08:41:52.255074 1 nanny.go:120] Got EOF from stdout
I0621 08:41:52.256152 1 nanny.go:116] dnsmasq[10]: started, version 2.78 cachesize 1000
I0621 08:41:52.256216 1 nanny.go:116] dnsmasq[10]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0621 08:41:52.256245 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0621 08:41:52.256260 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0621 08:41:52.256275 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0621 08:41:52.256320 1 nanny.go:116] dnsmasq[10]: reading /etc/resolv.conf
I0621 08:41:52.256335 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0621 08:41:52.256350 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0621 08:41:52.256365 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0621 08:41:52.256379 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.53#53
I0621 08:41:52.256432 1 nanny.go:116] dnsmasq[10]: read /etc/hosts - 7 addresses
I0621 08:50:43.727968 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:50:53.750313 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:03.879573 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:13.887735 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:23.957996 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:34.016679 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:43.032107 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:51:53.076274 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:03.359643 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:13.434993 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:23.497330 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:33.591295 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:43.639024 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:52:53.681231 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:53:03.717874 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:53:13.794725 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:53:23.877015 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
I0621 08:53:33.974114 1 nanny.go:116] dnsmasq[10]: Maximum number of concurrent DNS queries reached (max: 150)
kubectl logs kube-dns-86f4d74b45-8tgff sidecar -n kube-system
I0621 08:41:57.464915 1 main.go:51] Version v1.14.8
I0621 08:41:57.464987 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0621 08:41:57.465029 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
I0621 08:41:57.468630 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
W0621 08:50:46.832282 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:37437->127.0.0.1:53: i/o timeout
W0621 08:50:55.772310 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:57714->127.0.0.1:53: i/o timeout
W0621 08:51:02.779876 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:38592->127.0.0.1:53: i/o timeout
W0621 08:51:09.795385 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:39941->127.0.0.1:53: i/o timeout
W0621 08:51:16.798735 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:41457->127.0.0.1:53: i/o timeout
W0621 08:51:23.802617 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:45709->127.0.0.1:53: i/o timeout
W0621 08:51:30.822081 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:37072->127.0.0.1:53: i/o timeout
W0621 08:51:37.826914 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:49924->127.0.0.1:53: i/o timeout
W0621 08:51:51.093275 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:51194->127.0.0.1:53: i/o timeout
W0621 08:51:58.203965 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:35781->127.0.0.1:53: i/o timeout
W0621 08:52:06.423002 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42763->127.0.0.1:53: i/o timeout
W0621 08:52:16.455821 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:36143->127.0.0.1:53: i/o timeout
W0621 08:52:23.496199 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:44195->127.0.0.1:53: i/o timeout
W0621 08:52:30.500081 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:48733->127.0.0.1:53: i/o timeout
W0621 08:52:37.519339 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:39179->127.0.0.1:53: i/o timeout
W0621 08:52:51.695822 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:51821->127.0.0.1:53: i/o timeout
W0621 08:52:58.739133 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:46358->127.0.0.1:53: i/o timeout
W0621 08:53:06.823714 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:49103->127.0.0.1:53: i/o timeout
W0621 08:53:16.866975 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:52782->127.0.0.1:53: i/o timeout
W0621 08:53:23.869540 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:52495->127.0.0.1:53: i/o timeout
W0621 08:53:30.882626 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:56134->127.0.0.1:53: i/o timeout
W0621 08:53:37.886811 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:55489->127.0.0.1:53: i/o timeout
W0621 08:53:46.023614 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:58523->127.0.0.1:53: i/o timeout
W0621 08:53:53.034985 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:56026->127.0.0.1:53: i/o timeout
W0621 08:54:00.041734 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42093->127.0.0.1:53: i/o timeout
W0621 08:54:07.050864 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:58731->127.0.0.1:53: i/o timeout
W0621 08:54:14.053858 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:34062->127.0.0.1:53: i/o timeout
W0621 08:54:21.076986 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:44293->127.0.0.1:53: i/o timeout
W0621 08:54:28.080808 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42738->127.0.0.1:53: i/o timeout
W0621 08:54:41.423864 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:58715->127.0.0.1:53: i/o timeout
It looks like the DNS service is somehow being flooded with queries, but I don't know how to proceed.
Answer
Digging into Javier's answer, I found the solution to the problem here: https://github.com/kubernetes/kubeadm/issues/787
These are the commands needed to fix it:
sudo rm /etc/resolv.conf
sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
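Why this likely helps: in the dnsmasq log above, right after "reading /etc/resolv.conf", dnsmasq reports "using nameserver 127.0.0.53#53". That address is the systemd-resolved stub listener on the host. Inside the pod, 127.0.0.53 is the pod's own loopback, so forwarded queries probably come straight back to dnsmasq until the limit of 150 concurrent queries is reached. Pointing /etc/resolv.conf at /run/systemd/resolve/resolv.conf gives the pod the real upstream servers instead. The sketch below is one way to check for this condition before and after the change; the function name has_loopback_nameserver is hypothetical and not part of any Kubernetes tooling.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubeadm or kubectl): succeeds if a
# resolv.conf-style file lists a loopback nameserver (127.x.x.x), which a
# pod cannot use as an upstream resolver.
has_loopback_nameserver() {
    # $1: path to a resolv.conf-style file
    # -E extended regex, -q quiet, -s suppress errors for missing files
    grep -Eqs '^[[:space:]]*nameserver[[:space:]]+127\.' "$1"
}

# Check the file that dnsmasq read in the log above.
if has_loopback_nameserver /etc/resolv.conf; then
    echo "loopback nameserver found: upstream DNS may loop back into the pod"
else
    echo "no loopback nameserver found"
fi
```

After replacing the symlink, the DNS pod most likely has to be recreated (for example by deleting it so that its Deployment starts a new one) before it picks up the new upstream servers.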