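kubernetes - GKE Kubernetes LoadBalancer returns connection reset by peer
Problem description
I ran into a strange problem with my cluster.
In my cluster I have a Deployment and a LoadBalancer Service that exposes it. It worked like a charm, but suddenly the load balancer started returning an error: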
curl: (56) Recv failure: Connection reset by peer
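The error appears even though the pod and the load balancer are both running and there are no errors in their logs.
What I have already tried:
- deleting the pod
- redeploying the Service + Deployment from scratch, but the problem persisted
My Service yaml: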
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress":true}'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"RELEASE-NAME","app.kubernetes.io/name":"APP-NAME","app.kubernetes.io/version":"latest"},"name":"APP-NAME","namespace":"namespacex"},"spec":{"ports":[{"name":"web","port":3000}],"selector":{"app.kubernetes.io/instance":"RELEASE-NAME","app.kubernetes.io/name":"APP-NAME"},"type":"LoadBalancer"}}
  creationTimestamp: "2021-08-03T07:55:00Z"
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/name: APP-NAME
    app.kubernetes.io/version: latest
  name: APP-NAME
  namespace: namespacex
  resourceVersion: "14583904"
  uid: 7fb4d7e6-4316-44e5-8f9b-7a466bc776da
spec:
  clusterIP: 10.4.18.36
  clusterIPs:
  - 10.4.18.36
  externalTrafficPolicy: Cluster
  ports:
  - name: web
    nodePort: 30970
    port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/name: APP-NAME
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: xx.xxx.xxx.xxx
My Deployment yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: APP-NAME
  labels:
    app.kubernetes.io/name: APP-NAME
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/version: "latest"
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: APP-NAME
      app.kubernetes.io/instance: RELEASE-NAME
  template:
    metadata:
      annotations:
        checksum/config: 5e6ff0d6fa64b90b0365e9f3939cefc0a619502b32564c4ff712067dbe44ab90
        checksum/secret: 76e0a1351da90c0cef06851e3aa9e7c80b415c29b11f473d4a2520ade9c892ce
      labels:
        app.kubernetes.io/name: APP-NAME
        app.kubernetes.io/instance: RELEASE-NAME
    spec:
      serviceAccountName: APP-NAME
      containers:
        - name: APP-NAME
          image: 'docker.io/xxxxxxxx:latest'
          imagePullPolicy: "Always"
          ports:
            - name: http
              containerPort: 3000
          livenessProbe:
            httpGet:
              path: /balancer/
              port: http
          readinessProbe:
            httpGet:
              path: /balancer/
              port: http
          env:
            ...
          volumeMounts:
            - name: config-volume
              mountPath: /home/app/config/
          resources:
            limits:
              cpu: 400m
              memory: 256Mi
            requests:
              cpu: 400m
              memory: 256Mi
      volumes:
        - name: config-volume
          configMap:
            name: app-config
      imagePullSecrets:
        - name: secret
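A quick consistency check of the two manifests can rule out the most common wiring mistakes before looking elsewhere. The sketch below is only an illustration: it copies the selector, pod labels, and ports from the YAML above into plain dictionaries (the helper name `selector_matches` is made up for this example) and applies the equality-based matching rule that a Service selector uses, namely that every key/value pair in the selector must be present on the pod.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if every key/value pair of the selector is present in the pod labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Values copied by hand from the Service spec above.
service_selector = {
    "app.kubernetes.io/instance": "RELEASE-NAME",
    "app.kubernetes.io/name": "APP-NAME",
}
target_port = 3000

# Values copied by hand from the Deployment's pod template above.
pod_labels = {
    "app.kubernetes.io/name": "APP-NAME",
    "app.kubernetes.io/instance": "RELEASE-NAME",
}
container_ports = {"http": 3000}

selector_ok = selector_matches(service_selector, pod_labels)
port_ok = target_port in container_ports.values()
print(f"selector matches pod labels: {selector_ok}")
print(f"targetPort is exposed by the container: {port_ok}")
```

Both checks pass for the manifests as posted, which suggests the Service is wired to the pod correctly and the reset has to come from somewhere else.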
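Solution
In my case, the problem turned out to be a network component (such as a firewall) that started blocking outbound connections after flagging the cluster as "unsafe" for no apparent reason.
So in essence this was not a K8s problem but an IT problem.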
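When the manifests look fine, one way to narrow down where a connection is being cut is to probe the same port from different vantage points and compare how each attempt fails. The following is a minimal sketch under that assumption, not the procedure actually used here: the `probe` function is a hypothetical helper that opens a TCP connection, sends a bare HTTP request, and reports whether the peer answered, reset, refused, or timed out.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Open a TCP connection, send a minimal HTTP request, and classify the outcome."""
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            data = sock.recv(1024)
    except (ConnectionResetError, BrokenPipeError):
        # Same condition that curl reports as "Connection reset by peer".
        return "reset"
    except ConnectionRefusedError:
        # Nothing is listening on that address and port.
        return "refused"
    except socket.timeout:
        # Packets are being silently dropped somewhere on the path.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    # An empty read means the peer closed the connection without sending anything.
    return "response" if data else "closed"


# Example calls (addresses taken from the Service above; the external IP is elided there):
#   probe("10.4.18.36", 3000)        # ClusterIP, run from inside the cluster
#   probe("xx.xxx.xxx.xxx", 3000)    # load balancer IP, run from outside
```

If the probe against the ClusterIP from inside the cluster returns "response" while the same probe against the load balancer IP from outside returns "reset", the connection is probably being cut by something on the network path rather than by Kubernetes, which would be consistent with the firewall explanation above.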