kubernetes - Kubernetes CronJob not exiting
Problem description
I am running a CronJob in Kubernetes. The CronJob starts but never exits; the pod status is always Running. Here is the output:
kubectl get pods
cronjob-1623253800-xnwwx 1/1 Running 0 13h
When I describe the job, I see the following:
kubectl describe job cronjob-1623300120
Name: cronjob-1623300120
Namespace: cronjob
Selector: xxxxx
Labels: xxxxx
Annotations: <none>
Controlled By: CronJob/cronjob
Parallelism: 1
Completions: 1
Start Time: Thu, 9 Jun 2021 10:12:03 +0530
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=cronjob
controller-xxxx
job-name=cronjob-1623300120
Containers:
plannercronjob:
Image: xxxxxxxxxxxxx
Port: <none>
Host Port: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 13h job-controller Created pod: cronjob-1623300120
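I noticed Pods Statuses: 1 Running / 0 Succeeded / 0 Failed. Does this mean the job is marked Succeeded or Failed depending on whether the code returns zero? Is that correct?
When I exec into the pod: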
kubectl exec --stdin --tty cronjob-1623253800-xnwwx -n cronjob -- /bin/bash
root@cronjob-1623253800-xnwwx:/# ps ax| grep python
1 ? Ssl 0:01 python -m sfit.src.app
18 pts/0 S+ 0:00 grep python
I found that the python process is still running. Is this a deadlock in the code, or something else?
pod describe
Name: cronjob-1623302220-xnwwx
Namespace: default
Priority: 0
Node: aks-agentpool-xxxxvmss000000/10.240.0.4
Start Time: Thu, 9 Jun 2021 10:47:02 +0530
Labels: app=cronjob
controller-uid=xxxxxx
job-name=cronjob-1623302220
Annotations: <none>
Status: Running
IP: 10.244.1.30
IPs:
IP: 10.244.1.30
Controlled By: Job/cronjob-1623302220
Containers:
plannercronjob:
Container ID: docker://xxxxxxxxxxxxxxxx
Image: xxxxxxxxxxx
Image ID: docker-xxxx
Port: <none>
Host Port: <none>
State: Running
Started: Thu, 9 Jun 2021 10:47:06 +0530
Ready: True
Restart Count: 0
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-97xzv (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-97xzv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-97xzv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13h default-scheduler Successfully assigned cronjob/cronjob-1623302220-xnwwx to aks-agentpool-xxx-vmss000000
Normal Pulling 13h kubelet, aks-agentpool-xxx-vmss000000 Pulling image "xxxx.azurecr.io/xxx:1.1.1"
Normal Pulled 13h kubelet, aks-agentpool-xxx-vmss000000 Successfully pulled image "xxx.azurecr.io/xx:1.1.1"
Normal Created 13h kubelet, aks-agentpool-xxx-vmss000000 Created container cronjob
Normal Started 13h kubelet, aks-agentpool-xxx-vmss000000 Started container cronjob
@KrishnaChaurasia. I ran the docker image on my own system. My python code has some errors, and there it exits with an error. In Kubernetes, however, it neither exits nor stops.
docker run xxxxx/cronjob:1
File "/usr/local/lib/python3.8/site-packages/azure/core/pipeline/transport/_requests_basic.py", line 261, in send
raise error
azure.core.exceptions.ServiceRequestError: <urllib3.connection.HTTPSConnection object at 0x7f113f6480a0>: Failed to establish a new connection: [Errno -2] Name or service not known
echo $?
1
Solution
If your pod keeps running and never completes, try adding startingDeadlineSeconds to the CronJob spec.
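As a rough illustration of where that field goes, here is a minimal CronJob manifest sketch. The schedule and all numeric values are assumptions chosen for the example, not taken from the question, and the image name is left elided as in the original output. Note that startingDeadlineSeconds only limits how late a missed run may still be started. If the goal is to stop a run that hangs, activeDeadlineSeconds on the job spec is the field that puts a limit on runtime, so the sketch sets both.

```yaml
apiVersion: batch/v1              # clusters older than 1.21 use batch/v1beta1
kind: CronJob
metadata:
  name: cronjob
  namespace: cronjob
spec:
  schedule: "*/5 * * * *"         # assumed schedule, not given in the question
  startingDeadlineSeconds: 120    # assumed value: skip a run that cannot start within 2 minutes
  concurrencyPolicy: Forbid       # do not start a new run while the previous one is still active
  jobTemplate:
    spec:
      activeDeadlineSeconds: 600  # assumed value: terminate the job's pods after 10 minutes
      backoffLimit: 0             # do not retry a failed pod
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: plannercronjob
              image: xxxxxxxxxxxxx  # image name is elided in the question
```

With activeDeadlineSeconds set, a pod that is still Running past the limit is terminated and the job is marked Failed with reason DeadlineExceeded. This hides the hang rather than explaining it, so the reason the python process never exits still has to be investigated separately.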