How can I find out why a Docker container is no longer running?

Problem description

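On one of my AWS servers, I manually started a detached Docker container running willnorris/imageproxy. Without any warning, it seems to go down after a few days, for no apparent (external) reason. I checked the container logs and the syslog but found nothing.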

How can I find out what goes wrong (it happens every time)?

This is how I start it:

ubuntu@local:~ $ ssh ubuntu@my_aws_box
ubuntu@aws_box:~ $ docker run -dp 8081:8080 willnorris/imageproxy -addr 0.0.0.0:8080

Usually, when it seems to have crashed, I do this:

ubuntu@aws_box:~$ docker ps -a
CONTAINER ID   IMAGE                                                              COMMAND                  CREATED        STATUS                    PORTS                                                                            NAMES
de63701bbc82   willnorris/imageproxy                                              "/app/imageproxy -ad…"   10 days ago    Exited (2) 7 days ago                                                                                      frosty_shockley

ubuntu@aws_box:~$ docker logs de63701bbc82
imageproxy listening on 0.0.0.0:8080
2021/08/04 00:46:42 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:38568: write: broken pipe
2021/08/04 00:46:42 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:38572: write: broken pipe
2021/08/04 01:29:18 invalid request URL: malformed URL "/jars": too few path segments
2021/08/04 01:29:18 invalid request URL: malformed URL "/service/extdirect": must provide absolute remote URL
2021/08/04 11:09:49 invalid request URL: malformed URL "/jars": too few path segments
2021/08/04 11:09:49 invalid request URL: malformed URL "/service/extdirect": must provide absolute remote URL
2021/08/04 13:04:33 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:41036: write: broken pipe

As you can see, the logs tell me nothing about the crash, and the only thing I have to go on is the exit status: Exited (2) 7 days ago

Tags: docker

Solution


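Since this exit seemed to originate outside of the container and Docker itself, I needed to find the right logs. A linked question (which essentially makes this one a duplicate) hinted at checking journald on Unix systems. Running journalctl -u docker (basically filtering the journal for the docker unit) showed that the Docker container was killed on Aug 6: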

Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.544825959Z" level=info msg="Processing signal 'terminated'"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.836744355Z" level=info msg="ignoring event" container=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.837480333Z" level=info msg="shim disconnected" id=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.840764380Z" level=warning msg="cleaning up after shim disconnected" id=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5 namespace=moby
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.840787254Z" level=info msg="cleaning up dead shim"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.868008333Z" level=info msg="ignoring event" container=709e057de026ff11f783121c839c56938ea79dcd5965be1546cd6931beb5a903 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.868091089Z" level=info msg="ignoring event" container=9219e652436aae8016145bf3e0681ff1bb7046f230338d8ab79f9ced9532e342 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.868916377Z" level=info msg="shim disconnected" id=9219e652436aae8016145bf3e0681ff1bb7046f230338d8ab79f9ced9532e342
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.068939160Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.069763813Z" level=info msg="Daemon shutdown complete"
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.070022944Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Aug 06 06:06:51 ip-192-168-3-117 systemd[1]: Stopped Docker Application Container Engine.
Aug 06 06:06:51 ip-192-168-3-117 systemd[1]: Starting Docker Application Container Engine...
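
To line the container up with these daemon entries, it can help to read the exit code and stop time that Docker recorded for it. A minimal sketch: the function name container_state is made up for illustration, while .State.ExitCode, .State.OOMKilled and .State.FinishedAt are fields reported by docker inspect.

```shell
# Hypothetical helper: print the state Docker recorded for one container.
# $1 is a container ID or name, e.g. de63701bbc82 from the question.
container_state() {
  docker inspect \
    --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} finished={{.State.FinishedAt}}' \
    "$1"
}

# Usage (needs access to the Docker daemon):
#   container_state de63701bbc82
```

The finished timestamp is the moment to look for in the journal, and the oom flag shows whether Docker recorded an out-of-memory kill.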

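Now, what killed it? To figure that out, I needed to stop filtering out the surrounding events, so I just ran journalctl | grep 'Aug 06' and found these lines right before the ones above: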

Aug 06 05:56:01 ip-192-168-3-117 systemd[1]: Starting Daily apt download activities...
Aug 06 05:56:11 ip-192-168-3-117 systemd[1]: Started Daily apt download activities.
Aug 06 06:06:39 ip-192-168-3-117 systemd[1]: Starting Daily apt upgrade and clean activities...
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Starting Message of the Day...
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:49 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:49 ip-192-168-3-117 systemd[1]: Stopping Docker Application Container Engine...
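
Instead of grepping the whole journal for a date, journalctl can restrict the output to a time window by itself. A sketch, assuming the timestamps from the log above and the year 2021; journal_window is a hypothetical helper name, while --since, --until and --no-pager are regular journalctl options.

```shell
# Hypothetical helper: show journal entries from all units between two timestamps.
# $1 = start, $2 = end, both in "YYYY-MM-DD HH:MM:SS" form.
journal_window() {
  journalctl --no-pager --since "$1" --until "$2"
}

# Usage: the minutes around the Docker daemon being stopped
#   journal_window "2021-08-06 05:56:00" "2021-08-06 06:07:00"
```

Because no unit filter is applied, entries from systemd, apt and dockerd appear interleaved, which is what exposes the cause here.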

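So this was basically caused by a scheduled job (the daily apt upgrade seen in the log) that upgraded the Docker daemon and killed the old one! Because I did not use --restart=always, the container was not restarted after the daemon came back up.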
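
A restart policy would have brought the container back once the daemon was running again. A sketch using the same image and ports as in the question; --restart is a standard docker run flag, and docker update --restart changes the policy of an existing container. The function names are made up for illustration.

```shell
# Start imageproxy detached, with a policy that restarts it after a crash
# or after the Docker daemon itself has been restarted.
start_imageproxy() {
  docker run -d --restart=always -p 8081:8080 \
    willnorris/imageproxy -addr 0.0.0.0:8080
}

# Apply the same policy to a container that already exists.
# $1 is a container ID or name.
set_restart_policy() {
  docker update --restart=always "$1"
}

# Usage (needs access to the Docker daemon):
#   start_imageproxy
#   set_restart_policy de63701bbc82
```

Updating the policy of a container that has already exited does not start it; a docker start is still needed once. The unless-stopped policy is a variant that leaves containers stopped if they were stopped by hand.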

