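ssh - Ansible cannot connect to remote host over SSH: "Control socket does not exist" and "mux_client_read_packet: read header failed: Broken pipe"
Question
I am trying to SSH into a newly provisioned EC2 instance. However, Ansible always fails to connect to the remote machine over SSH.
Unfortunately, I have exhausted every resource and option I can think of for this problem. I have tried: the different configurations people suggest on similar issues; different versions of Ansible; restarting my machine; and reviewing my files a million times to make sure there are no typos.
Here is the playbook snippet: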
---
- name: Configure EC2 instance
  hosts: "localhost"
  connection: "local"
  gather_facts: no
  vars:
    REGION: "us-east-2"
    ...
  vars_files:
    - secrets.yml
  tasks:
    - name: Provision EC2 instance
      ec2:
        aws_access_key: "{{ AWS_ACCESS_KEY_ID }}"
        aws_secret_key: "{{ AWS_SECRET_ACCESS_KEY }}"
        ...
      register: ec2
    - name: Wait for SSH to come up
      wait_for_connection:
        delay: 10
        timeout: 120
      loop: "{{ ec2.instances }}"
    - name: Add new instance public DNS to host group
      add_host:
        hostname: "{{ ec2.instances[0].public_dns_name }}"
        groups: "ec2"
- name: SSH into EC2
  hosts: "ec2"
  connection: "ssh"
  remote_user: "ubuntu"
  gather_facts: yes
  tasks:
    - name: Wait for user data script to complete execution
      wait_for:
        path: /var/log/cloud-init-output.log
        search_regex: AMI BUILD COMPLETE
        delay: 15
        timeout: 120
    ...
My /etc/ansible/ansible.cfg file:
[defaults]
host_key_checking = False
private_key_file = /Users/dev/Projects/aws/keys/private-key.pem
stdout_callback = debug
log_path = /var/log/ansible/ansible.log
[ssh_connection]
transfer_method = scp
ssh_args = -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50
scp_if_ssh = True
[persistent_connection]
connect_timeout = 300
The command used to run the playbook above:
sudo ANSIBLE_DEBUG=1 ansible-playbook infra/aws/ansible/ec2-provisioning.yml -vvvvv --ask-vault-pass
The error occurs in the "SSH into EC2" play, specifically during fact gathering. Here is the full log block for that play/task:
2020-06-05 12:16:43,795 p=root u=9572 | PLAY [SSH into EC2] ****************************************************************************************************************************************************
2020-06-05 12:16:43,805 p=root u=9572 | TASK [Gathering Facts] *************************************************************************************************************************************************
2020-06-05 12:16:43,817 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> ESTABLISH SSH CONNECTION FOR USER: ubuntu
2020-06-05 12:16:43,818 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ansible.cfg set ssh_args: (-C)(-o)(ControlMaster=auto)(-o)(ControlPersist=200)(-o)(ConnectTimeout=30)(-o)(ServerAliveInterval=50)
2020-06-05 12:16:43,818 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
2020-06-05 12:16:43,819 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_PRIVATE_KEY_FILE/private_key_file/ansible_ssh_private_key_file set: (-o)(IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem")
2020-06-05 12:16:43,819 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ansible_password/ansible_ssh_password not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User="ubuntu")
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: PlayContext set ssh_common_args: ()
2020-06-05 12:16:43,821 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: PlayContext set ssh_extra_args: ()
2020-06-05 12:16:43,822 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath=/Users/dev/.ansible/cp/4a306014bf)
2020-06-05 12:16:43,822 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50 -o StrictHostKeyChecking=no -o 'IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o ControlPath=/Users/dev/.ansible/cp/4a306014bf ec2-3-23-59-101.us-east-2.compute.amazonaws.com '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"''
2020-06-05 12:16:45,132 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> (255, b'', b'OpenSSH_8.1p1, LibreSSL 2.7.3\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 47: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug1: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist\r\ndebug2: resolving "ec2-3-23-59-101.us-east-2.compute.amazonaws.com" port 22\r\ndebug2: ssh_connect_direct\r\ndebug1: Connecting to ec2-3-23-59-101.us-east-2.compute.amazonaws.com [3.23.59.101] port 22.\r\ndebug2: fd 5 setting O_NONBLOCK\r\ndebug1: connect to address 3.23.59.101 port 22: Connection refused\r\nssh: connect to host ec2-3-23-59-101.us-east-2.compute.amazonaws.com port 22: Connection refused\r\n')
2020-06-05 12:16:45,137 p=root u=9572 | fatal: [ec2-3-23-59-101.us-east-2.compute.amazonaws.com]: UNREACHABLE! => {
"changed": false,
"unreachable": true
}
MSG:
Failed to connect to the host via ssh: OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist
debug2: resolving "ec2-3-23-59-101.us-east-2.compute.amazonaws.com" port 22
debug2: ssh_connect_direct
debug1: Connecting to ec2-3-23-59-101.us-east-2.compute.amazonaws.com [3.23.59.101] port 22.
debug2: fd 5 setting O_NONBLOCK
debug1: connect to address 3.23.59.101 port 22: Connection refused
ssh: connect to host ec2-3-23-59-101.us-east-2.compute.amazonaws.com port 22: Connection refused
2020-06-05 12:16:45,139 p=root u=9572 | PLAY RECAP *************************************************************************************************************************************************************
2020-06-05 12:16:45,140 p=root u=9572 | ec2-3-23-59-101.us-east-2.compute.amazonaws.com : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
2020-06-05 12:16:45,140 p=root u=9572 | localhost : ok=3 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
The part that stands out to me here is: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist.
I can see the command Ansible is trying to execute, which is:
ssh -vvv -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50 -o StrictHostKeyChecking=no -o 'IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o ControlPath=/Users/dev/.ansible/cp/4a306014bf ec2-3-23-59-101.us-east-2.compute.amazonaws.com '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"''
If I run it directly from the command line with SSH debugging enabled, SSH returns:
...
debug3: send packet: type 97
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: chan_shutdown_read (i0 o1 sock 3 wfd 3 efd -1 [closed])
debug2: channel 1: input open -> closed
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 3
debug3: channel 2: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o1/16 e[closed]/0 fd 3/3/-1 sock 3 cc -1)
#2 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/9 sock -1 cc -1)
debug2: channel 1: obuf empty
debug2: channel 1: chan_shutdown_write (i3 o1 sock 3 wfd 3 efd -1 [closed])
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 2
debug3: channel 1: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o3/0 e[closed]/0 fd 3/3/-1 sock 3 cc -1)
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0
debug2: set_control_persist_exit_time: schedule exit in 200 seconds
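Again, the part that stands out is the debug3: mux_client_read_packet: read header failed: Broken pipe line. If I remove the '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"'' part from the end of the command and run it again from the command line, it connects successfully. Unfortunately, I can't tell Ansible to drop that part of the command.
Thank you all very much for your help! I really don't know what a solid next step would be, and I would appreciate any suggestions or ideas.
Answer
I have not been able to work out the root cause of this problem (I will update if I do), but adding a short wait before the SSH play is a usable workaround for now: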
- name: Hard wait (30 seconds) before SSHing into EC2 instance
  pause:
    seconds: 30
It now works every time.
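A fixed pause can be replaced with an explicit readiness check. The sketch below assumes the failure is a timing issue, which the "Connection refused" on port 22 in the question's log suggests but does not prove. It also assumes that the `wait_for_connection` task in the first play does not help because that play targets localhost with a local connection, so the check likely runs against localhost rather than the new instance. Under those assumptions, fact gathering is turned off for the second play, `wait_for_connection` runs against the EC2 host itself, and facts are collected afterwards with `setup`. The task names are illustrative; the delay and timeout values are copied from the question.

```yaml
# Sketch only: the first play (provisioning and add_host) is unchanged and omitted here.
- name: SSH into EC2
  hosts: "ec2"
  connection: "ssh"
  remote_user: "ubuntu"
  gather_facts: no   # defer fact gathering until SSH is known to be up
  tasks:
    # Runs against the hosts in the "ec2" group, so it polls the new instance itself.
    - name: Wait until the instance accepts SSH connections
      wait_for_connection:
        delay: 10
        timeout: 120

    # Explicit replacement for the implicit "Gathering Facts" step.
    - name: Gather facts now that the host is reachable
      setup:

    - name: Wait for user data script to complete execution
      wait_for:
        path: /var/log/cloud-init-output.log
        search_regex: AMI BUILD COMPLETE
        delay: 15
        timeout: 120
```

Unlike the 30-second pause, this waits only as long as the instance needs and fails with a timeout if SSH never comes up. If the underlying cause turns out to be something other than timing, the pause workaround above is still the tested one.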