cloud-foundry - BOSH Director installation fails on vSphere
Problem description
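This is my first BOSH installation for PKS. Environment:
- vSphere 6.5 with VCSA 6.5u2,
- OpsMgr 2.2 build 296
- bosh stemcell vsphere-ubuntu-trusty build 3586.25
- A flat 100.x network; no routing or firewall involved.
Summary: after deploying the OpsMgr OVF template, I am configuring and installing the BOSH Director. However, it fails at "Waiting for the agent" in the dashboard. The "current" log in the OpsMgr VM shows that it keeps trying to read settings from /dev/sr0, because agent.json specifies CDROM as the settings source. It cannot find any CDROM, so it fails.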
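A few questions:
- How do I log in to the VMs that BOSH creates when I change the setting to "default BOSH password" for all VMs in Ops Mgr?
- There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings it is applying. Is the location wrong?
- Is there a way to change the stemcell used by the OpsMgr VM? Maybe I could try a previous build?
- How is agent.json actually populated?
- Any suggestions on how to fix this issue?
All logs and JSON below:
GUI dashboard log: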
===== 2018-07-30 08:20:52 UTC Running "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"
Deployment manifest: '/var/tempest/workspaces/default/deployments/bosh.yml'
Deployment state: '/var/tempest/workspaces/default/deployments/bosh-state.json'
Started validating
Validating release 'bosh'... Finished (00:00:00)
Validating release 'bosh-vsphere-cpi'... Finished (00:00:00)
Validating release 'uaa'... Finished (00:00:00)
Validating release 'credhub'... Finished (00:00:01)
Validating release 'bosh-system-metrics-server'... Finished (00:00:01)
Validating release 'os-conf'... Finished (00:00:00)
Validating release 'backup-and-restore-sdk'... Finished (00:00:04)
Validating release 'bpm'... Finished (00:00:02)
Validating cpi release... Finished (00:00:00)
Validating deployment manifest... Finished (00:00:00)
Validating stemcell... Finished (00:00:14)
Finished validating (00:00:26)
Started installing CPI
Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Finished (00:00:00)
Compiling package 'vsphere_cpi/e1a84e5bd82eb1abfe9088a2d547e2cecf6cf315'... Finished (00:00:00)
Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:00)
Installing packages... Finished (00:00:15)
Rendering job templates... Finished (00:00:06)
Installing job 'vsphere_cpi'... Finished (00:00:00)
Finished installing CPI (00:00:23)
Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3586.25'... Skipped [Stemcell already uploaded] (00:00:00)
Started deploying
Waiting for the agent on VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Failed (00:00:11)
Deleting VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Finished (00:00:10)
Creating VM for instance 'bosh/0' from stemcell 'sc-536fea79-cfa6-46a9-a53e-9de19505216f'... Finished (00:00:12)
Waiting for the agent on VM 'vm-fb90eee8-f3ac-45b7-95d3-4e8483c91a5c' to be ready... Failed (00:09:59)
Failed deploying (00:10:38)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)
Deploying:
Creating instance 'bosh/0':
Waiting until instance is ready:
Post https://vcap:<redacted>@192.168.100.201:6868/agent: dial tcp 192.168.100.201:6868: connect: no route to host
Exit code 1
===== 2018-07-30 08:32:20 UTC Finished "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"; Duration: 688s; Exit Status: 1
Exited with 1.
bosh-state.json:
ubuntu@opsmanager-2-2:~$ sudo cat /var/tempest/workspaces/default/deployments/bosh-state.json
{
"director_id": "851f70ef-7c4b-4c65-73ed-d382ad3df1b7",
"installation_id": "f29df8af-7141-4aff-5e52-2d109a84cd84",
"current_vm_cid": "vm-87b3299a-a994-4544-8043-032ce89d685b",
"current_stemcell_id": "dcca340c-d612-4098-7c90-479193fa9090",
"current_disk_id": "",
"current_release_ids": [],
"current_manifest_sha": "",
"disks": null,
"stemcells": [
{
"id": "dcca340c-d612-4098-7c90-479193fa9090",
"name": "bosh-vsphere-esxi-ubuntu-trusty-go_agent",
"version": "3586.25",
"cid": "sc-536fea79-cfa6-46a9-a53e-9de19505216f"
}
],
"releases": []
}
agent.json:
ubuntu@opsmanager-2-2:~$ sudo cat /var/vcap/bosh/agent.json
{
"Platform": {
"Linux": {
"DevicePathResolutionType": "scsi"
}
},
"Infrastructure": {
"Settings": {
"Sources": [
{
"Type": "CDROM",
"FileName": "env"
}
]
}
}
}
ubuntu@opsmanager-2-2:~$
And finally, the current BOSH log:
/var/vcap/bosh/log/current
2018-07-30_08:42:22.69934 [main] 2018/07/30 08:42:22 DEBUG - Starting agent
2018-07-30_08:42:22.69936 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/agent.json
2018-07-30_08:42:22.69937 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69937 ********************
2018-07-30_08:42:22.69938 {
2018-07-30_08:42:22.69938 "Platform": {
2018-07-30_08:42:22.69939 "Linux": {
2018-07-30_08:42:22.69939
2018-07-30_08:42:22.69939 "DevicePathResolutionType": "scsi"
2018-07-30_08:42:22.69939 }
2018-07-30_08:42:22.69939 },
2018-07-30_08:42:22.69939 "Infrastructure": {
2018-07-30_08:42:22.69940 "Settings": {
2018-07-30_08:42:22.69940 "Sources": [
2018-07-30_08:42:22.69940 {
2018-07-30_08:42:22.69940 "Type": "CDROM",
2018-07-30_08:42:22.69940 "FileName": "env"
2018-07-30_08:42:22.69940 }
2018-07-30_08:42:22.69941 ]
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941
2018-07-30_08:42:22.69941 ********************
2018-07-30_08:42:22.69943 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_version
2018-07-30_08:42:22.69944 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69944 3586.25
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69945 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_git_sha1
2018-07-30_08:42:22.69946 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69946 ********************
2018-07-30_08:42:22.69946 dbbb73800373356315a4c16ee40d2db3189bf2db
2018-07-30_08:42:22.69947 ********************
2018-07-30_08:42:22.69948 [App] 2018/07/30 08:42:22 INFO - Running on stemcell version '3586.25' (git: dbbb73800373356315a4c16ee40d2db3189bf2db)
2018-07-30_08:42:22.69949 [File System] 2018/07/30 08:42:22 DEBUG - Checking if file exists /var/vcap/bosh/agent_state.json
2018-07-30_08:42:22.69950 [File System] 2018/07/30 08:42:22 DEBUG - Stat '/var/vcap/bosh/agent_state.json'
2018-07-30_08:42:22.69951 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Running command 'bosh-agent-rc'
2018-07-30_08:42:22.70116 [unlimitedRetryStrategy] 2018/07/30 08:42:22 DEBUG - Making attempt #0
2018-07-30_08:42:22.70117 [DelayedAuditLogger] 2018/07/30 08:42:22 DEBUG - Starting logging to syslog...
2018-07-30_08:42:22.70181 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stdout:
2018-07-30_08:42:22.70182 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stderr:
2018-07-30_08:42:22.70183 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Successful: true (0)
2018-07-30_08:42:22.70184 [settingsService] 2018/07/30 08:42:22 DEBUG - Loading settings from fetcher
2018-07-30_08:42:22.70185 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - Kicking device, attempt 0 of 5
2018-07-30_08:42:22.70187 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.20204 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 1 of 5
2018-07-30_08:42:23.20206 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.70217 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 2 of 5
2018-07-30_08:42:23.70220 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.20229 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 3 of 5
2018-07-30_08:42:24.20294 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.70249 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 4 of 5
2018-07-30_08:42:24.70253 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20317 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20320 [ConcreteUdevDevice] 2018/07/30 08:42:25 ERROR - Failed to red byte from device: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.20321 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Settling UdevDevice
2018-07-30_08:42:25.20322 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Running command 'udevadm settle'
2018-07-30_08:42:25.20458 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stdout:
2018-07-30_08:42:25.20460 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stderr:
2018-07-30_08:42:25.20461 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Successful: true (0)
2018-07-30_08:42:25.20462 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 0 out of 5
2018-07-30_08:42:25.20463 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20464 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.70473 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 1 out of 5
2018-07-30_08:42:25.70476 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.70477 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.20492 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 2 out of 5
2018-07-30_08:42:26.20496 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.20497 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.70509 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 3 out of 5
2018-07-30_08:42:26.70512 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.70513 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.20530 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ensuring Device Readable, Attempt 4 out of 5
2018-07-30_08:42:27.20533 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.20534 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70554 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.70557 [settingsService] 2018/07/30 08:42:27 ERROR - Failed loading settings via fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70559 [settingsService] 2018/07/30 08:42:27 ERROR - Failed reading settings from file Opening file /var/vcap/bosh/settings.json: open /var/vcap/bosh/settings.json: no such file or directory
2018-07-30_08:42:27.70560 [main] 2018/07/30 08:42:27 ERROR - App setup Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70561 [main] 2018/07/30 08:42:27 ERROR - Agent exited with error: Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.71258 [main] 2018/07/30 08:42:27 DEBUG - Starting agent
<and this whole block just keeps repeating>
Solution
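How do I log in to the VMs that BOSH creates when I change the setting to "default BOSH password" for all VMs in Ops Mgr?
That is not a good idea. The default password is well known, and you should almost always use randomly generated passwords. I'm honestly not sure why it is even an option; the only use that comes to mind is some extremely rare troubleshooting scenario.
That said, if you need to access a VM manually, you can securely obtain the randomly generated password through Ops Manager. You can also access VMs securely via bosh ssh, which handles the credentials automatically. Even for troubleshooting, you usually don't need the default-password option.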
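There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings it is applying. Is the location wrong?
The location is correct, but the file contains sensitive information, so Ops Manager deletes it as soon as it has finished using it.
If you want to see the contents of the file, the easy way is to navigate to https://ops-man-fqdn/debug/files, where you can see all of the configuration files, including your bosh.yml. The hard way is to watch the folder above while a deployment is in progress: the file exists there for a short time, and you can make a copy during that window. The only advantage of the hard way is that you get the actual file, whereas the debug endpoint shows a version with the sensitive information redacted.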
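For the "hard way", a small polling loop can catch the file during the short window in which it exists. This is only a minimal sketch: the function name copy_when_present, the timeout and the polling interval are illustrative choices of mine, not anything Ops Manager provides. It would probably need to run as root, and the destination directory must already exist.

```python
import shutil
import time
from pathlib import Path


def copy_when_present(src, dst, timeout=600.0, interval=0.5):
    """Poll for `src` and copy it to `dst` as soon as it can be read.

    Returns True if a copy was made, False if `timeout` seconds elapsed
    without the file appearing. Hypothetical helper, not part of any
    BOSH or Ops Manager tooling.
    """
    src, dst = Path(src), Path(dst)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Attempt the copy directly instead of checking existence first,
            # so the file cannot vanish between the check and the copy.
            shutil.copy2(src, dst)
            return True
        except FileNotFoundError:
            # Source not there yet (or the destination directory is missing).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Start it just before applying changes, for example copy_when_present("/var/tempest/workspaces/default/deployments/bosh.yml", "/root/bosh.yml.copy"), where the destination path is again just an example. The copy contains unredacted credentials, so delete it when you are done.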
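Is there a way to change the stemcell used by the OpsMgr VM? Maybe I could try a previous build?
I don't think this is a stemcell problem. Plenty of people use these stemcells without hitting this issue. If a major problem like this were found in a stemcell, you would see a notice on Pivotal Network, and Pivotal would publish a new, fixed stemcell.
The problem appears to lie in how the VM receives its initial bootstrap configuration. I'd suggest investigating that further before changing stemcells. See below.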
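How is agent.json actually populated?
Believe it or not, in a vSphere environment the agent's settings are read from a fake CD-ROM attached to the VM. As the log in the question shows, agent.json itself only names the CDROM as the settings source. There isn't much documentation on this, but it is mentioned briefly in the BOSH docs here:
https://bosh.io/docs/cpi-api-v1-method/create-vm/#agent-settings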
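To make the chain concrete, here is a rough sketch that reads the agent.json posted in the question and lists where the agent will look for its settings. The helper names settings_sources and describe are mine, and the reading of FileName as "the file to look for on the disc" is inferred from the "Reading files from CDROM" log lines, not from a documented schema.

```python
import json

# agent.json exactly as posted in the question.
AGENT_JSON = """
{
  "Platform": {"Linux": {"DevicePathResolutionType": "scsi"}},
  "Infrastructure": {
    "Settings": {
      "Sources": [{"Type": "CDROM", "FileName": "env"}]
    }
  }
}
"""


def settings_sources(agent_config):
    """Return the list of settings sources declared in an agent.json dict."""
    infrastructure = agent_config.get("Infrastructure", {})
    return infrastructure.get("Settings", {}).get("Sources", [])


def describe(source):
    """Render one source entry as a short human-readable line."""
    kind = source.get("Type", "unknown")
    if kind == "CDROM":
        name = source.get("FileName")
        return f"CDROM: read settings file {name!r} from the attached CD-ROM device"
    # Other source types are printed as-is; their fields are not assumed here.
    return f"{kind}: {source}"


for entry in settings_sources(json.loads(AGENT_JSON)):
    print(describe(entry))
```

With the file from the question there is a single CDROM entry, which matches the agent repeatedly trying /dev/sr0 and giving up when the device does not exist.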
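Any suggestions on how to fix this issue?
Find out why the CD-ROM cannot be mounted. BOSH needs it to get its bootstrap configuration, so you have to get it working. If something in your vSphere environment prevents the CD-ROM from being mounted, you will need to change that to allow it.
If nothing stands out on the vSphere side, I think the next step is to check the standard system logs under /var/log and the dmesg output for errors or clues as to why the CD-ROM cannot be loaded or read.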
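When going through dmesg or the files under /var/log, it can help to filter for lines that mention the optical drive. The following is only a sketch: the keyword list (sr0-style device names, cdrom, cd-rom, iso9660) is my guess at what is worth looking for, not an exhaustive or official set.

```python
import re

# Keywords that usually accompany CD-ROM activity in kernel and system logs.
# This list is an assumption; extend it to suit your environment.
CDROM_PATTERN = re.compile(r"\b(sr\d+|cdrom|cd-rom|iso9660)\b", re.IGNORECASE)


def cdrom_lines(lines):
    """Return the lines that mention a CD-ROM device or filesystem."""
    return [line.rstrip("\n") for line in lines if CDROM_PATTERN.search(line)]


# Two lines taken from the agent log in the question, as a quick check.
sample = [
    "[ConcreteUdevDevice] ERROR - Failed to red byte from device: open /dev/sr0: no such file or directory",
    "[Cmd Runner] DEBUG - Running command 'udevadm settle'",
]
for hit in cdrom_lines(sample):
    print(hit)
```

cdrom_lines accepts any iterable of strings, so an open log file or sys.stdin (when dmesg output is piped in) can be passed directly.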
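Finally, try some manual tests of mounting and reading from the CD-ROM. Start by looking at one of the BOSH-deployed VMs in the vSphere client: check the hardware settings and make sure a CD-ROM is attached. It should point to a file named env.iso in the same folder as the VM on your datastore. If it is attached and connected, start the VM and try to mount the CD-ROM. You should be able to see the BOSH configuration files on that drive.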
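Before attempting a mount, a quick check from inside the VM can tell you whether the device node exists at all, which is the step the agent fails at in the log ("open /dev/sr0: no such file or directory"). This is a minimal sketch; device_status is a hypothetical helper, and /dev/sr0 is simply the path that appears in the log.

```python
import os
import stat


def device_status(path="/dev/sr0"):
    """Classify a device path as missing, not a block device, unreadable or readable."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Same condition the agent reports in the question's log.
        return "missing"
    if not stat.S_ISBLK(mode):
        return "not a block device"
    try:
        # Reading a single byte is enough to show that media is present
        # and that the current user is allowed to read it.
        with open(path, "rb") as device:
            device.read(1)
    except OSError as exc:
        return f"unreadable: {exc.strerror}"
    return "readable"


print(device_status())
```

If this reports "missing", the problem is on the virtual hardware side (no CD-ROM device presented to the guest) rather than with the contents of env.iso.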
Hope that helps!