BOSH Director installation fails on vSphere

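Question

This is the first BOSH I am installing for PKS. Environment:

Summary: After deploying the OpsMgr OVF template, I am configuring and installing BOSH Director. However, it fails at "Waiting for the agent" in the dashboard. The "current" log on the OpsMgr VM shows that it keeps trying to read settings from /dev/sr0, because agent.json specifies CDROM as the settings source. It cannot find any CDROM, so it fails.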

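A few questions:

  1. How can I log in to the VMs created by BOSH, given that I changed the setting in Ops Mgr to "default BOSH password" for all VMs?
  2. There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings are being applied. Is the location wrong?
  3. Is there a way to change the stemcell used by the OpsMgr VM? Maybe I could try a previous version?
  4. How does agent.json actually get populated?
  5. Any suggestions on how to fix this?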

All logs and JSON follow:

GUI dashboard log:

===== 2018-07-30 08:20:52 UTC Running "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"
Deployment manifest: '/var/tempest/workspaces/default/deployments/bosh.yml'
Deployment state: '/var/tempest/workspaces/default/deployments/bosh-state.json'

Started validating
Validating release 'bosh'... Finished (00:00:00)
Validating release 'bosh-vsphere-cpi'... Finished (00:00:00)
Validating release 'uaa'... Finished (00:00:00)
Validating release 'credhub'... Finished (00:00:01)
Validating release 'bosh-system-metrics-server'... Finished (00:00:01)
Validating release 'os-conf'... Finished (00:00:00)
Validating release 'backup-and-restore-sdk'... Finished (00:00:04)
Validating release 'bpm'... Finished (00:00:02)
Validating cpi release... Finished (00:00:00)
Validating deployment manifest... Finished (00:00:00)
Validating stemcell... Finished (00:00:14)
Finished validating (00:00:26)

Started installing CPI
Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Finished (00:00:00)
Compiling package 'vsphere_cpi/e1a84e5bd82eb1abfe9088a2d547e2cecf6cf315'... Finished (00:00:00)
Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:00)
Installing packages... Finished (00:00:15)
Rendering job templates... Finished (00:00:06)
Installing job 'vsphere_cpi'... Finished (00:00:00)
Finished installing CPI (00:00:23)

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3586.25'... Skipped [Stemcell already uploaded] (00:00:00)

Started deploying
Waiting for the agent on VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Failed (00:00:11)
Deleting VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Finished (00:00:10)
Creating VM for instance 'bosh/0' from stemcell 'sc-536fea79-cfa6-46a9-a53e-9de19505216f'... Finished (00:00:12)
Waiting for the agent on VM 'vm-fb90eee8-f3ac-45b7-95d3-4e8483c91a5c' to be ready... Failed (00:09:59)
Failed deploying (00:10:38)

Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Deploying:
Creating instance 'bosh/0':
    Waiting until instance is ready:
    Post https://vcap:<redacted>@192.168.100.201:6868/agent: dial tcp 192.168.100.201:6868: connect: no route to host

Exit code 1
===== 2018-07-30 08:32:20 UTC Finished "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"; Duration: 688s; Exit Status: 1
Exited with 1.

bosh-state.json

ubuntu@opsmanager-2-2:~$ sudo cat /var/tempest/workspaces/default/deployments/bosh-state.json

{
    "director_id": "851f70ef-7c4b-4c65-73ed-d382ad3df1b7",
    "installation_id": "f29df8af-7141-4aff-5e52-2d109a84cd84",
    "current_vm_cid": "vm-87b3299a-a994-4544-8043-032ce89d685b",
    "current_stemcell_id": "dcca340c-d612-4098-7c90-479193fa9090",
    "current_disk_id": "",
    "current_release_ids": [],
    "current_manifest_sha": "",
    "disks": null,
    "stemcells": [
        {
            "id": "dcca340c-d612-4098-7c90-479193fa9090",
            "name": "bosh-vsphere-esxi-ubuntu-trusty-go_agent",
            "version": "3586.25",
            "cid": "sc-536fea79-cfa6-46a9-a53e-9de19505216f"
        }
    ],
    "releases": []
}

agent.json

ubuntu@opsmanager-2-2:~$ sudo cat /var/vcap/bosh/agent.json
{
"Platform": {
    "Linux": {

    "DevicePathResolutionType": "scsi"
    }
},
"Infrastructure": {
    "Settings": {
    "Sources": [
        {
        "Type": "CDROM",
        "FileName": "env"
        }
    ]
    }
}
}
ubuntu@opsmanager-2-2:~$

And finally, the current BOSH log

/var/vcap/bosh/log/current


2018-07-30_08:42:22.69934 [main] 2018/07/30 08:42:22 DEBUG - Starting agent
2018-07-30_08:42:22.69936 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/agent.json
2018-07-30_08:42:22.69937 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69937 ********************
2018-07-30_08:42:22.69938 {
2018-07-30_08:42:22.69938   "Platform": {
2018-07-30_08:42:22.69939     "Linux": {
2018-07-30_08:42:22.69939
2018-07-30_08:42:22.69939       "DevicePathResolutionType": "scsi"
2018-07-30_08:42:22.69939     }
2018-07-30_08:42:22.69939   },
2018-07-30_08:42:22.69939   "Infrastructure": {
2018-07-30_08:42:22.69940     "Settings": {
2018-07-30_08:42:22.69940       "Sources": [
2018-07-30_08:42:22.69940         {
2018-07-30_08:42:22.69940           "Type": "CDROM",
2018-07-30_08:42:22.69940           "FileName": "env"
2018-07-30_08:42:22.69940         }
2018-07-30_08:42:22.69941       ]
2018-07-30_08:42:22.69941     }
2018-07-30_08:42:22.69941   }
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941
2018-07-30_08:42:22.69941 ********************
2018-07-30_08:42:22.69943 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_version
2018-07-30_08:42:22.69944 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69944 3586.25
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69945 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_git_sha1
2018-07-30_08:42:22.69946 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69946 ********************
2018-07-30_08:42:22.69946 dbbb73800373356315a4c16ee40d2db3189bf2db
2018-07-30_08:42:22.69947 ********************
2018-07-30_08:42:22.69948 [App] 2018/07/30 08:42:22 INFO - Running on stemcell version '3586.25' (git: dbbb73800373356315a4c16ee40d2db3189bf2db)
2018-07-30_08:42:22.69949 [File System] 2018/07/30 08:42:22 DEBUG - Checking if file exists /var/vcap/bosh/agent_state.json
2018-07-30_08:42:22.69950 [File System] 2018/07/30 08:42:22 DEBUG - Stat '/var/vcap/bosh/agent_state.json'
2018-07-30_08:42:22.69951 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Running command 'bosh-agent-rc'
2018-07-30_08:42:22.70116 [unlimitedRetryStrategy] 2018/07/30 08:42:22 DEBUG - Making attempt #0
2018-07-30_08:42:22.70117 [DelayedAuditLogger] 2018/07/30 08:42:22 DEBUG - Starting logging to syslog...
2018-07-30_08:42:22.70181 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stdout:
2018-07-30_08:42:22.70182 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stderr:
2018-07-30_08:42:22.70183 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Successful: true (0)
2018-07-30_08:42:22.70184 [settingsService] 2018/07/30 08:42:22 DEBUG - Loading settings from fetcher
2018-07-30_08:42:22.70185 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - Kicking device, attempt 0 of 5
2018-07-30_08:42:22.70187 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.20204 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 1 of 5
2018-07-30_08:42:23.20206 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.70217 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 2 of 5
2018-07-30_08:42:23.70220 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.20229 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 3 of 5
2018-07-30_08:42:24.20294 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.70249 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 4 of 5
2018-07-30_08:42:24.70253 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20317 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20320 [ConcreteUdevDevice] 2018/07/30 08:42:25 ERROR - Failed to red byte from device: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.20321 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Settling UdevDevice
2018-07-30_08:42:25.20322 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Running command 'udevadm settle'
2018-07-30_08:42:25.20458 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stdout:
2018-07-30_08:42:25.20460 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stderr:
2018-07-30_08:42:25.20461 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Successful: true (0)
2018-07-30_08:42:25.20462 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 0 out of 5
2018-07-30_08:42:25.20463 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20464 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.70473 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 1 out of 5
2018-07-30_08:42:25.70476 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.70477 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.20492 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 2 out of 5
2018-07-30_08:42:26.20496 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.20497 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.70509 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 3 out of 5
2018-07-30_08:42:26.70512 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.70513 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.20530 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ensuring Device Readable, Attempt 4 out of 5
2018-07-30_08:42:27.20533 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.20534 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70554 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.70557 [settingsService] 2018/07/30 08:42:27 ERROR - Failed loading settings via fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70559 [settingsService] 2018/07/30 08:42:27 ERROR - Failed reading settings from file Opening file /var/vcap/bosh/settings.json: open /var/vcap/bosh/settings.json: no such file or directory
2018-07-30_08:42:27.70560 [main] 2018/07/30 08:42:27 ERROR - App setup Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70561 [main] 2018/07/30 08:42:27 ERROR - Agent exited with error: Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.71258 [main] 2018/07/30 08:42:27 DEBUG - Starting agent


<and this whole block just keeps repeating>

Tags: cloud-foundry, cf-bosh

Answer


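How can I log in to the VMs created by BOSH, given that I changed the setting in Ops Mgr to "default BOSH password" for all VMs?

That's not a good idea. The default password is well known, and you should almost always use randomly generated passwords. Honestly, I'm not sure why that is even an option. The only thing that comes to mind is some extremely rare troubleshooting scenario.

That said, if you need manual access to a VM, you can securely obtain the randomly generated password through Ops Manager. You can also securely access VMs via bosh ssh, which handles the credentials for you automatically. Even for troubleshooting, you generally should not need that option.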

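There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings are being applied. Is the location wrong?

The location is correct, but the file contains sensitive information, so Ops Manager deletes it as soon as it is done using it.

If you want to see the contents of the file, the easy way is to navigate to https://ops-man-fqdn/debug/files, where you can see all of the configuration files, including your bosh.yml. The hard way is to watch the folder above while a deployment is in progress; the file exists there for a short time, and you can make a copy during that window. The only advantage of the hard way is that you get the actual file, whereas the debug endpoint shows a file with the sensitive information redacted.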
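
The "hard way" above could be automated along these lines. This is only a rough sketch: the function name capture_manifest, the polling interval, and the destination path are assumptions made for illustration, and the script would need enough privileges to read the deployments folder (the question used sudo). The copy will contain unredacted secrets, so treat it accordingly.

```python
import shutil
import time
from pathlib import Path


def capture_manifest(deploy_dir, dest, timeout=600.0, interval=0.5):
    """Poll deploy_dir for bosh.yml and copy it to dest once it appears.

    Returns True if a copy was made, False if the timeout expired first.
    The directory that dest lives in must already exist.
    """
    source = Path(deploy_dir) / "bosh.yml"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Copy directly instead of checking existence first, so the file
            # cannot disappear between the check and the copy.
            shutil.copy2(source, dest)
            return True
        except FileNotFoundError:
            # Manifest not there yet (or dest directory missing): retry.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Directory taken from the question; the destination is a hypothetical choice.
    copied = capture_manifest(
        "/var/tempest/workspaces/default/deployments",
        "/tmp/bosh.yml.copy",
    )
    print("copied" if copied else "bosh.yml never appeared")
```

Start it just before clicking Apply Changes so that it is already polling when Ops Manager writes the file.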

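Is there a way to change the stemcell used by the OpsMgr VM? Maybe I could try a previous version?

I don't think this is a stemcell problem. Plenty of people use this stemcell without hitting this issue. If a larger problem like this were found in a stemcell, you would see a notice on Pivotal Network, and Pivotal would publish a new, fixed stemcell.

The problem seems to be more with how the VM receives its initial bootstrap configuration. I'd suggest looking into that further before messing with stemcells. See below.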

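How does agent.json actually get populated?

Believe it or not, in a vSphere environment the agent's settings are read from a fake CD-ROM that is attached to the VM (agent.json itself only names CDROM as the settings source, as your copy shows). There is not a lot of documentation, but it is mentioned briefly in the BOSH docs here.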

https://bosh.io/docs/cpi-api-v1-method/create-vm/#agent-settings
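
To confirm which settings sources the agent will try, the agent.json from the question can be parsed directly. A minimal sketch with the file contents inlined from the question; the helper name settings_sources is made up for illustration and is not part of BOSH.

```python
import json

# Contents copied from the agent.json shown in the question.
AGENT_JSON = """
{
  "Platform": {"Linux": {"DevicePathResolutionType": "scsi"}},
  "Infrastructure": {
    "Settings": {
      "Sources": [{"Type": "CDROM", "FileName": "env"}]
    }
  }
}
"""


def settings_sources(agent_json_text):
    """Return (type, file name) pairs for each configured settings source."""
    config = json.loads(agent_json_text)
    sources = (
        config.get("Infrastructure", {})
        .get("Settings", {})
        .get("Sources", [])
    )
    return [(s.get("Type"), s.get("FileName")) for s in sources]


if __name__ == "__main__":
    for source_type, file_name in settings_sources(AGENT_JSON):
        print(f"source={source_type} file={file_name}")
```

For the file in the question this reports a single CDROM source with file name env, which matches the agent repeatedly trying to read /dev/sr0 in the log.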

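Any suggestions on how to fix this?

Look into why the CD-ROM cannot be mounted. BOSH needs it to get its bootstrap configuration, so you need to get it working. If something in your vSphere environment is preventing the CD-ROM from being mounted, you will need to change it so that mounting is allowed.

If there is nothing on the vSphere side, I think the next step is to check the standard system logs under /var/log and the dmesg output for any errors or clues as to why the CD-ROM cannot be loaded or read.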
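
When scanning logs such as the agent log in the question, it can help to reduce the repeating block to the paths that are actually reported missing. A small sketch, assuming the log format shown in the question; the regular expression and the function name missing_paths are illustrative choices, and the sample lines are copied from the question's log.

```python
import re
from collections import Counter

# Matches ERROR lines whose innermost cause is a failed open() on some path.
MISSING_PATH = re.compile(r"ERROR - .*open (\S+): no such file or directory")

SAMPLE_LOG = """\
2018-07-30_08:42:25.20464 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70557 [settingsService] 2018/07/30 08:42:27 ERROR - Failed loading settings via fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70559 [settingsService] 2018/07/30 08:42:27 ERROR - Failed reading settings from file Opening file /var/vcap/bosh/settings.json: open /var/vcap/bosh/settings.json: no such file or directory
"""


def missing_paths(log_text):
    """Count, per path, the ERROR lines that end in a failed open()."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MISSING_PATH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for path, count in missing_paths(SAMPLE_LOG).items():
        print(f"{count} ERROR line(s): cannot open {path}")
```

DEBUG lines are deliberately ignored, so only the two ERROR entries in the sample are counted.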

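Finally, try some manual tests to mount and read from the CD-ROM. Start by looking at one of the BOSH-deployed VMs in the vSphere client: check the hardware settings and make sure a CD-ROM is attached. It should point to a file called env.iso that sits in the same folder as the VM on your datastore. If it is attached and connected, boot the VM and try to mount the CD-ROM. You should be able to see the BOSH configuration files on that drive.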
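
Before attempting a manual mount, a quick check on the VM can tell whether the guest sees a CD-ROM device at all. A hedged sketch: /dev/sr0 comes from the log in the question, while the function name describe_device and the /mnt/cdrom mount point are arbitrary choices for illustration.

```python
import os
import stat


def describe_device(path="/dev/sr0"):
    """Classify path as 'missing', 'block device', or 'not a block device'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISBLK(mode):
        return "block device"
    return "not a block device"


if __name__ == "__main__":
    state = describe_device()
    print(f"/dev/sr0: {state}")
    if state == "block device":
        # Mounting needs root; /mnt/cdrom is a hypothetical mount point.
        print("try: sudo mkdir -p /mnt/cdrom && "
              "sudo mount -o ro /dev/sr0 /mnt/cdrom && ls /mnt/cdrom")
    else:
        print("the guest does not see a CD-ROM; check the VM hardware in vSphere")
```

A result of "missing" matches the "open /dev/sr0: no such file or directory" errors in the agent log and points back at the VM's hardware configuration rather than at the agent.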

Hope that helps!

