Azure ARM - Custom Script Extension - random failures

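Problem description

I developed an Azure ARM template that deploys an Ubuntu Linux machine; once it is provisioned, a bash script runs to install specific software. The installation involves downloading some packages and taking an input parameter from the user to complete the configuration. The problem I am facing is that the script extension works only intermittently. I deployed successfully once, and now it fails every time. This is the error returned a few seconds after the custom script starts executing: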

    {
  "code": "DeploymentFailed",
  "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
  "details": [
    {
      "code": "Conflict",
      "message": "{\r\n  \"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"VMExtensionProvisioningError\",\r\n        \"message\": \"VM has reported a failure when processing extension 'metaport-onboard'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=1\\n[stdout]\\nReading package lists...\\nBuilding dependency tree...\\nReading state information...\\nsoftware-properties-common is already the newest version (0.96.24.32.14).\\nsoftware-properties-common set to manually installed.\\nThe following packages were automatically installed and are no longer required:\\n  grub-pc-bin linux-headers-4.15.0-121\\nUse 'sudo apt autoremove' to remove them.\\n0 upgraded, 0 newly installed, 0 to remove and 18 not upgraded.\\nReading package lists...\\nBuilding dependency tree...\\nReading state information...\\nSome packages could not be installed. This may mean that you have\\nrequested an impossible situation or if you are using the unstable\\ndistribution that some required packages have not yet been created\\nor been moved out of Incoming.\\nThe following information may help to resolve the situation:\\n\\nThe following packages have unmet dependencies:\\n python3-pip : Depends: python3-distutils but it is not installable\\n               Recommends: build-essential but it is not installable\\n               Recommends: python3-dev (>= 3.2) but it is not installable\\n               Recommends: python3-setuptools but it is not installable\\n               Recommends: python3-wheel but it is not installable\\n\\n[stderr]\\n+ sudo apt-get -qq -y update\\n+ sudo apt-get -q -y install software-properties-common\\n+ sudo apt-get -q -y install python3-pip\\nE: Unable to correct problems, you have held broken packages.\\nNo passwd entry for user 'mpadmin'\\n\\\"\\r\\n\\r\\nMore information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot \"\r\n      }\r\n    ]\r\n  }\r\n}"
    }
  ]
}

Below is the part of the template where I define the extension:

    {
  "type": "Microsoft.Compute/virtualMachines",
  "name": "[variables('vmName')]",
  "apiVersion": "2019-12-01",
  "location": "[variables('location')]",
  "dependsOn": [
    "[resourceId('Microsoft.Network/networkInterfaces/', variables('nicName'))]",
    "[resourceId('Microsoft.Network/virtualNetworks', parameters('virtualNetworkName'))]",
    "[resourceId('Microsoft.Network/natGateways', variables('natGatewayName'))]",
    "[resourceId('Microsoft.Network/networkSecurityGroups', variables('networkSecurityGroupName'))]"
  ],
  "properties": {
    "hardwareProfile": {
      "vmSize": "[parameters('virtualMachineSize')]"
    },
    "osProfile": {
      "computerName": "[variables('vmName')]",
      "adminUsername": "[parameters('adminUsername')]",
      "adminPassword": "[parameters('adminPasswordOrKey')]",
      "linuxConfiguration": "[if(equals(parameters('authenticationType'), 'password'), json('null'), variables('linuxConfiguration'))]"
    },
    "storageProfile": {
      "imageReference": {
        "publisher": "[variables('imagePublisher')]",
        "offer": "[variables('imageOffer')]",
        "sku": "[variables('imageSKU')]",
        "version": "[variables('imageVersion')]"
      },
      "osDisk": {
        "name": "[concat(variables('vmName'), '_OSDisk')]",
        "caching": "ReadWrite",
        "createOption": "FromImage",
        "managedDisk": {
          "storageAccountType": "[variables('storageAccountType')]"
        }
      }
    },
      "networkProfile": {
        "networkInterfaces": [
          {
            "id": "[resourceId('Microsoft.Network/networkInterfaces',variables('nicName'))]"
          }
        ]
      }
    },
    "resources": [
          {
          "name": "metaport-onboard",
          "type": "extensions",
          "apiVersion": "2019-03-01",
          "location": "[resourceGroup().location]",
          "dependsOn": [
            "[resourceId('Microsoft.Compute/virtualMachines/', variables('vmName'))]",
            "[resourceId('Microsoft.Network/networkInterfaces',variables('nicName'))]",
            "[resourceId('Microsoft.Network/virtualNetworks', parameters('virtualNetworkName'))]",
            "[resourceId('Microsoft.Network/natGateways', variables('natGatewayName'))]",
            "[resourceId('Microsoft.Network/networkSecurityGroups', variables('networkSecurityGroupName'))]"
          ],
          "properties": {
            "publisher": "Microsoft.Azure.Extensions",
            "type": "CustomScript",
            "typeHandlerVersion": "2.1",
            "autoUpgradeMinorVersion": true,
            "settings": {
              "fileUris": [
                "https://raw.githubusercontent.com/willguibr/azure/main/Latest/MetaPort-Standalone-NATGW-v1.0/install_metaport.sh"
                ]
              },
            "protectedSettings": {
              "commandToExecute": "[concat('sh install_metaport.sh ', parameters('metaTokenCode'))]"
              }
            }
          }
        ]
      }
    ]
  }

The complete template package is here.

Does anyone know how to prevent this problem, or what corrections might be needed?

Tags: azure, azure-devops, azure-devops-extensions

Solution


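Well, the error says it clearly: the script exited with code 1, which means the script itself failed. The captured stderr already points at two problems: the apt-get install of python3-pip stopped on unmet dependencies, and the user 'mpadmin' does not exist on the machine. To dig further, log in to the VM and look at the extension logs. Since this is a Linux VM, they should be under /var/log/azure/custom-script/ (handler.log), with the script's stdout and stderr under /var/lib/waagent/custom-script/download/0/, rather than under c:\windowsazure\logs, which is the Windows location. Find out which step fails and wrap it in error handling (the shell equivalent of try\catch). Also consider propagating errors to the console so that you can actually see them in the logs.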
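As a rough illustration of wrapping the failing steps and surfacing their errors, the install script could define small helpers like the ones below. This is only a sketch: the function names log and retry, the attempt counts, and the delays are assumptions for illustration and are not taken from the original install_metaport.sh. It is written in POSIX sh because the template launches the script with sh. Retrying helps only with transient failures (for example, an apt lock held during first boot); it will not fix a package that is really not installable.

```shell
#!/bin/sh
# Hypothetical helpers for an install script run by the Custom Script Extension.

# Print a timestamped message to stderr so it shows up in the extension's
# captured [stderr] output.
log() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >&2
}

# retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
# Returns 0 on success, otherwise the exit status of the last attempt.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while :; do
    "$@" && return 0
    retry_status=$?
    log "attempt ${retry_n}/${retry_attempts} failed (exit ${retry_status}): $*"
    [ "$retry_n" -ge "$retry_attempts" ] && return "$retry_status"
    retry_n=$((retry_n + 1))
    sleep "$retry_delay"
  done
}

# Example use inside the install script (commented out so that this sketch
# has no side effects when sourced):
# retry 5 15 sudo apt-get -q -y update \
#   || { log "apt-get update kept failing"; exit 1; }
# retry 5 15 sudo apt-get -q -y install python3-pip \
#   || { log "python3-pip could not be installed"; exit 1; }
```

Each failed attempt is logged with its exit status, and the script exits with a non-zero code and a clear message at the first step that cannot be completed, so the deployment error names the step that failed.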

