Azure Data Factory webhook execution times out instead of relaying errors

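Problem description

I'm trying to set up a simple webhook activity in Azure Data Factory (v2) that calls a simple (parameterless) webhook for an Azure Automation runbook I've set up.

In the Azure portal I can see that the webhook is being executed and my runbook is running, so far so good. The runbook (currently) returns an error within 1 minute of starting, but that's fine: I also want to test the failure scenario.

The problem: Data Factory doesn't seem to "see" the error result and spins until the timeout (10 minutes) elapses. When I start a debug run of the pipeline I get the same result: a timeout and no error result.

Update: I've since fixed the runbook so that it completes successfully, but Data Factory still times out and doesn't see the success response either.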

Here is a screenshot of the setup:

[screenshot not reproduced]

And here is the portal confirming that the webhook is being run by Azure Data Factory and completes in under a minute:

[screenshot not reproduced]

The WEBHOOKDATA JSON is:

{"WebhookName":"Start CAMS VM","RequestBody":"{\r\n \"callBackUri\": \"https://dpeastus.svc.datafactory.azure.com/dataplane/workflow/callback/f7c...df2?callbackUrl=AAEAFF...0927&shouldReportToMonitoring=True&activityType=WebHook\"\r\n}","RequestHeader":{"Connection":"Keep-Alive","Expect":"100-continue","Host":"eab...ddc.webhook.eus2.azure-automation.net","x-ms-request-id":"7b4...2eb"}}

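So as far as I can tell, everything should be in place for ADF to pick up the result (success or failure). I'm hoping someone who has done this before knows what I'm missing.

Thanks!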

Tags: azure, webhooks, azure-data-factory-2, azure-automation, azure-runbook

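Solution


I had assumed that Azure would automatically notify ADF's "callBackUri" of the result once the runbook completed or failed (since they take care of 99% of the scaffolding without requiring a single line of code).

It turns out that's not the case: anyone who wants to run a runbook from ADF has to manually extract the callBackUri from the WebhookData input parameter and POST the result to it when the runbook is done.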
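As a minimal sketch of the extraction step, assuming the WebhookData has the shape shown in the question (a `RequestBody` string that itself contains JSON): `Get-AdfCallbackUri` is a hypothetical helper name, and the sample object and URL are placeholders, not real values.

```powershell
# Sketch: pull ADF's callback URL out of the WebhookData a runbook receives.
# Get-AdfCallbackUri is a hypothetical helper name, not an Azure cmdlet.
function Get-AdfCallbackUri {
    param([Parameter(Mandatory = $true)] [object] $WebhookData)

    # RequestBody arrives as a JSON *string*, so it needs its own parse step
    $body = ConvertFrom-Json -InputObject $WebhookData.RequestBody
    if (-not $body.callBackUri) {
        throw 'RequestBody has no "callBackUri" property'
    }
    return $body.callBackUri
}

# Stand-in for what Azure Automation passes in; the URL is a placeholder
$sample = [pscustomobject]@{
    WebhookName = 'Example'
    RequestBody = '{ "callBackUri": "https://example.invalid/callback" }'
}
$uri = Get-AdfCallbackUri -WebhookData $sample
Write-Output $uri
```

Once the runbook's real work is done, the result would be POSTed to that URI as JSON, for example with `Invoke-WebRequest -Uri $uri -UseBasicParsing -Method POST -ContentType "application/json" -Body $json`, where `$json` holds the status body.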

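I haven't nailed this down yet, because the Microsoft tutorial sites I've found have a bad habit of showing screenshots of the code that does this rather than providing the code itself:

[screenshot not reproduced]

I'll come back and edit this once I've figured it out.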


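Edit: I ended up implementing this by leaving my original webhook untouched and creating a "wrapper"/helper/utility runbook that executes an arbitrary webhook and relays its status to ADF once it completes.

Here is the complete code I ended up with, in case it helps someone else. It's generic:

Setup / helper functions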

param
(
    [Parameter (Mandatory = $false)]
    [object] $WebhookData
)

Import-Module -Name AzureRM.resources
Import-Module -Name AzureRM.automation

# Helper function for getting the current running Automation Account Job
# Inspired heavily by: https://github.com/azureautomation/runbooks/blob/master/Utility/ARM/Find-WhoAmI
<#
    Queries the automation accounts in the subscription to find the automation account, runbook and resource group that the job is running in.
    AUTHOR: Azure/OMS Automation Team
#>
Function Find-WhoAmI {
    [CmdletBinding()]
    Param()
    Begin { Write-Verbose ("Entering {0}." -f $MyInvocation.MyCommand) }
    Process {
        # Authenticate
        $ServicePrincipalConnection = Get-AutomationConnection -Name "AzureRunAsConnection"
        Add-AzureRmAccount `
            -ServicePrincipal `
            -TenantId $ServicePrincipalConnection.TenantId `
            -ApplicationId $ServicePrincipalConnection.ApplicationId `
            -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint | Write-Verbose
        Select-AzureRmSubscription -SubscriptionId $ServicePrincipalConnection.SubscriptionID | Write-Verbose 
        # Search all accessible automation accounts for the current job
        $AutomationResource = Get-AzureRmResource -ResourceType Microsoft.Automation/AutomationAccounts
        $SelfId = $PSPrivateMetadata.JobId.Guid
        foreach ($Automation in $AutomationResource) {
            $Job = Get-AzureRmAutomationJob -ResourceGroupName $Automation.ResourceGroupName -AutomationAccountName $Automation.Name -Id $SelfId -ErrorAction SilentlyContinue
            if (!([string]::IsNullOrEmpty($Job))) {
                return $Job
            }
        }
        # Only report failure after every automation account has been checked
        Write-Error "Could not find the current running job with id $SelfId"
    }
    End { Write-Verbose ("Exiting {0}." -f $MyInvocation.MyCommand) }
}

Function Get-TimeStamp {    
    return "[{0:yyyy-MM-dd} {0:HH:mm:ss}]" -f (Get-Date)    
}

My code


### EXPECTED USAGE ###
# 1. Set up a webhook invocation in Azure data factory with a link to this Runbook's webhook
# 2. In ADF - ensure the body contains { "WrappedWebhook": "<your url here>" }
#    This should be the URL for another webhook.
# LIMITATIONS:
# - Currently, relaying parameters and authentication credentials is not supported,
#    so the wrapped webhook should require no additional authentication or parameters.
# - Currently, the callback to Azure data factory does not support authentication,
#    so ensure ADF is configured to require no authentication for its callback URL (the default behaviour)

# If ADF executed this runbook via Webhook, it should have provided a WebhookData with a request body.
if (-Not $WebhookData) {
    Write-Error "Runbook was not invoked with WebhookData. Args were: $args"
    exit 0
}
if (-Not $WebhookData.RequestBody) {
    Write-Error "WebhookData did not contain a ""RequestBody"" property. Data was: $WebhookData"
    exit 0
}
$parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
# And this data should contain a JSON body containing a 'callBackUri' property.
if (-Not $parameters.callBackUri) {
    Write-Error 'WebhookData was missing the expected "callBackUri" property (which Azure Data Factory should provide automatically)'
    exit 0
}
$callbackuri = $parameters.callBackUri

# Check for the "WRAPPEDWEBHOOK" parameter (which should be set up by the user in ADF)
$WrappedWebhook = $parameters.WRAPPEDWEBHOOK
if (-Not $WrappedWebhook) {
    $ErrorMessage = 'WebhookData was missing the expected "WRAPPEDWEBHOOK" property (which the user should have added to the body via ADF)'
    Write-Error $ErrorMessage
}
else
{
    # Now invoke the actual runbook desired
    Write-Output "$(Get-TimeStamp) Invoking Webhook Request at: $WrappedWebhook"
    try {    
        $OutputMessage = Invoke-WebRequest -Uri $WrappedWebhook -UseBasicParsing -Method POST
    } catch {
        $ErrorMessage = ("An error occurred while executing the wrapped webhook $WrappedWebhook - " + $_.Exception.Message)
        Write-Error -Exception $_.Exception
    }
    # Output should be something like: {"JobIds":["<JobId>"]}
    Write-Output "$(Get-TimeStamp) Response: $OutputMessage"    
    $JobList = (ConvertFrom-Json -InputObject $OutputMessage).JobIds
    $JobId = $JobList[0]
    $OutputMessage = "JobId: $JobId"         

    # Get details about the currently running job, and assume the webhook job is being run in the same resourcegroup/account
    $Self = Find-WhoAmI
    Write-Output "Current Job '$($Self.JobId)' is running in Group '$($Self.ResourceGroupName)' and Automation Account '$($Self.AutomationAccountName)'"
    Write-Output "Checking for Job '$($JobId)' in same Group and Automation Account..."

    # Monitor the job status, wait for completion.
    # Check against a list of statuses that likely indicate an in-progress job
    $InProgressStatuses = ('New', 'Queued', 'Activating', 'Starting', 'Running', 'Stopping')
    # (from https://docs.microsoft.com/en-us/powershell/module/az.automation/get-azautomationjob?view=azps-4.1.0&viewFallbackFrom=azps-3.7.0)  
    do {
        # 1 second between polling attempts so we don't get throttled
        Start-Sleep -Seconds 1
        try { 
            $Job = Get-AzureRmAutomationJob -Id $JobId -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
        } catch {
            $ErrorMessage = ("An error occurred polling the job $JobId for completion - " + $_.Exception.Message)
            Write-Error -Exception $_.Exception
        }
        Write-Output "$(Get-TimeStamp) Polled job $JobId - current status: $($Job.Status)"
    } while ($InProgressStatuses.Contains($Job.Status))

    # Get the job outputs to relay to Azure Data Factory
    $Outputs = Get-AzureRmAutomationJobOutput -Id $JobId -Stream "Any" -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
    Write-Output "$(Get-TimeStamp) Outputs from job: $($Outputs | ConvertTo-Json -Compress)"
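    $OutputMessage = $Outputs.Summary
    Write-Output "Summary output message: $($OutputMessage)"
    # Relay anything other than a clean completion to ADF as an error
    if ($Job.Status -ne 'Completed' -and -not $ErrorMessage) {
        $ErrorMessage = "Job $JobId finished with status '$($Job.Status)': $OutputMessage"
    }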
}

# Now for the entire purpose of this runbook - relay the response to the callback uri.
# Prepare the success or error response as per specifications at https://docs.microsoft.com/en-us/azure/data-factory/control-flow-webhook-activity#additional-notes
# ConvertTo-Json escapes quotes and newlines in the messages, so the body stays valid JSON
if ($ErrorMessage) {
    $OutputJson = @{
        output     = @{ message = "$ErrorMessage" }
        statusCode = 500
        error      = @{
            ErrorCode = "Error"
            Message   = "$ErrorMessage"
        }
    } | ConvertTo-Json -Depth 3
} else {
    $OutputJson = @{
        output     = @{ message = "$OutputMessage" }
        statusCode = 200
    } | ConvertTo-Json -Depth 3
}
Write-Output "Prepared ADF callback body: $OutputJson"
# Post the response to the callback URL provided
$callbackResponse = Invoke-WebRequest -Uri $callbackuri -UseBasicParsing -Method POST -ContentType "application/json" -Body $OutputJson

Write-Output "Response was relayed to $callbackuri"
Write-Output ("ADF replied with the response: " + ($callbackResponse | ConvertTo-Json -Compress))

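At a high level, the steps I took were:

  1. Execute the "main" webhook and get back a "job ID".
  2. Get the "context" of the currently running job (resource group and automation account information) so that I can poll the remote job.
  3. Poll the job until it completes.
  4. Compose a "success" or "error" response message in the format Azure Data Factory expects.
  5. Invoke the ADF callback.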
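The polling in step 3 has no upper bound, so a job that never leaves an in-progress state would keep the wrapper running past ADF's timeout. Below is a rough sketch of a bounded variant. It assumes the same list of in-progress statuses as the runbook above. `Wait-JobFinished`, its `-GetStatus` parameter, the 540-second default and the `'TimedOut'` return value are all hypothetical, chosen only for illustration. The status lookup is passed in as a script block so that the loop can be tried without an Azure connection.

```powershell
# Sketch: poll until a job leaves the in-progress states or a deadline passes.
# Wait-JobFinished and its -GetStatus parameter are hypothetical names.
function Wait-JobFinished {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $GetStatus,
        [int] $TimeoutSeconds = 540,   # assumption: stay under ADF's 10-minute timeout
        [int] $PollSeconds = 1
    )
    $inProgress = 'New', 'Queued', 'Activating', 'Starting', 'Running', 'Stopping'
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $status = & $GetStatus
        if ($inProgress -notcontains $status) { return $status }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)
    return 'TimedOut'   # not an Azure status; marks that the deadline passed
}

# Fake status source standing in for Get-AzureRmAutomationJob
$script:calls = 0
$fake = { $script:calls++; if ($script:calls -lt 3) { 'Running' } else { 'Completed' } }
$result = Wait-JobFinished -GetStatus $fake -PollSeconds 0
Write-Output $result
```

In the runbook, the script block would wrap the real lookup, for example `{ (Get-AzureRmAutomationJob -Id $JobId -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName).Status }`, and a `'TimedOut'` result would be relayed to ADF as an error.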
