Data Pipeline validation error when adding an EMR configuration to the pipeline object list in a CloudFormation template

Problem description

When I add a "configuration" object to the data pipeline object list, I get this error message:

Pipeline Definition failed to validate because of following Errors:
[{ObjectId = 'SampleEMRCluster', errors = [Fields with references 
to scheduable objects or preconditions can not be added to existing objects.
Found 'configuration']}]

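Before I added it, synth and deploy worked fine, and the data pipeline itself ran fine too. Here is the relevant part of the synthesized CloudFormation template: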

        "PipelineObjects": [
      {
        "Fields": [
          {
            "Key": "type",
            "StringValue": "Default"
          },
          {
            "Key": "maxActiveInstances",
            "StringValue": "1"
          },
          {
            "Key": "scheduleType",
            "StringValue": "cron"
          },
          {
            "Key": "pipelineLogUri",
            "StringValue": {
              "Fn::Join": [
                "",
                [
                  "s3://",
                  {
                    "Ref": "sampleprodnaA928775C"
                  },
                  "/data-pipeline-logs/"
                ]
              ]
            }
          },
          {
            "Key": "role",
            "StringValue": {
              "Ref": "DPRoleprodna120283D1"
            }
          },
          {
            "Key": "resourceRole",
            "StringValue": {
              "Ref": "DPResourceRoleprodna6634AAB4"
            }
          },
          {
            "Key": "failureAndRerunMode",
            "StringValue": "CASCADE"
          },
          {
            "Key": "schedule",
            "RefValue": "DefaultSchedule"
          }
        ],
        "Id": "Default",
        "Name": "Default"
      },
      {
        "Fields": [
          {
            "Key": "type",
            "StringValue": "Schedule"
          },
          {
            "Key": "startAt",
            "StringValue": "FIRST_ACTIVATION_DATE_TIME"
          },
          {
            "Key": "period",
            "StringValue": "1 hour"
          }
        ],
        "Id": "DefaultSchedule",
        "Name": "Every 1 hour"
      },
      {
        "Fields": [
          {
            "Key": "type",
            "StringValue": "EmrCluster"
          },
          {
            "Key": "coreInstanceType",
            "StringValue": "i3.xlarge"
          },
          {
            "Key": "coreInstanceCount",
            "StringValue": "1"
          },
          {
            "Key": "masterInstanceType",
            "StringValue": "i3.xlarge"
          },
          {
            "Key": "terminateAfter",
            "StringValue": "1 hour"
          },
          {
            "Key": "resourceRole",
            "StringValue": "EMR_EC2_DefaultRole"
          },
          {
            "Key": "role",
            "StringValue": "EMR_DefaultRole"
          },
          {
            "Key": "subnetId",
            "StringValue": {
              "Ref": "VpcPrivateSubnet1Subnet536B997F"
            }
          },
          {
            "Key": "emrManagedMasterSecurityGroupId",
            "StringValue": {
              "Ref": "EMRControllerC4OFF237"
            }
          },
          {
            "Key": "emrManagedSlaveSecurityGroupId",
            "StringValue": {
              "Ref": "EMRWorkerE1C2639A"
            }
          },
          {
            "Key": "serviceAccessSecurityGroupId",
            "StringValue": {
              "Ref": "EMRServiceAccessB1B4D1B5"
            }
          },
          {
            "Key": "releaseLabel",
            "StringValue": "emr-5.30.0"
          },
          {
            "Key": "configuration",
            "RefValue": "SparkConfiguration"
          }
        ],
        "Id": "SampleEMRCluster",
        "Name": "SampleEMRCluster"
      },
      {
        "Fields": [
          {
            "Key": "type",
            "StringValue": "EmrConfiguration"
          },
          {
            "Key": "classification",
            "StringValue": "spark"
          },
          {
            "Key": "property",
            "RefValue": "sparkProperty01"
          }
        ],
        "Id": "SparkConfiguration",
        "Name": "SparkConfiguration"
      },
      {
        "Fields": [
          {
            "Key": "type",
            "StringValue": "Property"
          },
          {
            "Key": "key",
            "StringValue": "maximizeResourceAllocation"
          },
          {
            "Key": "value",
            "StringValue": "true"
          }
        ],
        "Id": "sparkProperty01",
        "Name": "sparkHiveSiteProperty01"
      },
      ...//other pipeline objects
]

Can someone help me understand what is wrong with the template?

Tags: amazon-cloudformation, aws-cdk, amazon-data-pipeline

Solution


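Once a data pipeline has been created in AWS, some of its fields can no longer be edited (adding a configuration and changing EMR step dependencies appear to be among them). So the template itself is fine; the error comes from updating a pipeline that already exists. Manually deleting the stack in the UI and deploying again worked. Some documentation on which fields cannot be edited: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-manage-pipeline-modify-console.html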
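To spot this situation before deploying, it may help to compare the previously deployed `PipelineObjects` list with the new one. The sketch below is only a heuristic based on the wording of the error message: it flags `RefValue` fields that are new on objects which already existed. The function name `added_reference_fields` and the trimmed-down sample data are made up for illustration, and the real service-side rule may be narrower or broader than this check.

```python
def added_reference_fields(old_objects, new_objects):
    """Return {object_id: [field keys]} for RefValue fields that appear in
    new_objects on an object that already existed in old_objects.

    Heuristic only: it mirrors the error text ("fields with references ...
    can not be added to existing objects"), not the service's actual logic.
    """
    old_by_id = {obj["Id"]: obj for obj in old_objects}
    findings = {}
    for obj in new_objects:
        previous = old_by_id.get(obj["Id"])
        if previous is None:
            # Assumption: objects that did not exist before are not affected.
            continue
        old_ref_keys = {f["Key"] for f in previous["Fields"] if "RefValue" in f}
        added = [
            f["Key"]
            for f in obj["Fields"]
            if "RefValue" in f and f["Key"] not in old_ref_keys
        ]
        if added:
            findings[obj["Id"]] = added
    return findings


# Trimmed-down versions of the objects from the question (illustrative only).
deployed = [
    {
        "Id": "SampleEMRCluster",
        "Name": "SampleEMRCluster",
        "Fields": [
            {"Key": "type", "StringValue": "EmrCluster"},
            {"Key": "releaseLabel", "StringValue": "emr-5.30.0"},
        ],
    },
]

updated = [
    {
        "Id": "SampleEMRCluster",
        "Name": "SampleEMRCluster",
        "Fields": [
            {"Key": "type", "StringValue": "EmrCluster"},
            {"Key": "releaseLabel", "StringValue": "emr-5.30.0"},
            {"Key": "configuration", "RefValue": "SparkConfiguration"},
        ],
    },
    {
        "Id": "SparkConfiguration",
        "Name": "SparkConfiguration",
        "Fields": [
            {"Key": "type", "StringValue": "EmrConfiguration"},
            {"Key": "classification", "StringValue": "spark"},
            {"Key": "property", "RefValue": "sparkProperty01"},
        ],
    },
]

print(added_reference_fields(deployed, updated))
# prints {'SampleEMRCluster': ['configuration']}
```

A non-empty result suggests that an in-place update is likely to be rejected, and that the pipeline (or the whole stack) would need to be deleted and recreated, as described above.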

