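amazon-web-services - AWS CloudWatch alarm: simplifying an alarm so it applies to an entire Glue job cluster
Question
I am trying to create a CloudFormation template that creates CloudWatch alarms based on AWS Glue metrics. At the moment the code is very long, because to check the job's CPU I have to add a metric for every worker the Glue job uses.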
CpuLoadAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: !Ref CampaignCpuLoadAlarmName
    ActionsEnabled: true
    AlarmActions:
      - !Ref AssistGlueJobsMonitoringSNSTopic
    EvaluationPeriods: !Ref AlarmEvaluationPeriod
    DatapointsToAlarm: !Ref AlarmDatapointsToAlarm
    Threshold: !Ref AlarmThreshold
    ComparisonOperator: GreaterThanOrEqualToThreshold
    TreatMissingData: missing
    Metrics:
      - Id: e1
        Label: Expression1
        ReturnData: true
        Expression: !Ref AlarmExpression
      - Id: m1
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.1.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m2
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.2.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m3
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.3.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m4
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.4.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m5
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.5.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m6
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.6.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m7
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.7.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m8
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.8.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m9
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.9.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      - Id: m10
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.driver.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
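Is it possible to simplify this so that the threshold applies to the whole Glue job cluster, without having to list the driver and every worker individually?
Solution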
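One possible simplification, offered as a sketch under stated assumptions rather than a confirmed answer: besides the per-executor metrics, AWS Glue publishes an aggregate metric named glue.ALL.system.cpuSystemLoad, which reports the average CPU system load across all executors. Assuming job metrics are enabled for the job and that this aggregate is emitted with the same dimensions used in the question (Type, JobRunId, JobName), the nine per-worker entries could collapse into one. The driver is kept as a separate metric here because the aggregate is documented as covering executors. The literal MAX expression below is an assumption that stands in for the question's AlarmExpression parameter, whose contents are not shown.

```yaml
CpuLoadAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: !Ref CampaignCpuLoadAlarmName
    ActionsEnabled: true
    AlarmActions:
      - !Ref AssistGlueJobsMonitoringSNSTopic
    EvaluationPeriods: !Ref AlarmEvaluationPeriod
    DatapointsToAlarm: !Ref AlarmDatapointsToAlarm
    Threshold: !Ref AlarmThreshold
    ComparisonOperator: GreaterThanOrEqualToThreshold
    TreatMissingData: missing
    Metrics:
      # Assumed expression: alarm on whichever is higher, the executor
      # average or the driver. Replace with your own AlarmExpression
      # if it should combine the two series differently.
      - Id: e1
        Label: MaxCpuSystemLoad
        ReturnData: true
        Expression: MAX([m1, m2])
      # Aggregate over all executors (replaces glue.1 ... glue.9).
      - Id: m1
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.ALL.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
      # Driver, unchanged from the original template.
      - Id: m2
        ReturnData: false
        MetricStat:
          Metric:
            Namespace: Glue
            MetricName: glue.driver.system.cpuSystemLoad
            Dimensions:
              - Name: Type
                Value: gauge
              - Name: JobRunId
                Value: ALL
              - Name: JobName
                Value: !Ref JobName
          Period: !Ref AlarmPeriod
          Stat: Average
```

Because the aggregate is an average, a single overloaded worker can be masked by idle ones. If per-worker spikes matter, the longer template from the question may still be needed. It is worth confirming in the CloudWatch console that glue.ALL.system.cpuSystemLoad actually appears for the job before relying on this alarm.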