apache-kafka - Flink not forwarding Kafka metrics when parallelism greater than 1
问题描述
I have a Flink job which reads from Kafka (v0.9) and writes to Redis. I want to monitor the records-consumed-rate
and records-lag-max
metrics emitted by Kafka which Flink should be able to forward. In this case, I am forwarding to Datadog.
When I start the job with a parallelism of 1, I see this metric emitted just fine. However, if I make the parallelism greater than 1, this metric is no longer forwarded. The job is running when parallelism > 1 because I can see the entries being written to Redis.
I'm running Flink (v1.6.2) on AWS EMR:
- master node: (1) m4.large
- core node: (1) c4.2xlarge
- num.task.managers: 1
- slots.per.task.manager: 7
- parallelism: 7
The parallelism is set by streamExecutionEnvironment.setParallelism(). Each Kafka Consumer is instantiated with the same group.id and a unique client.id.
The DD agent is running just fine on the cluster. Many metrics are being emitted such as numberOfCompletedCheckpoints and upTime etc.
Is there any reason Flink would not be forwarding these metrics from Kafka if the parallelism is greater than 1?
Update:
I also tried sending a custom DD metric (counter.inc()
) from the Redis RichSinkFunction. When the parallelism=1, the metric is sent fine. When parallelism=7, the metric is not sent however it is being called (added a debug line). So it seems its not limited to the forwarded metrics from Kafka.
解决方案
问题是 HTTPRequest 的大小越大,并行度越高,这是有意义的。我回来了“请求实体太大”,但是异常没有正确注销,所以我错过了。
Flink DatadogHttpReporter在构建时似乎没有考虑请求的大小。我修改了 Reporter 以将每个请求的指标数量限制为 1000。现在指标显示得很好。
推荐阅读
- proxy - 我无法通过代理访问我的 apollo 服务器
- java - 带有 Redis Sentinel 的 Spring Boot 缓存始终连接到主节点
- javascript - 使上传的图像文件可以通过 URL 访问的正确方法是什么 - google drive api v3
- python - 在 Python 中创建一个 json 对象
- javascript - 绕过浏览器音频处理
- python - 如何使用列表值创建颜色渐变 1D 热图(带状图)?
- c++ - 为多态类实现复制构造函数
- r - 如何使用闪亮的输入来过滤已编辑的数据表?
- objective-c - 从 Objective-c 调用快速关闭
- html - 在 Angular/HTML 中单击时更改元素的文本?