apache-beam - Dataflow job with ProfilingOptions profile_cpu=True does not write profiles


Problem description

I am trying to profile the CPU usage of a Dataflow pipeline job running on the Apache Beam Python 3.7 SDK 2.27.0. I triggered the job with the --profile_cpu and --profile_location args, and I can see in the Dataflow console that they are set:

Screenshot showing the Dataflow pipeline options with profile_cpu and profile_location set.

However, after the job completes, no files have been written to the profile_location GCS bucket.

Looking at the Dataflow logs filtered by jsonPayload.logger:"apache_beam.utils.profiler:profiler.py", I can see the "Start profiling" and "Stop profiling" logs:

Screenshot showing the logs from Profiler.

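However, there is no log corresponding to the "Copying profiler data to:" step, even though profile_location is set in ProfilingOptions and should therefore also be set in the Profiler. Any suggestions about what might be going wrong, or any knowledge of whether this feature is currently supported, would be very helpful.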

Tags: apache-beam dataflow

Solution

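This was resolved by using the --experiments=use_runner_v2 flag. It looks like profiling is only supported on Dataflow Runner v2, which has not yet been rolled out as the default runner.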
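
As a minimal sketch of the flag combination discussed in this thread, the helper below assembles the two profiling flags from the question plus the experiment flag from the answer. The function name build_profiling_args and the bucket path are hypothetical, chosen only for illustration; the three flag strings are the ones quoted above. It uses only the standard library and does not launch a job.

```python
def build_profiling_args(profile_location, use_runner_v2=True):
    """Return pipeline flags that request CPU profiling.

    profile_location is a placeholder, e.g. a gs:// path you control.
    """
    args = [
        "--profile_cpu",
        f"--profile_location={profile_location}",
    ]
    if use_runner_v2:
        # The flag the answer reports as the fix.
        args.append("--experiments=use_runner_v2")
    return args


if __name__ == "__main__":
    # Hypothetical bucket path, for illustration only.
    print(build_profiling_args("gs://my-bucket/profiles"))
```

The resulting list could be appended to the job's other arguments (runner, project, region, and so on) and passed to PipelineOptions from apache_beam.options.pipeline_options. Whether the experiment flag is still needed likely depends on the SDK version and on whether Runner v2 is already the default for the job.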
