apache-flink - Flink EMR Deployment can't pick up Yarn context and executes only as a local application
问题描述
I'm deploying my Flink app on AWS EMR 6.2
Flink version: 1.11.2
I configured the step with:
Arguments :"flink-yarn-session -d -n 2"
as described here: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/flink-jobs.html
Both the cluster and the step are in state 'running'.
The controller logs shows:
INFO startExec 'hadoop jar /mnt/var/lib/hadoop/steps/s-309XLKSZGN4V4/trigger-service-***.jar flink-yarn-session -d -n 2'
However, the syslog show that the application didn't pick-up on the Yarn context.
2021-02-24 13:19:57,772 INFO org.apache.flink.runtime.minicluster.MiniCluster (main): Starting Flink Mini Cluster
I pulled out the class name returned by StreamExecutionEnvironment.getExecutionEnvironment
in my app, and indeed it is LocalStreamEnvironment
.
The application itself runs properly on the JobMaster instance as a local app.
解决方案
事实证明 -n 在新版本的 flink 中已被弃用,因此,根据 AWS 支持的建议,我使用了以下命令:
flink-yarn-session -d -s 2
参数含义:
-d,--detached 开始分离
-s,--slots 每个 TaskManager 的槽数
此外,为了使 Flink 需要创建两个步骤:
--steps '[{"Args":["flink-yarn-session","-d","-s","2"],"Type":"CUSTOM_JAR","MainClass":"","ActionOnFailure":"CANCEL_AND_WAIT","Jar":"command-runner.jar","Properties":"","Name":"Yarn Long Running Session"},{"Args":["bash","-c","aws s3 cp s3://<URL of Application Jar> .;/usr/bin/flink run -m yarn-cluster -p 2 -d <Application Jar>"],"Type":"CUSTOM_JAR","MainClass":"","ActionOnFailure":"CANCEL_AND_WAIT","Jar":"command-runner.jar","Properties":"","Name":"Flink Job"}]
'
第一个使用 command-runner.jar 执行
flink 纱线会话 -d -s 2
第二个使用 command-runner.jar 从 S3 下载我的应用程序 Jar 并在 yarn-cluster 上使用 flink 运行作业。
推荐阅读
- java - 有没有办法用 MLKit 检测文本的大小
- c++ - 这两个位运算符在做什么?
- c# - GDI+ DrawImage 在 C++ (Win32) 中比在 C# (WinForms) 中慢得多
- jquery - Jquery 不使用 next() 方法遍历
- rust - Rust 和可变性中的模式匹配
- macos - pip:当前(Python 3.6+)相当于`npm -g install some-global-package`?
- c# - EF Core 3.1 一对一关系,对象引用未设置为对象的实例
- javascript - 使用 Object.keys 从 JavaScript 中 JSON 文件中的值获取的多维数组
- python - 从非常嘈杂的二进制阈值图像中过滤噪声
- python - 如何使用 ebpf python 从打开的系统调用中打印文件路径?