apache-spark - YARN doesn't recognize increased 'yarn.scheduler.maximum-allocation-mb' and 'yarn.nodemanager.resource.memory-mb' values
Problem description
I'm working with a dockerized PySpark cluster that uses YARN. To improve the efficiency of the data processing pipelines, I want to increase the amount of memory allocated to the PySpark executors and the driver.
This is done by adding the following two key-value pairs to the REST POST request that is sent to Livy:
"driverMemory": "20g", "executorMemory": "56g"
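For readers who want to reproduce the request, here is a minimal sketch of how such a body could be assembled in Python. Only the `driverMemory` and `executorMemory` keys come from the question; the helper name `build_session_payload` and the `"kind": "pyspark"` field are assumptions (the latter presumes an interactive Livy session rather than a batch job). The sketch only builds the JSON and does not send anything.

```python
import json


def build_session_payload(driver_memory, executor_memory, kind="pyspark"):
    """Return a dict for the Livy request body.

    `kind` is an assumption for illustration; the question only
    mentions the two memory keys.
    """
    return {
        "kind": kind,
        "driverMemory": driver_memory,
        "executorMemory": executor_memory,
    }


# The values from the question.
payload = build_session_payload("20g", "56g")
body = json.dumps(payload)
print(body)
```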
Doing this results in the following error, which I've found in Livy's logs: java.lang.IllegalArgumentException: Required executor memory (57344), overhead (5734 MB), and PySpark memory (0 MB) is above the max threshold (8192 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
Of course, I've edited yarn-site.xml accordingly and set both of the mentioned values to 64 GB, but it doesn't seem to make a difference.
A similar problem occurs with other driverMemory and executorMemory values whenever executorMemory plus the 10% overhead exceeds 8192 MB.
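The arithmetic behind the error message can be checked with a short sketch. It assumes the overhead is 10% of the executor memory, rounded down, with a 384 MB floor. The 10% matches the numbers in the error above, while the 384 MB floor is an assumption about Spark's defaults. The function names are made up for illustration.

```python
OVERHEAD_PERCENT = 10      # matches the 10% seen in the error message
MIN_OVERHEAD_MB = 384      # assumed minimum overhead (Spark default, not from the question)


def required_container_mb(executor_mb, pyspark_mb=0):
    """Executor memory + overhead + PySpark memory, all in MB."""
    overhead_mb = max(MIN_OVERHEAD_MB, executor_mb * OVERHEAD_PERCENT // 100)
    return executor_mb + overhead_mb + pyspark_mb


def fits(executor_mb, max_allocation_mb):
    """True if the request stays within the cluster's max threshold."""
    return required_container_mb(executor_mb) <= max_allocation_mb


executor_mb = 56 * 1024  # "56g" from the request
print(required_container_mb(executor_mb))  # 57344 + 5734 = 63078
print(fits(executor_mb, 8192))             # False: the threshold YARN reports
print(fits(executor_mb, 65536))            # True: the intended 64 GB limit
```

So the 56 GB request would fit under a 64 GB limit, which suggests the new limit is simply not the one YARN is reading.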
How can I fix this and allocate more executor memory?
Solution
Make sure your yarn-site.xml looks exactly the same on your master and worker containers at the moment the service starts.
It seems you might have edited it only on the master, which is a possible source of this confusion. As a general rule of thumb, all the config files (and many other things) must be identical on every machine in the cluster.
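One way to check this is to compare the two memory settings across copies of the file. The sketch below is only an illustration: the XML strings stand in for the contents of yarn-site.xml on a master and a worker container, and the function names are hypothetical. In practice you would read the real files from each container.

```python
import xml.etree.ElementTree as ET

MEMORY_KEYS = (
    "yarn.scheduler.maximum-allocation-mb",
    "yarn.nodemanager.resource.memory-mb",
)


def parse_yarn_site(xml_text):
    """Return {property name: value} from a Hadoop-style configuration file."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_memory_settings(master_xml, worker_xml, keys=MEMORY_KEYS):
    """Return {key: (master value, worker value)} for keys that differ."""
    master = parse_yarn_site(master_xml)
    worker = parse_yarn_site(worker_xml)
    return {
        key: (master.get(key), worker.get(key))
        for key in keys
        if master.get(key) != worker.get(key)
    }


# Hypothetical file contents: the master was edited, the worker was not.
MASTER_XML = """
<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>65536</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>65536</value></property>
</configuration>
"""
WORKER_XML = """
<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>65536</value></property>
</configuration>
"""

mismatches = diff_memory_settings(MASTER_XML, WORKER_XML)
for key, (on_master, on_worker) in mismatches.items():
    print(f"{key}: master={on_master} worker={on_worker}")
```

A value of `None` means the property is missing from that copy, in which case YARN falls back to its built-in default. Once the files match, the YARN services still need to be restarted to pick up the new values.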