python - Spark 2.3 Executor memory leak
Problem description
I am getting a managed-memory-leak warning. As far as I can tell, this was a Spark bug that existed up to version 1.6 and has since been resolved.
Mode: Standalone; IDE: PyCharm; Spark version: 2.3; Python version: 3.6
Below is the warning output -
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3148
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3152
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3151
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3150
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3149
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3153
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3154
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3158
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3155
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3157
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3160
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3161
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3156
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3159
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3165
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3163
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3162
2018-05-25 15:00:05 WARN Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3166
Any insight into why this happens? My job does complete successfully, though.
Edit: Many have said this is a duplicate of a question from 2 years ago, but the answer there says it was a Spark bug, and when I checked Spark's Jira, it says the issue has been resolved.
My question is: so many releases later, why am I still seeing the same thing in Spark 2.3? If there is a valid or logical answer to my query, I will gladly delete this question.
Solution
According to SPARK-14168, the warning stems from not consuming the entire iterator. I ran into the same warning when using take(n) to fetch n elements from an RDD in the Spark shell.
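To make the mechanism concrete, here is a minimal plain-Python sketch (not Spark's actual code; the `TaskMemoryManager` class and byte counts are illustrative assumptions) of why stopping an iterator early, as `take(n)` does, can leave per-task managed memory unreleased until the task finishes:

```python
class TaskMemoryManager:
    """Toy stand-in for Spark's per-task managed-memory accounting."""
    def __init__(self):
        self.acquired = 0

    def acquire(self, nbytes):
        self.acquired += nbytes

    def release(self, nbytes):
        self.acquired -= nbytes


def partition_iterator(records, manager, nbytes=262144):
    # Acquire a block of managed memory for this partition, yield records,
    # and release the block only once the iterator is fully drained/closed.
    manager.acquire(nbytes)
    try:
        yield from records
    finally:
        # Spark releases such leftovers itself at task completion and logs
        # the "Managed memory leak detected" warning when it has to.
        manager.release(nbytes)


manager = TaskMemoryManager()
it = partition_iterator(range(100), manager)

# Like rdd.take(5): consume only the first few elements, then stop.
first_five = [next(it) for _ in range(5)]

# The generator is suspended mid-iteration, so the memory is still held.
leaked = manager.acquired  # 262144 bytes, matching the warning's "size ="

it.close()  # closing the generator runs the finally block and releases it
```

In real Spark the cleanup always happens at task end, so the warning is cosmetic when the job succeeds; it simply records that the task, not the consumer, had to free the memory.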