python - python pandas合并内存总线错误破坏python环境
问题描述
编辑/更新:在没有火花的情况下运行(仅使用 pandas 合并)导致总线错误并重现问题。
我确实使用本地上下文和熊猫运行 pyspark。从镶木地板文件中提取一些列后,我将它们一一合并到数据框中。由于日期问题,由于多对多合并,我的程序内存不足(预期的)。之后,我看到来自 java 的长“调用堆栈”。令人惊讶的是,我的 anaconda 安装被破坏了,每次都以不同的方式,但看起来“某人”正在随机写入一些 anaconda python 文件。每次失败时,我都尝试了 10 次重新安装和重新运行。如果我在内存不足之前停止程序,python 环境将保持完整功能。
python错误之一:
Fatal Python error: initsite: Failed to import the site module
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site.py", line 579, in <module>
main()
File "/opt/anaconda3/lib/python3.7/site.py", line 566, in main
known_paths = addsitepackages(known_paths)
File "/opt/anaconda3/lib/python3.7/site.py", line 349, in addsitepackages
addsitedir(sitedir, known_paths)
File "/opt/anaconda3/lib/python3.7/site.py", line 207, in addsitedir
addpackage(sitedir, name, known_paths)
File "/opt/anaconda3/lib/python3.7/site.py", line 163, in addpackage
for n, line in enumerate(f):
File "/opt/anaconda3/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 1: invalid continuation byte
pyspark:
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10000 milliseconds]. This timeout is controlled by spark.executor.heartbeatInterval
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:841)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:870)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:870)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:870)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:870)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 14 more
19/12/02 21:09:18 WARN SparkContext: Ignoring Exception while stopping SparkContext from shutdown hook
java.lang.NoClassDefFoundError: io/netty/channel/AbstractChannelHandlerContext$13
at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:610)
at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
at io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:973)
at io.netty.channel.AbstractChannel.close(AbstractChannel.java:238)
at org.apache.spark.network.server.TransportServer.close(TransportServer.java:153)
at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:180)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1615)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:90)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1974)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1340)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1973)
at org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:575)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.lang.ClassNotFoundException: io.netty.channel.AbstractChannelHandlerContext$13
at java.net.URLClassLoader$1.run(URLClassLoader.java:370)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 24 more
Caused by: java.util.zip.ZipException: invalid LOC header (bad signature)
at java.util.zip.ZipFile.read(Native Method)
at java.util.zip.ZipFile.access$1400(ZipFile.java:60)
at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:734)
at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:434)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at sun.misc.Resource.getBytes(Resource.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:462)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
... 30 more
解决方案
推荐阅读
- c++ - 构造函数的类模板参数包扩展
- python - Python:如何在熊猫中将列相乘?
- sas - 使用带有空 BASE 的 PROC APPEND 时定义 BASE 结构
- php - wordpress登录重定向后如何传递GET变量
- python - try..except 在交互式 Windbg Python 会话中让我退出 Python 会话
- django - Django:注释缺失的日期
- javascript - 如何在没有 Nodemon 的情况下构建 Nodejs 应用程序并部署在本地 LAN 网络上
- mysql - MYSQL:获取上一行但基于非主列
- reactjs - 如何在反应js中将默认索引页面更改为登录页面
- maven - 带有一个参数库的 AssertThat