SparkThriftServer crashes after 11 days

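Problem description

I am using Spark 2.3.0 (running in standalone mode on EC2) with Hadoop 2.7.6 and AWS SDK 1.7.4, and Spark SQL queries Parquet files on S3. Queries ran fine over the past week, but the Spark Thrift Server (STS) suddenly crashed while running a small query, the same small query on the same data that had run successfully before.

Does anyone know what causes this error?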

18/10/11 04:37:04 ERROR Utils: uncaught error in thread spark-listener-group-eventLog, stopping SparkContext
java.lang.OutOfMemoryError
            at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
            at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
            at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
            at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
            at org.apache.hadoop.fs.s3a.S3AFastOutputStream.write(S3AFastOutputStream.java:194)
            at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
            at java.io.DataOutputStream.write(DataOutputStream.java:107)
            at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
            at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
            at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
            at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
            at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
            at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
            at java.io.BufferedWriter.write(BufferedWriter.java:230)
            at java.io.PrintWriter.write(PrintWriter.java:456)
            at java.io.PrintWriter.write(PrintWriter.java:473)
            at java.io.PrintWriter.print(PrintWriter.java:603)
            at java.io.PrintWriter.println(PrintWriter.java:739)
            at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$1.apply(EventLoggingListener.scala:143)
            at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$1.apply(EventLoggingListener.scala:143)
            at scala.Option.foreach(Option.scala:257)
            at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:143)
            at org.apache.spark.scheduler.EventLoggingListener.onTaskEnd(EventLoggingListener.scala:164)
            at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:45)
            at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
            at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
            at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:82)
            at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:89)
            at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:89)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:83)
            at org.apache.spark.scheduler.AsyncEventQueue$$anon$1$$anonfun$run$1.apply$mcV$sp(AsyncEventQueue.scala:79)
            at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1265)
            at org.apache.spark.scheduler.AsyncEventQueue$$anon$1.run(AsyncEventQueue.scala:78)
18/10/11 04:37:04 INFO TaskSetManager: Finished task 28.0 in stage 13058.0 (TID 553499) in 252 ms on 10.13.5.60 (executor 0) (16/63)
18/10/11 04:37:04 INFO TaskSetManager: Finished task 23.0 in stage 13058.0 (TID 553494) in 253 ms on 10.13.5.60 (executor 0) (17/63)
18/10/11 04:37:04 ERROR Utils: throw uncaught fatal error in thread spark-listener-group-eventLog
java.lang.OutOfMemoryError
18/10/11 04:37:04 INFO HiveServer2: Shutting down HiveServer2
18/10/11 04:37:04 INFO ThriftCLIService: Thrift server has stopped
18/10/11 04:37:04 INFO AbstractService: Service:ThriftBinaryCLIService is stopped.
18/10/11 04:37:04 INFO AbstractService: Service:OperationManager is stopped.
18/10/11 04:37:04 INFO AbstractService: Service:SessionManager is stopped.

Tags: apache-spark, hadoop, amazon-s3, hive, hortonworks-data-platform

Solution
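
Not a confirmed diagnosis, but the trace points at the event log rather than at the query itself. The failing thread is spark-listener-group-eventLog, and the write goes through S3AFastOutputStream into a java.io.ByteArrayOutputStream. The error carries no "Java heap space" message and is thrown from hugeCapacity, which in JDK 8 is where ByteArrayOutputStream gives up once the requested size overflows a signed 32-bit int. That would fit an event log that has been accumulating in a single in-memory buffer for the whole life of a long-running Thrift Server until it approached 2 GiB. The sketch below is a Python model of that JDK 8 growth arithmetic as I understand it; the function names and the sample sizes are my own, chosen only for illustration.

```python
INT_MAX = 2**31 - 1
MAX_ARRAY_SIZE = INT_MAX - 8  # JDK 8 cap on array length used by the stream


def to_int32(n):
    """Wrap a Python int into Java's signed 32-bit range."""
    return (n + 2**31) % 2**32 - 2**31


def grown_capacity(old_capacity, count, length):
    """Model write() -> ensureCapacity() -> grow() -> hugeCapacity().

    old_capacity: current buffer size, count: bytes already stored,
    length: bytes in the incoming write. Returns the capacity the
    buffer would have afterwards, or raises MemoryError where the JDK
    would throw OutOfMemoryError.
    """
    min_capacity = to_int32(count + length)  # may wrap negative
    if to_int32(min_capacity - old_capacity) <= 0:
        return old_capacity  # the write already fits
    new_capacity = to_int32(old_capacity << 1)  # try doubling
    if to_int32(new_capacity - min_capacity) < 0:
        new_capacity = min_capacity
    if to_int32(new_capacity - MAX_ARRAY_SIZE) > 0:
        if min_capacity < 0:  # count + length overflowed int
            raise MemoryError("requested capacity overflowed a 32-bit int")
        new_capacity = INT_MAX if min_capacity > MAX_ARRAY_SIZE else MAX_ARRAY_SIZE
    return new_capacity


if __name__ == "__main__":
    # A small buffer simply doubles.
    print(grown_capacity(32, 30, 10))
    # A buffer already near 2 GiB cannot accept another 8 KiB write.
    try:
        grown_capacity(MAX_ARRAY_SIZE, MAX_ARRAY_SIZE - 100, 8192)
    except MemoryError as exc:
        print("MemoryError:", exc)
```

If that reading is right, the size of the query is irrelevant; only the total volume of events written since startup matters. Options worth testing, none verified against this setup: point spark.eventLog.dir at a non-S3A location such as HDFS or local disk, set spark.eventLog.enabled to false for the Thrift Server, or restart the server periodically.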

