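apache-spark - mapGroupsWithState throws java.lang.NoClassDefFoundError: Could not initialize class
Problem description
I am trying to use mapGroupsWithState to read a CSV file, derive the state of the events, and write it to Kafka. The code below works if I comment out the mapGroupsWithState piece. I am using Spark version 2.3.1.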
val event = spark.read.option("header","true").csv(path)
val eventSession = imsi.orderBy("event_timestamp")
  .groupByKey(_.key)
  .mapGroupsWithState(GroupStateTimeout.NoTimeout())(updateAcrossEvents)
eventSession.toJSON.write.format("kafka")
  .option("kafka.bootstrap.servers", brokers)
  .option("topic", outputTopic).save()
Error
User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 85 in stage 11.0 failed 8 times, most recent failure: Lost task 85.7 in stage 11.0 (TID 53, XXX, executor 2): java.lang.NoClassDefFoundError: Could not initialize class xxxx$
at xxx.imsiProcessor$$anonfun$run$1$$anonfun$3.apply(xx.scala:86)
at xxx.imsiProcessor$$anonfun$run$1$$anonfun$3.apply(xx.scala:86)
at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$3.apply(KeyValueGroupedDataset.scala:279)
at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$3.apply(KeyValueGroupedDataset.scala:279)
at org.apache.spark.sql.execution.MapGroupsExec$$anonfun$12.apply(objects.scala:361)
at org.apache.spark.sql.execution.MapGroupsExec$$anonfun$12.apply(objects.scala:360)
at org.apache.spark.sql.execution.MapGroupsExec$$anonfun$10$$anonfun$apply$4.apply(objects.scala:337)
at org.apache.spark.sql.execution.MapGroupsExec$$anonfun$10$$anonfun$apply$4.apply(objects.scala:336)
Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:367)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:933)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:924)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:924)
at com.telstra.elbrus.core.imsiProcessor$.spark$lzycompute(ImsiProcessor.scala:38)
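Solution
I was able to get the code running by removing some of the extends clauses from my object; the stripped-down code ran. Judging by the trace above, the likely reason is that the function passed to mapGroupsWithState runs on the executors and referenced the imsiProcessor object. Initializing that object on an executor tried to build a SparkSession (spark$lzycompute), which failed with "A master URL must be set in your configuration" and surfaced as the NoClassDefFoundError.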
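One way to avoid this kind of failure, sketched here under assumptions, is to keep the state-update function in a small object that holds no SparkSession and extends nothing, and to create the session as a local value on the driver. The names Event, SessionState, EventState, and ImsiJob, the event-count logic, and the command-line arguments are all hypothetical; the real schema and update logic are not shown in the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

// Hypothetical record types; the CSV is assumed to have `key` and
// `event_timestamp` columns (read as strings, since no schema is inferred).
case class Event(key: String, event_timestamp: String)
case class SessionState(key: String, eventCount: Long)

// Executor-safe logic only: no SparkSession field and no traits that create one,
// so initializing this object on an executor cannot fail the way the trace shows.
object EventState {
  def updateAcrossEvents(
      key: String,
      events: Iterator[Event],
      state: GroupState[SessionState]): SessionState = {
    // Illustrative update: count the events seen for this key.
    val previous = state.getOption.getOrElse(SessionState(key, 0L))
    val updated = previous.copy(eventCount = previous.eventCount + events.size)
    state.update(updated)
    updated
  }
}

object ImsiJob {
  def main(args: Array[String]): Unit = {
    // Hypothetical arguments: input path, Kafka brokers, output topic.
    val Array(path, brokers, outputTopic) = args.take(3)

    // The session is a local value on the driver, not a member of an object
    // that the executor-side function refers to.
    val spark = SparkSession.builder.appName("imsi-sessions").getOrCreate()
    import spark.implicits._

    val events = spark.read.option("header", "true").csv(path).as[Event]

    val sessions = events
      .groupByKey(_.key)
      .mapGroupsWithState[SessionState, SessionState](
        GroupStateTimeout.NoTimeout())(EventState.updateAcrossEvents)

    // toJSON yields a single string column named `value`, which the Kafka sink expects.
    sessions.toJSON.write.format("kafka")
      .option("kafka.bootstrap.servers", brokers)
      .option("topic", outputTopic)
      .save()

    spark.stop()
  }
}
```

With this layout the master URL still comes from spark-submit on the driver, and the executors only ever load EventState, which has nothing to initialize.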