scala - Why does Some(null) throw a NullPointerException in Spark 2.4 (but work in 2.2)?
Problem description
This code used to work in Spark 2.2 with Scala 2.11.x, but it fails in Spark 2.4.
val df = Seq(
(1, Some("a"), Some(1)),
(2, Some(null), Some(2)),
(3, Some("c"), Some(3)),
(4, None, None)
).toDF("c1", "c2", "c3")
I ran it in Spark 2.4, and it now fails with this error:
scala> spark.version
res0: String = 2.4.0
scala> :pa
// Entering paste mode (ctrl-D to finish)
val df = Seq(
(1, Some("a"), Some(1)),
(2, Some(null), Some(2)),
(3, Some("c"), Some(3)),
(4, None, None)
).toDF("c1", "c2", "c3")
// Exiting paste mode, now interpreting.
java.lang.RuntimeException: Error while encoding: java.lang.NullPointerException
assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#6
staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, unwrapoption(ObjectType(class java.lang.String), assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2), true, false) AS _2#7
unwrapoption(IntegerType, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3) AS _3#8
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:293)
at org.apache.spark.sql.SparkSession.$anonfun$createDataset$1(SparkSession.scala:472)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
at scala.collection.immutable.List.foreach(List.scala:388)
at scala.collection.TraversableLike.map(TraversableLike.scala:233)
at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
at scala.collection.immutable.List.map(List.scala:294)
at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:472)
at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:228)
... 57 elided
Caused by: java.lang.NullPointerException
at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:289)
... 66 more
I'm curious what changed, and why replacing this line:
(2, Some(null), Some(2)),
with:
(2, None, Some(2)),
fixes the problem.
What changed, and what does it mean for existing codebases?
Solution
This is considered a bug and has been reported as SPARK-26984.
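Until the bug is fixed, existing code can avoid building Some(null) in the first place. The sketch below is plain Scala with no Spark dependency. It relies on the fact that Option(x) returns None when x is null, whereas Some(x) wraps the null as-is. The object name OptionNullDemo and the helper normalize are hypothetical names introduced for illustration; they are not part of Spark or of the original question.

```scala
object OptionNullDemo {
  // Hypothetical helper: turn an Option that may wrap null into a proper
  // Option, so Some(null) becomes None and everything else is unchanged.
  def normalize[A](o: Option[A]): Option[A] = o.flatMap(Option(_))

  // Same rows as in the question, built with Option(...) instead of Some(...).
  // Option(null: String) evaluates to None, so no Some(null) is ever created.
  val rows: Seq[(Int, Option[String], Option[Int])] = Seq(
    (1, Option("a"), Option(1)),
    (2, Option(null: String), Option(2)),
    (3, Option("c"), Option(3)),
    (4, None, None)
  )

  def main(args: Array[String]): Unit = {
    val raw: String = null

    // Some(...) keeps the null inside a defined Option.
    val wrapped: Option[String] = Some(raw)
    println(wrapped.isDefined)            // true

    // Option(...) collapses null to None.
    println(Option(raw).isDefined)        // false

    // Existing Some(null) values can be cleaned after the fact.
    println(normalize(wrapped).isDefined) // false

    // Row 2 now carries None in its second column.
    println(rows(1)._2.isEmpty)           // true
  }
}
```

In spark-shell, where the implicits are already in scope, rows built this way could presumably be passed to .toDF("c1", "c2", "c3") exactly as before, since row 2 is then equivalent to the (2, None, Some(2)) form that the question reports as working in 2.4.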