java - java.lang.RuntimeException:scala.collection.immutable.$colon$colon 不是 struct<513:int,549:int> 架构的有效外部类型
问题描述
我想用这个模式创建一个数据框:
|-- Col1 : string (nullable = true)
|-- Col2 : string (nullable = true)
|-- Col3 : struct (nullable = true)
| |-- 513: long (nullable = true)
| |-- 549: long (nullable = true)
代码:
val someData = Seq(
Row("AAAAAAAAA", "BBBBB", Seq(513, 549))
)
val col3Fields = Seq[StructField](StructField.apply("513",IntegerType, true), StructField.apply("549",IntegerType, true))
val someSchema = List(
StructField("Col1", StringType, true),
StructField("Col2", StringType, true),
StructField("Col3", StructType.apply(col3Fields), true)
)
val someDF = spark.createDataFrame(
spark.sparkContext.parallelize(someData),
StructType(someSchema)
)
someDF.show
但是someDF.show
抛出:
错误执行程序:阶段 0.0 (TID 0) 中任务 0.0 中的异常 java.lang.RuntimeException:编码时出错:java.lang.RuntimeException:scala.collection.immutable.$colon$colon 不是结构模式的有效外部类型<513:int,549:int> if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, Col1), StringType), true, false) AS Col1#0 if (assertnotnull(input[ 0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org .apache.spark.sql.Row, true]), 1, Col2),StringType), true, false) AS Col2#1 if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else named_struct(513, if (validateexternaltype(getexternalrowfield(assertnotnull(input [0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)).isNullAt) null else validateexternaltype(getexternalrowfield(validateexternaltype) (getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)), 0, 513 ), IntegerType), 549, if (validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549 ,整数类型,真))。isNullAt) null else validateexternaltype(getexternalrowfield(validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549, IntegerType,true)), 1, 549), IntegerType)) AS Col3#2 at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:291)
编辑:
513 和 549 应该是子列名称而不是值。这是我期望的输出示例:
someDF.select("Col1","Col2","Col3.*").show
+-----------+--------+------+------+
| Col1| Col1| 513| 549|
+-----------+--------+------+------+
| AAAAAAAAA | BBBBB | 39| 38|
+-----------+--------+------+------+
解决方案
你拥有的数据和你拥有的Schema不一样,你想创建的Schema就是你如何创建的
val schema = StructType(
Seq(
StructField("col1", StringType, true),
StructField("col2", StringType, true),
StructField("col3", StructType(
Seq(
StructField("513", LongType, true),
StructField("549", LongType, true)
))
)
)
)
架构:
root
|-- col1: string (nullable = true)
|-- col2: string (nullable = true)
|-- col3: struct (nullable = true)
| |-- 513: long (nullable = true)
| |-- 549: long (nullable = true)
这为您提供了您想要的架构
您可以获取如下数据并应用架构
val someData = Seq(
Row("AAAAAAAAA", " BBBBB", Row(39l, 38l))
)
val someDF = spark.createDataFrame(
spark.sparkContext.parallelize(someData), schema
)
df.select("Col1","Col2","Col3.*").show
输出:
+---------+-------+---+---+
| Col1| Col2|513|549|
+---------+-------+---+---+
|AAAAAAAAA| BBBBB| 39| 38|
+---------+-------+---+---+
推荐阅读
- typescript - 尝试发送响应时出现 Express 错误的 TS
- sql - 用as()是什么意思?
- c++ - Visual Studio 有时会加载符号,有时不会
- swiftui - 这是 iOS14 中预期的 @State var 行为还是一个错误?
- python - 查找 DataFrame 中是否存在特定值。如果不是,请将其存储在新的 DataFrame 中
- javascript - 仅接受来自一个 div 的字段值的值来自多个 div 的字段值
- python - 我应该如何更改随机输入以启用绝对随机化?
- javascript - 仅替换字符串单词中的几个字符
- php - php curl 无法连接到 xxx 端口 5001:连接被拒绝
- ios - 父 View 中的状态更改创建子 ViewModel 的新实例