scala - Convert a list of lists to a DataFrame in Spark Scala
Problem description
I currently have a list of lists like this:
List(
List(2,(String,String,String......),1,(String,String,String......),1,(String,String,String......)),
List(3,(String,String,String......),1,(String,String,String......),1,(String,String,String......)),
List(3,(String,String,String......),2,(String,String,String......),1,(String,String,String......)),
List(3,(String,String,String......),2,(String,String,String......),2,(String,String,String......)),
List(3,(String,String,String......),1,(String,String,String......),2,(String,String,String......))
)
The output format I expect is:
+-----+------------------+-----+------------------+-----+------------------+
|_1   |_2                |_3   |_4                |_5   |_6                |
+-----+------------------+-----+------------------+-----+------------------+
|2    |(String,String...)|1    |(String,String...)|1    |(String,String...)|
|3    |(String,String...)|1    |(String,String...)|1    |(String,String...)|
|3    |(String,String...)|2    |(String,String...)|1    |(String,String...)|
|3    |(String,String...)|2    |(String,String...)|2    |(String,String...)|
|3    |(String,String...)|1    |(String,String...)|2    |(String,String...)|
+-----+------------------+-----+------------------+-----+------------------+
How can I do this conversion in Spark Scala? Any help would be sincerely appreciated.
Solution
For testing, I created the same data as described in the question:
val nestedList = List(
List(2,("String","String","String","String","String","String"),1,("String","String","String","String","String","String"),1,("String","String","String","String","String","String")),
List(3,("String","String","String","String","String","String"),1,("String","String","String","String","String","String"),1,("String","String","String","String","String","String")),
List(3,("String","String","String","String","String","String"),2,("String","String","String","String","String","String"),1,("String","String","String","String","String","String")),
List(3,("String","String","String","String","String","String"),2,("String","String","String","String","String","String"),2,("String","String","String","String","String","String")),
List(3,("String","String","String","String","String","String"),1,("String","String","String","String","String","String"),2,("String","String","String","String","String","String"))
)
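One thing worth noting about this data: each inner list mixes Int values with tuples, so Scala infers the type as List[List[Any]]. Spark has no encoder for Any, so the list cannot be turned into a DataFrame as-is, and every element has to be cast or converted first. A minimal sketch of that, using a shortened hypothetical sample (3-element tuples with made-up letters, not the full test data):

```scala
// Shortened sample with the same shape as nestedList.
// The 3-element tuples and their contents are assumptions for brevity.
val sample: List[List[Any]] = List(
  List(2, ("a", "b", "c"), 1, ("d", "e", "f"), 1, ("g", "h", "i"))
)

// Every element is statically typed as Any, so it must be cast before use
val first: Any = sample.head(0)
val asInt: Int = first.asInstanceOf[Int]

// A tuple's toString yields the parenthesised form seen in the expected output
val asText: String = sample.head(1).toString

println(asInt)  // prints 2
println(asText) // prints (a,b,c)
```

This is why the conversion casts the counts with asInstanceOf[Int] and turns the tuples into strings with toString.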
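Now you can convert each inner list to a tuple (adjust the number of tuple elements and the type casts to match your data) and call toDF, which should give you the desired output: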
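import spark.implicits._ // required for toDF; `spark` is the active SparkSession (predefined in spark-shell)
nestedList.map(x => (x(0).asInstanceOf[Int], x(1).toString, x(2).asInstanceOf[Int], x(3).toString, x(4).asInstanceOf[Int], x(5).toString)).toDF().show()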
This should give you:
+---+--------------------+---+--------------------+---+--------------------+
| _1|                  _2| _3|                  _4| _5|                  _6|
+---+--------------------+---+--------------------+---+--------------------+
|  2|(String,String,St...|  1|(String,String,St...|  1|(String,String,St...|
|  3|(String,String,St...|  1|(String,String,St...|  1|(String,String,St...|
|  3|(String,String,St...|  2|(String,String,St...|  1|(String,String,St...|
|  3|(String,String,St...|  2|(String,String,St...|  2|(String,String,St...|
|  3|(String,String,St...|  1|(String,String,St...|  2|(String,String,St...|
+---+--------------------+---+--------------------+---+--------------------+
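If you are running this outside spark-shell, or would like named columns instead of _1 to _6, the same idea can be written as a standalone program. The following is only a sketch under some assumptions: the object name, the application name, the column names, and the shortened 2-element tuples are all made up for illustration. It uses pattern matching in place of asInstanceOf, so rows that do not have the expected shape are skipped rather than failing with a ClassCastException.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, chosen for this example only
object NestedListToDF {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation; in spark-shell `spark` already exists
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nested-list-to-df") // assumed name
      .getOrCreate()
    import spark.implicits._ // brings toDF into scope

    // Shortened sample data with the same shape as in the question
    val nestedList: List[List[Any]] = List(
      List(2, ("a", "b"), 1, ("c", "d"), 1, ("e", "f")),
      List(3, ("a", "b"), 2, ("c", "d"), 1, ("e", "f"))
    )

    // collect keeps only the inner lists with six elements where
    // positions 0, 2 and 4 are Ints; the tuples become strings
    val rows = nestedList.collect {
      case List(a: Int, b, c: Int, d, e: Int, f) =>
        (a, b.toString, c, d.toString, e, f.toString)
    }

    // Column names are assumptions; pick whatever suits your data
    val df = rows.toDF("count1", "values1", "count2", "values2", "count3", "values3")
    df.show(truncate = false) // print full cell contents instead of cutting them off

    spark.stop()
  }
}
```

Passing truncate = false to show avoids the "St..." cut-off seen in the output above.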
I hope this answer helps.