Unable to convert a Scala object to a Spark DataFrame

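Problem description

I have a Scala object that is passed to the dashBoardInsert method, and I have verified that the data arrives through the parameter.

Now I want to convert it to a DataFrame, but I get the following error: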

def dashBoardInsert(data: TripHistoryData) {
  println("seven..")
  println("data= " + data.asset_id)

  var Seq2 = sc.parallelize(Seq(data.service_id, data.asset_id, "odometer", "calculated", data.odometer, new Date(System.currentTimeMillis()), new Date(System.currentTimeMillis()), data.asset_serial_no))

  import sparkSession.implicits._
  val df1 = Seq2.toDF("data.service_id", "data.asset_id", "odometer", "calculated", "data.odometer", "new Date(System.currentTimeMillis())", "new Date(System.currentTimeMillis())", "data.asset_serial_no")
}
 -----------------------------------------------------------------------------
 Error:

  value toDF is not a member of org.apache.spark.rdd.RDD[Comparable[_ >: java.util.Date with String with Long <: Comparable[_ >: java.util.Date with String with Long <: java.io.Serializable] with java.io.Serializable] with java.io.Serializable]

Please help me solve this problem.

Tags: scala, apache-spark, apache-spark-sql

Solution


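You are building the DataFrame from a sequence whose elements have different types.

Seq(data.service_id,data.asset_id,"odometer", "calculated",data.odometer,new Date(System.currentTimeMillis()), new Date(System.currentTimeMillis()), data.asset_serial_no) is inferred as a sequence of one common supertype (effectively Seq[Any]). Parallelizing it gives an RDD of eight unrelated values rather than one row, and Spark has no encoder for that element type, which is why toDF is not available. What you need is a Seq that contains a tuple: the tuple is the row, and each tuple field becomes a column.

You should write: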
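The difference between a flat sequence of mixed values and a sequence holding one tuple can be seen without Spark. The following is a minimal sketch in plain Scala; the object name and the sample values are hypothetical stand-ins for the fields of TripHistoryData, which the question does not show.

```scala
object MixedSeqVsTuple {
  // Hypothetical stand-ins for data.service_id and data.odometer.
  val serviceId: String = "svc-1"
  val odometer: Long = 12345L

  // Flat sequence: every value becomes its own element and the element
  // type is widened to a common supertype, so String/Long are forgotten.
  val mixed: Seq[Any] = Seq(serviceId, "odometer", odometer)

  // Sequence of one tuple: a single element (one row) whose fields
  // keep their individual static types.
  val rows: Seq[(String, String, Long)] = Seq((serviceId, "odometer", odometer))

  def main(args: Array[String]): Unit = {
    println(mixed.length)      // 3: three separate records
    println(rows.length)       // 1: one record with three fields
    println(rows.head._3 + 1)  // 12346: the third field is still a Long
  }
}
```

Because the tuple keeps one static type per field, Spark can derive a column per field, which it cannot do for a sequence of Any.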

// Put all values into one tuple so each keeps its own type; the tuple is one row.
// Note: Date should be java.sql.Date (or java.sql.Timestamp); Spark has no encoder for java.util.Date.
val tuple = (data.service_id, data.asset_id, "odometer", "calculated", data.odometer, new Date(System.currentTimeMillis()), new Date(System.currentTimeMillis()), data.asset_serial_no)
val local = Seq(tuple)
val distributed = sc.parallelize(local)
val df = distributed.toDF("data.service_id", "data.asset_id", "odometer", "calculated", "data.odometer", "new Date(System.currentTimeMillis())", "new Date(System.currentTimeMillis())", "data.asset_serial_no")
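
As an alternative to a tuple, a case class gives the columns readable names without passing them to toDF. The sketch below rests on several assumptions: the fields of TripHistoryData are guessed from the question, DashboardRow and its column names are hypothetical, and java.sql.Timestamp replaces Date because, as far as I know, Spark's built-in encoders cover java.sql.Date and java.sql.Timestamp but not java.util.Date. It also assumes Spark SQL is on the classpath.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical input shape; the original post does not show this class.
final case class TripHistoryData(
  service_id: String, asset_id: String, odometer: Long, asset_serial_no: String)

// Hypothetical row type: one field per DataFrame column.
final case class DashboardRow(
  service_id: String, asset_id: String, metric: String, source: String,
  odometer: Long, created_at: Timestamp, updated_at: Timestamp,
  asset_serial_no: String)

object DashboardInsertSketch {
  def toDashboardDf(spark: SparkSession, data: TripHistoryData): DataFrame = {
    import spark.implicits._ // brings toDF for local Seqs into scope
    val now = new Timestamp(System.currentTimeMillis())
    // One case-class instance is one row; field names become column names.
    Seq(DashboardRow(data.service_id, data.asset_id, "odometer", "calculated",
      data.odometer, now, now, data.asset_serial_no)).toDF()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dashboard-insert-sketch")
      .getOrCreate()
    val df = toDashboardDf(spark, TripHistoryData("svc-1", "asset-1", 12345L, "SN-001"))
    df.printSchema()
    df.show(truncate = false)
    spark.stop()
  }
}
```

For a single row, calling toDF on the local Seq avoids the extra sc.parallelize step; the tuple approach above works the same way once the date type is one Spark can encode.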
