No TypeTag found in Scala Spark: writing a method to get a StructType inside a trait

Problem description

I have a trait that provides functionality for reading CSV files.

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.catalyst.ScalaReflection.universe.TypeTag
import org.apache.spark.sql.types.StructType

trait InputCsvData extends Serializable {
  // Path to the CSV file.
  val csvRootPath: String

  // Schema for the CSV file.
  type DataSchema

  // Get the StructType derived from DataSchema.
  def getStructType(implicit evidence: TypeTag[DataSchema]): StructType =
    ScalaReflection.schemaFor[DataSchema].dataType.asInstanceOf[StructType]

  // Get DataFrame
  def getAsDF(sc: SparkSession,
              rootPath: String = csvRootPath,
              deliminator: String = ",",
              header: Boolean = true): DataFrame = {
      // Return the DataFrame directly; assigning it to a val would make the block return Unit.
      sc.read
        // Here I'm trying to obtain the schema as a StructType via getStructType.
        .schema(getStructType)
        .option("header", header)
        .option("sep", deliminator)
        .option("mode", "FAILFAST")
        .csv(rootPath)
  }
}

I then extend this trait with objects that define a case class DataSchema describing the schema. (I keep it as a case class because I would like to use it to convert DataFrames to Datasets if required.)

object NetworkOutput extends Serializable with InputCsvData {

  override val csvRootPath: String = RunConfigs.getPath("network_output_fact")

  def main(args: Array[String]): Unit = {
    // Trying to get the data here.
    getAsDF(Sparker.getSparkSession).show
  }

  case class DataSchema(category: String,
                        destination_city: String,
                        vertical: String)
}

Compiling this gives the following errors:

Error:(45, 17) No TypeTag available for InputOrcData.this.DataSchema
        .schema(getStructType)
Error:(45, 17) not enough arguments for method getStructType: (implicit evidence: org.apache.spark.sql.catalyst.ScalaReflection.universe.TypeTag[InputOrcData.this.DataSchema])org.apache.spark.sql.types.StructType.
Unspecified value parameter evidence.
        .schema(getStructType)

I have been trying to solve this for some time. Can anyone help me understand what is wrong with this code?

Am I using an anti-pattern with traits?

Tags: scala, apache-spark

Solution
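The errors come from the call to getStructType inside getAsDF. At that point DataSchema is still an abstract type member, so the compiler has no concrete type from which to materialize a TypeTag, and the implicit parameter cannot be filled in. One possible way around this, sketched below, is to let each implementing object supply the tag from a place where DataSchema is a concrete case class. The member name schemaTag is an assumption introduced for illustration and is not part of the original code. It is deliberately not implicit: an implicit member of type TypeTag[DataSchema] would be picked up by its own right-hand side and recurse.

```scala
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.catalyst.ScalaReflection.universe.{TypeTag, typeTag}
import org.apache.spark.sql.types.StructType

trait InputCsvData extends Serializable {
  // Schema for the CSV file; abstract here, concrete in each implementor.
  type DataSchema

  // Hypothetical member (not in the original code): each implementor
  // provides the tag from a scope where DataSchema is a concrete class.
  def schemaTag: TypeTag[DataSchema]

  // The tag is passed explicitly, so no implicit TypeTag has to be
  // resolved inside the trait, where DataSchema is still abstract.
  def getStructType: StructType =
    ScalaReflection.schemaFor[DataSchema](schemaTag).dataType.asInstanceOf[StructType]
}

object NetworkOutput extends InputCsvData {
  case class DataSchema(category: String,
                        destination_city: String,
                        vertical: String)

  // DataSchema is concrete here, so the compiler can materialize the tag.
  override def schemaTag: TypeTag[DataSchema] = typeTag[DataSchema]
}
```

With this shape, getAsDF can keep calling .schema(getStructType) unchanged, because the method no longer needs an implicit argument. The cost is one extra line in every object that extends the trait.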

