Spark Scala: "cannot resolve symbol saveAsTextFile (reduceByKey)" - IntelliJ IDEA

Problem description

I suppose some dependencies are not defined in the build.sbt file.

I've added the library dependencies to the build.sbt file, but I'm still getting the error mentioned in the title of this question. I tried searching Google for a solution but couldn't find one.

My Spark Scala source code (filterEventId100.scala):

package com.projects.setTopBoxDataAnalysis

import java.lang.System._
import java.text.SimpleDateFormat
import java.util.Date
import org.apache.spark.sql.SparkSession

object filterEventId100 extends App {


  if (args.length < 2) {
    println("Usage: JavaWordCount <Input-File> <Output-file>")
    exit(1)
  }

  val spark = SparkSession
    .builder
    .appName("FilterEvent100")
    .getOrCreate()

  val data = spark.read.textFile(args(0)).rdd


  val result = data.flatMap{line: String => line.split("\n")}
      .map{serverData =>
        val serverDataArray = serverData.replace("^", "::")split("::")
        val evenId = serverDataArray(2)
        if (evenId.equals("100")) {
          val serverId = serverDataArray(0)
          val timestempTo = serverDataArray(3)
          val timestempFrom = serverDataArray(6)
          val server = new Servers(serverId, timestempFrom, timestempTo)
          val res = (serverId, server.dateDiff(server.timestampFrom, server.timestampTo))
          res
        }


  }.reduceByKey{
    case(x: Long, y: Long) => if ((x, y) != null) {
         if (x > y) x else y
    }
  }

  result.saveAsTextFile(args(1))

  spark.stop


}

class Servers(val serverId: String, val timestampFrom: String, val timestampTo: String) {

  val DATE_FORMAT = "yyyy-MM-dd hh:mm:ss.SSS"

  private def convertStringToDate(s: String): Date = {
    val dateFormat = new SimpleDateFormat(DATE_FORMAT)
    dateFormat.parse(s)
  }

  private def convertDateStringToLong(dateAsString: String): Long = {
    convertStringToDate(dateAsString).getTime
  }

  def dateDiff(tFrom: String, tTo: String): Long = {
    val dDiff = convertDateStringToLong(tTo) - tFrom.toLong
    dDiff
  }

}
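(As an aside, `dateDiff` as written parses `tTo` with the formatter but interprets `tFrom` as raw epoch millis via `toLong`. Assuming both fields actually carry the same timestamp format, a consistent sketch might look like the following; the sample values are hypothetical:)

```scala
import java.text.SimpleDateFormat

object DateDiffDemo extends App {
  val DATE_FORMAT = "yyyy-MM-dd hh:mm:ss.SSS"

  // Parse BOTH timestamps with the same formatter, then diff the epoch millis
  def dateDiff(tFrom: String, tTo: String): Long = {
    val fmt = new SimpleDateFormat(DATE_FORMAT) // not thread-safe, so create per call
    fmt.parse(tTo).getTime - fmt.parse(tFrom).getTime
  }

  // 1.5 seconds apart; the shared timezone offset cancels out in the subtraction
  assert(dateDiff("2019-05-01 10:00:00.000", "2019-05-01 10:00:01.500") == 1500L)
  println(dateDiff("2019-05-01 10:00:00.000", "2019-05-01 10:00:01.500"))
}
```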

My build.sbt file:

name := "SetTopProject"
version := "0.1"
scalaVersion := "2.12.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.3" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy"),
  "org.apache.spark" %% "spark-sql_2.12" % "2.4.3" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy"),
  "org.apache.hadoop" %% "hadoop-common" % "3.2.0" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy"),
  "org.apache.spark" %% "spark-sql_2.12" % "2.4.3" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy"),
  "org.apache.spark" %% "spark-hive_2.12" % "2.4.3" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy"),
  "org.apache.spark" %% "spark-yarn_2.12" % "2.4.3" exclude ("org.apache.hadoop","hadoop-yarn-server-web-proxy")
)
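(Worth noting: `%%` automatically appends the Scala binary-version suffix, so an entry like `"spark-sql_2.12" %% ...` resolves to the non-existent artifact `spark-sql_2.12_2.12`, and `hadoop-common` is a plain Java artifact that takes `%` rather than `%%`. A sketch of a consistent build.sbt, under those assumptions:)

```scala
name := "SetTopProject"
version := "0.1"
scalaVersion := "2.12.8"

// %% appends "_2.12" automatically, so artifact names must not repeat the suffix
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.3",
  "org.apache.spark" %% "spark-sql"  % "2.4.3",
  "org.apache.spark" %% "spark-hive" % "2.4.3",
  "org.apache.spark" %% "spark-yarn" % "2.4.3",
  "org.apache.hadoop" %  "hadoop-common" % "3.2.0" // Java artifact: single %
)
```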

I was expecting everything to be fine because

val spark = SparkSession
.builder
.appName("FilterEvent100")
.getOrCreate()

is defined well (without any compiler errors), and I use the spark value to define the data value:

val data = spark.read.textFile(args(0)).rdd

on which I call the saveAsTextFile and reduceByKey functions:

val result = data.flatMap{line: String => line.split("\n")}...
}.reduceByKey {case(x: Long, y: Long) => if ((x, y) != null) {
     if (x > y) x else y
}
result.saveAsTextFile(args(1))

What should I do to remove the compiler errors for the saveAsTextFile and reduceByKey function calls?

Tags: scala, apache-spark, intellij-idea

Solution


Instead of:

 val spark = SparkSession
    .builder
    .appName("FilterEvent100")
    .getOrCreate()

  val data = spark.read.textFile(args(0)).rdd

use:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

val conf = new SparkConf().setAppName("FilterEvent100")
val sc = new SparkContext(conf)
val spark = SparkSession.builder.config(sc.getConf).getOrCreate()

// sc.textFile returns an RDD[String]; saveAsTextFile is defined directly on RDD,
// and reduceByKey becomes available on RDDs of pairs via the implicit
// conversion to PairRDDFunctions
val data = sc.textFile(args(0))
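Note that for `reduceByKey` to typecheck, the preceding `map` must always return a `(key, value)` pair: the `if` without an `else` in the question's code yields `Unit` on the non-matching branch. The reduction itself keeps the maximum value per key; a plain-Scala sketch of the same semantics (no Spark needed, sample pairs are hypothetical):

```scala
object ReduceByKeyDemo extends App {
  // Hypothetical sample: (serverId, dateDiff) pairs, as the map step would produce
  val pairs = Seq(("s1", 10L), ("s2", 5L), ("s1", 30L), ("s2", 7L))

  // Plain-Scala equivalent of pairs.reduceByKey((x, y) => if (x > y) x else y)
  val maxPerKey: Map[String, Long] =
    pairs.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).max }

  assert(maxPerKey == Map("s1" -> 30L, "s2" -> 7L))
  println(maxPerKey)
}
```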
