Packaging a JAR File for Spark with sbt

Problem Description

I am following the official getting-started page: https://spark.apache.org/docs/latest/quick-start.html. In the second-to-last code snippet, I have to package my files into a .jar using sbt.

My build.sbt:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" % "spark-sql" % "2.3.2"

My SimpleApp.scala:

/* SimpleApp.scala */
import org.apache.spark.sql.SparkSession

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/usr/local/spark/README.md" // Should be some file on your system
    val spark = SparkSession.builder.appName("Simple Application").getOrCreate()
    val logData = spark.read.textFile(logFile).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println(s"Lines with a: $numAs, Lines with b: $numBs")
    spark.stop()
  }
}

I put the files in the correct directories as instructed, and ran sbt package from /usr/local/spark/examples/, where my build.sbt is located.

Then I keep getting this very long error:

test@test-ThinkPad-X230:/usr/local/spark/examples$ sbt package
[info] Loading project definition from /usr/local/spark/examples/project
[info] Loading settings for project examples from build.sbt ...
[info] Set current project to Simple Project (in build file:/usr/local/spark/examples/)
[info] Updating ...
[info] downloading https://repo1.maven.org/maven2/org/apache/avro/avro/1.7.7/avro-1.7.7.jar ...
[info]  [SUCCESSFUL ] org.apache.avro#avro;1.7.7!avro.jar (322ms)
[info] Done updating.
[info] Compiling 188 Scala sources and 125 Java sources to /usr/local/spark/examples/target/scala-2.11/classes ...
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalFileLR.scala:23:8: not found: object breeze
[error] import breeze.linalg.{DenseVector, Vector}
[error]        ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalFileLR.scala:39:19: not found: type DenseVector
[error]     DataPoint(new DenseVector(nums.slice(1, D + 1)), nums(0))
[error]                   ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalFileLR.scala:60:13: not found: value DenseVector
[error]     val w = DenseVector.fill(D) {2 * rand.nextDouble - 1}
[error]             ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalFileLR.scala:65:22: not found: value DenseVector
[error]       val gradient = DenseVector.zeros[Double](D)
[error]                      ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalKMeans.scala:26:8: not found: object breeze
[error] import breeze.linalg.{squaredDistance, DenseVector, Vector}
[error]        ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalKMeans.scala:42:27: not found: type DenseVector
[error]   def generateData: Array[DenseVector[Double]] = {
[error]                           ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalKMeans.scala:43:32: not found: type DenseVector
[error]     def generatePoint(i: Int): DenseVector[Double] = {
[error]                                ^
[error] /usr/local/spark/examples/src/main/scala/org/apache/spark/examples/LocalKMeans.scala:44:7: not found: value DenseVector
[error]       DenseVector.fill(D) {rand.nextDouble * R}
[error]       ^

It goes on like that for a long time. I cannot tell what I am doing wrong.

Tags: scala, apache-spark, jar, sbt, packaging

Solution


Found the problem: build.sbt should be in /spark/, not in /spark/examples/. With the build file inside /spark/examples/, sbt treats Spark's entire bundled examples source tree as the project's sources and tries to compile all of them (the "Compiling 188 Scala sources and 125 Java sources" line above), and those example programs depend on libraries such as breeze that this simple build.sbt never declares; hence the endless wall of "not found" errors.
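More generally, the project should live in its own self-contained directory rather than inside an existing source tree. A minimal sketch of the layout the quick-start guide expects (the directory name simple-project here is just a placeholder):

simple-project/
├── build.sbt
└── src/
    └── main/
        └── scala/
            └── SimpleApp.scala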

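As a side note, the quick-start guide declares the Spark dependency with %% rather than %, so that sbt appends the Scala binary version to the artifact name (resolving spark-sql_2.11 rather than a plain spark-sql, which does not exist on Maven Central). A corrected build.sbt, keeping the versions from the question, would look like this:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.8"

// %% makes sbt append the Scala binary version (_2.11) to the artifact name
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.2"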

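With the project in its own directory, packaging and submitting follow the quick-start guide. A sketch, assuming the jar file name that sbt derives from the name and version settings above and the Spark installation path from the question:

# run from the project's root directory (the one containing build.sbt)
sbt package

# submit the packaged jar; the path under target/ assumes scalaVersion 2.11.x
/usr/local/spark/bin/spark-submit \
  --class "SimpleApp" \
  --master "local[4]" \
  target/scala-2.11/simple-project_2.11-1.0.jar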