docker - Spark Streaming Kafka integration problem
Problem description
I am using Docker on a Windows machine to run my sample Spark + Kafka project. I am hitting:
Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;
build.sbt
lazy val root = (project in file(".")).
  settings(
    inThisBuild(List(
      version := "0.1.0",
      scalaVersion := "2.12.2",
      assemblyJarName in assembly := "sparktest.jar"
    )),
    name := "sparktest",
    libraryDependencies ++= List(
      "org.apache.spark" %% "spark-core" % "2.4.0",
      "org.apache.spark" %% "spark-sql" % "2.4.0",
      "org.apache.spark" %% "spark-streaming" % "2.4.0",
      "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0" % "provided",
      "org.apache.kafka" %% "kafka" % "2.1.0",
      "org.scalatest" %% "scalatest" % "3.0.5",
      "com.typesafe.scala-logging" %% "scala-logging" % "3.9.0"
    ),
    dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.9.8",
    dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.9.8",
    dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-scala_2.12" % "2.9.8"
  )

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x                             => MergeStrategy.first
}
Code
val inputStreamDF = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "kafka:9092")
  .option("subscribe", "test1")
  .option("startingOffsets", "earliest")
  .load()
Has anyone run into a similar problem, and how did you solve it?
Solution
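This error means Spark cannot resolve format("kafka") at runtime, i.e. the spark-sql-kafka-0-10 connector is not visible on the classpath. With a build.sbt like the one above there are two usual causes: the connector is marked "provided", so sbt-assembly leaves it out of the fat jar entirely; and even if it were bundled, the MergeStrategy.discard rule for all of META-INF would throw away META-INF/services/org.apache.spark.sql.sources.DataSourceRegister, the ServiceLoader file Spark uses to discover the kafka source. Below is a sketch of both changes (not verified against this exact Docker setup):

```scala
// build.sbt -- sketch of the two changes that usually fix this error

// 1. Drop the "provided" scope so the connector is bundled into the assembly jar
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0"

// 2. Concatenate META-INF/services/* instead of discarding it; these are the
//    ServiceLoader registration files Spark reads to resolve format("kafka")
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  case _                                         => MergeStrategy.first
}
```

Alternatively, keep the "provided" scope and supply the connector at submit time, as the Structured Streaming + Kafka Integration Guide describes, e.g. spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:2.4.0 followed by your usual arguments. Either route puts the data source and its service registration on the runtime classpath.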