Connecting to a SQL database from Spark

Problem description

I am trying to connect to a SQL database from Spark, and I used the following commands:

scala> import org.apache.spark.sql.SQLContext                                                                                                                                                                      
import org.apache.spark.sql.SQLContext

scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)                                                                                                                                                    
warning: there was one deprecation warning; re-run with -deprecation for details
sqlcontext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@2bf4fa1

scala> val dataframe_mysql = sqlcontext.read.format("jdbc").option("url", "jdbc:sqlserver:192.168.103.64/DRE").option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver").option("dbtable", "NCentralAlerts")
.option("user", "sqoop").option("password", "hadoop").load()
java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
  at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:79)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
  ... 49 elided

I can see that Spark is looking for the SQL driver. In which directory do I need to put that SQL driver?

Tags: apache-spark

Solution


From the logs I can see you are trying to run this from the spark-shell. Assuming you have the JDBC driver jar on hand, start spark-shell with it added:

spark-shell --jars /path/to/driver.jar

This way the jar is added to your classpath and you will be able to use the driver.
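
For completeness, here is a minimal sketch of the read once the jar is on the classpath, reusing the host, database, table and credentials from your snippet. The jar path is illustrative, and note that the Microsoft SQL Server JDBC driver expects a URL of the form jdbc:sqlserver://host:port;databaseName=..., which differs slightly from the URL in your command:

// Start the shell with the driver jar (path/name is just an example):
//   spark-shell --jars /path/to/mssql-jdbc.jar

// In Spark 2.x the SparkSession `spark` is already in scope in spark-shell,
// so there is no need to create the deprecated SQLContext.
val dataframe_mysql = spark.read
  .format("jdbc")
  // host and database taken from your snippet; 1433 is the default SQL Server port
  .option("url", "jdbc:sqlserver://192.168.103.64:1433;databaseName=DRE")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("dbtable", "NCentralAlerts")
  .option("user", "sqoop")
  .option("password", "hadoop")
  .load()

// Quick sanity check that the connection and table read work
dataframe_mysql.show(5)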

Hope this helps.

