首页 > 解决方案 > Spark - 从数据库中获取表

问题描述

我必须对给定数据库中的所有表执行操作,因此我使用以下代码。

但是,它也给了我视图,有没有办法只过滤表?

代码

  def getTables(databaseName: String)(implicit spark: SparkSession): Array[String] = {
    val tables = spark.sql(s"show tables from ${databaseName}").collect().map(_(1).asInstanceOf[String])
    logger.debug(s"${tables.mkString(",")} found")
    tables
  }

另外,“显示视图显示错误”

scala> spark.sql("show views from gshah03;").show
org.apache.spark.sql.catalyst.parser.ParseException:
missing 'FUNCTIONS' at 'from'(line 1, pos 11)

== SQL ==
show views from gshah03;
-----------^^^

  at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
  ... 49 elided

标签: apache-sparkapache-spark-sql

解决方案


尝试这个-

  val df = spark.range(1, 5)
    df.createOrReplaceTempView("df_view")
    println(spark.catalog.currentDatabase)
    val db: Database = spark.catalog.getDatabase(spark.catalog.currentDatabase)
    val tables: Dataset[Table] = spark.catalog.listTables(db.name)
    tables.show(false)

    /**
      * default
      * +-------+--------+-----------+---------+-----------+
      * |name   |database|description|tableType|isTemporary|
      * +-------+--------+-----------+---------+-----------+
      * |df_view|null    |null       |TEMPORARY|true       |
      * +-------+--------+-----------+---------+-----------+
      */

推荐阅读