Add quotes around each element of a list

Problem description

I need to select data from a Teradata table using a variable I created earlier in Spark:

%spark
sqlContext.setConf("spark.sql.parquet.writeLegacyFormat", "true")
val query = "select distinct cod_contrato from xxx.contratos"
val df = sqlContext.sql(query)
val dfv = df.select("cod_contrato")

The variable dfv is a DataFrame with a single string column.

I want to query data using that vector of strings.

If I use:

%spark

val sql = s"(SELECT * FROM xx2.CONTRATOS where cod_contrato in '$dfv') as query"

I get:

(SELECT * FROM xx2.CONTRATOS where cod_contrato in '[cod_contrato: string]') as query

The desired result is:

SELECT * FROM xx2.CONTRATOS where cod_contrato in ('11111', '11112' )

How can I convert the vector into a list enclosed in () with each element wrapped in quotes?

Thanks

Tags: scala, apache-spark, apache-spark-sql

Solution


Here is my attempt. Starting from some DataFrame,

val test = df.select("id").as[String].collect
> test: Array[String] = Array(6597, 8011, 2597, 5022, 5022, 6852, 6852, 5611, 14838, 14838, 2588, 2588)

So test is now an Array[String]. Then, using mkString,

// Wrap the subquery in parentheses so that "as query" aliases it, as in the question
val sql = "(SELECT * FROM xx2.CONTRATOS where cod_contrato in " + test.mkString("('", "','", "')") + ") as query"
> sql: String = (SELECT * FROM xx2.CONTRATOS where cod_contrato in ('6597','8011','2597','5022','5022','6852','6852','5611','14838','14838','2588','2588')) as query

The final result is now a string.
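
The mkString approach above can be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: the names InListExample and toInList are hypothetical, and doubling embedded single quotes is assumed to be the right escaping for the target database (it is the standard SQL convention, but check it for your system). It also drops duplicate values, since the collected array above contains repeats, and it rejects an empty input because "in ()" is not valid SQL.

```scala
// Hypothetical helper for illustration; the names are not from the original answer.
object InListExample {

  // Builds a SQL IN list such as ('a', 'b') from a sequence of strings.
  def toInList(values: Seq[String]): String = {
    require(values.nonEmpty, "IN list must not be empty")
    values.distinct                                  // drop repeats, keep first-seen order
      .map(v => "'" + v.replace("'", "''") + "'")    // quote each value, double embedded quotes (assumed escaping)
      .mkString("(", ", ", ")")                      // wrap in parentheses, comma-separated
  }

  def main(args: Array[String]): Unit = {
    // Sample values standing in for the collected column
    val ids = Seq("6597", "8011", "6597", "O'Neil")
    val sql = "(SELECT * FROM xx2.CONTRATOS where cod_contrato in " + toInList(ids) + ") as query"
    println(sql)
  }
}
```

With the collected array from the answer, the call would be InListExample.toInList(test). Collecting to the driver is only reasonable when the number of distinct values is small, because every value ends up in the query text.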

