Passing columns named in a list of strings as arguments to a udf

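Question

I have a list of strings that represent column names in a DataFrame. I want to pass the values of these columns as arguments to a udf. How can I do this in Spark Scala?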

   val actualDF = Seq(
             ("beatles", "help|hey jude","sad",4),
             ("romeo", "eres mia","old school",56)
            ).toDF("name", "hit_songs","genre","xyz")


   val column_list: List[String] = List("hit_songs","name","genre")

   // example udf
   val testudf = org.apache.spark.sql.functions.udf((s1: String, s2: String) => {
     // lets say I want to concat all values
   })


   val finalDF = actualDF.withColumn("test_res",testudf(col(column_list(0))))

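In the example above, I want to pass my column_list to the udf. I am not sure how to pass the complete list of strings that represent the column names. For a single element I can see that col(column_list(0)) works. Please help.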

Tags: scala, apache-spark, apache-spark-sql

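Answer

In the output below, hit_songs in actualDF is an array column (Seq[String]), not the pipe-delimited string shown in singersDF, so the first parameter of the udf needs to be Seq[String].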

scala> singersDF.show(false)
+-------+-------------+----------+
|name   |hit_songs    |genre     |
+-------+-------------+----------+
|beatles|help|hey jude|sad       |
|romeo  |eres mia     |old school|
+-------+-------------+----------+
scala> actualDF.show(false)
+-------+----------------+----------+
|name   |hit_songs       |genre     |
+-------+----------------+----------+
|beatles|[help, hey jude]|sad       |
|romeo  |[eres mia]      |old school|
+-------+----------------+----------+
scala> column_list
res27: List[String] = List(hit_songs, name)
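
The answer does not show how actualDF was derived from singersDF. A minimal sketch, assuming the array column was produced by splitting hit_songs on the `|` character with the built-in split function (the local SparkSession setup here is also an assumption, added only to make the snippet self-contained):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, split}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("split-hit-songs").getOrCreate()
import spark.implicits._

val singersDF = Seq(
  ("beatles", "help|hey jude", "sad"),
  ("romeo", "eres mia", "old school")
).toDF("name", "hit_songs", "genre")

// split takes a regex, so the pipe has to be escaped.
val actualDF = singersDF.withColumn("hit_songs", split(col("hit_songs"), "\\|"))
```

With this, hit_songs has type array<string>, which a Scala udf receives as Seq[String].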

Change your UDF as follows.

// s1 is of type Seq[String]
val testudf = udf((s1:Seq[String],s2:String) => {
    s1.mkString.concat(s2)
})

Apply the UDF

scala> actualDF
.withColumn("test_res",testudf(col(column_list.head),col(column_list.last)))
.show(false)
+-------+----------------+----------+-------------------+
|name   |hit_songs       |genre     |test_res           |
+-------+----------------+----------+-------------------+
|beatles|[help, hey jude]|sad       |helphey judebeatles|
|romeo  |[eres mia]      |old school|eres miaromeo      |
+-------+----------------+----------+-------------------+

Without a UDF

scala> actualDF.withColumn("test_res",concat_ws("",$"name",$"hit_songs")).show(false) // Without UDF.
+-------+----------------+----------+-------------------+
|name   |hit_songs       |genre     |test_res           |
+-------+----------------+----------+-------------------+
|beatles|[help, hey jude]|sad       |beatleshelphey jude|
|romeo  |[eres mia]      |old school|romeoeres mia      |
+-------+----------------+----------+-------------------+
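
The examples above pass only the first and last entries of column_list. To pass every column named in the list, one option is to map the names to Column objects and expand them as varargs. The sketch below uses the DataFrame from the question, where all three listed columns are plain strings. The name concatAll and the local SparkSession setup are hypothetical, introduced here for illustration. Note that array requires all the columns to share one type.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col, concat_ws, udf}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("udf-column-list").getOrCreate()
import spark.implicits._

val actualDF = Seq(
  ("beatles", "help|hey jude", "sad", 4),
  ("romeo", "eres mia", "old school", 56)
).toDF("name", "hit_songs", "genre", "xyz")

val column_list: List[String] = List("hit_songs", "name", "genre")

// Hypothetical udf: receives the values of all listed columns as one Seq[String].
val concatAll = udf((values: Seq[String]) => values.mkString)

// Pack the listed columns into a single array column and pass that to the udf.
val viaUdf = actualDF.withColumn("test_res", concatAll(array(column_list.map(col): _*)))

// Same result without a udf: concat_ws accepts the columns as varargs.
val viaBuiltin = actualDF.withColumn("test_res", concat_ws("", column_list.map(col): _*))
```

Because the udf takes a single Seq[String], its signature does not have to change when column_list grows or shrinks. The two variants differ on null values: concat_ws skips them, while mkString renders them as the text "null".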
