Select a literal based on a column value in Spark

Problem description

I have a map:

val map = Map("A" -> 1, "B" -> 2)

And I have a DataFrame; a column in the DataFrame contains the keys of the map. I am trying to select, into a new DataFrame, a column that holds the map values corresponding to those keys:

val newDF = DfThatContainsTheKeyColumn.select(concat(col(SomeColumn), lit("|"),
    lit(map.get(col(ColumnWithKey).toString()).get) as newColumn)

But this is resulting in the following error:

java.lang.RuntimeException: Unsupported literal type class scala.None$ None

I made sure that the column ColumnWithKey has As and Bs only and does not have empty values in it.

Is there another way to get the result I am looking for? Any help would be appreciated.

Tags: scala, apache-spark

Solution


The problem with this statement (leaving aside the syntax errors) is the following:

val newDF = DfThatContainsTheKeyColumn.select(concat(col(SomeColumn), lit("|"),
    lit(map.get(col(ColumnWithKey).toString()).get) as newColumn)

col(ColumnWithKey) does not take the value of a particular row; it is just a column expression determined by the schema, i.e. a constant from the driver's point of view. So the map lookup happens once on the driver, with the expression's string representation as the key; it finds nothing, returns None, and lit cannot turn None into a literal, hence the "Unsupported literal type class scala.None$" error.
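A minimal sketch of why the driver-side lookup misses (assuming a column named "myCol"; the exact string form of the Column is irrelevant, the point is that it is never a row value):

```scala
import org.apache.spark.sql.functions.col

val map = Map("A" -> 1, "B" -> 2)

// col("myCol").toString() is the string form of the column *expression*,
// not the value of any row, so the map lookup can never hit "A" or "B":
val lookup = map.get(col("myCol").toString()) // None

// Passing that Option (or None) on to lit(...) is what triggers
// "Unsupported literal type class scala.None$".
```

The same failure happens for any column name, because the lookup runs once on the driver before Spark ever sees a row.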

In your case, I suggest joining your map to your DataFrame:

import org.apache.spark.sql.functions.{broadcast, concat, lit}
import spark.implicits._ // for toDF and $

val map = Map("A" -> 1, "B" -> 2)
val df_map = map.toSeq.toDF("key", "value")

val DfThatContainsTheKeyColumn = Seq(
  "A",
  "A",
  "B",
  "B"
).toDF("myCol")


DfThatContainsTheKeyColumn
  .join(broadcast(df_map), $"myCol" === $"key")
  .select(concat($"myCol", lit("|"), $"value").as("newColumn"))
  .show()

+---------+
|newColumn|
+---------+
|      A|1|
|      A|1|
|      B|2|
|      B|2|
+---------+
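If you would rather avoid the join, another sketch (assuming Spark 2.2+ for `typedLit`, and the same `map` and `DfThatContainsTheKeyColumn` as above) is to turn the whole map into a map-typed column and index it per row with the key column:

```scala
import org.apache.spark.sql.functions.{concat, lit, typedLit}

// typedLit (unlike lit) can encode a Scala Map as a map<string,int> column.
val mapCol = typedLit(map)

DfThatContainsTheKeyColumn
  .select(concat($"myCol", lit("|"), mapCol($"myCol")).as("newColumn"))
  .show()
```

Here the lookup `mapCol($"myCol")` is evaluated per row by Spark, which is exactly what the original driver-side `map.get(...)` could not do. Keys missing from the map yield null, so the join solution is preferable if you need to filter unmatched rows.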
