Dataset usage in "mapGroupsWithState" of Spark SQL

Question

My events carry an id and a Map[String, List] payload. I group them by id and then compute some results with mapGroupsWithState.

Can I use the from_json() function inside mapGroupsWithState? More generally, can I use a Dataset/DataFrame inside mapGroupsWithState?

For example:

df.groupByKey(...).mapGroupsWithState { (id, events, state) =>
   val anotherDF = events.toDF
   ... other operations ...
}

Tags: apache-spark, apache-spark-sql, spark-structured-streaming

Answer


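Can I use from_json() in mapGroupsWithState? So, can I use a Dataset/DataFrame in mapGroupsWithState?

The answer to both questions is no, loosely speaking, at least not in any standard way. The function you pass to mapGroupsWithState runs on the executors: it receives the key, an iterator over that group's rows, and the group's state as ordinary objects, and you write custom code against them without the DataFrame abstraction. from_json() builds a Column expression for the query plan, so it has to be applied before the grouping; inside the function, parse JSON with a regular JSON library instead.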
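To make the split concrete, here is a minimal sketch of one way to arrange it: from_json() runs on the DataFrame before grouping, and the state function only touches plain Scala values. The Event and Summary case classes, the sample rows, and the counting logic are all hypothetical, guessed from the "id and Map[String, List]" description, and the sketch assumes a Spark version in which from_json() accepts a MapType as the top-level schema. It uses a batch Dataset for brevity.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}
import org.apache.spark.sql.types.{ArrayType, MapType, StringType}

// Hypothetical shapes, guessed from the "id and Map[String, List]" description.
case class Event(id: String, attrs: Map[String, List[String]])
case class Summary(id: String, valuesSeen: Int)

object StatefulSketch {
  // Pure Scala helper: counts list entries across one group's events.
  def countValues(events: Iterator[Event]): Int =
    events.map(_.attrs.values.map(_.size).sum).sum

  // Runs on the executors; no SparkSession or DataFrame is used in here.
  def update(id: String, events: Iterator[Event], state: GroupState[Int]): Summary = {
    val total = state.getOption.getOrElse(0) + countValues(events)
    state.update(total)
    Summary(id, total)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("sketch").getOrCreate()
    import spark.implicits._

    // Made-up sample rows: an id plus a raw JSON payload.
    val raw = Seq(
      ("a", """{"k1":["x","y"]}"""),
      ("a", """{"k2":["z"]}"""),
      ("b", """{"k1":["w"]}""")
    ).toDF("id", "json")

    // from_json is applied here, as part of the query plan, before grouping.
    val events = raw
      .select($"id", from_json($"json", MapType(StringType, ArrayType(StringType))).as("attrs"))
      .as[Event]

    val summaries = events
      .groupByKey(_.id)
      .mapGroupsWithState[Int, Summary](GroupStateTimeout.NoTimeout)(update _)

    summaries.show()
    spark.stop()
  }
}
```

On a batch Dataset the function is invoked once per group and the state starts out empty each time. In a streaming query the same update function would be called per micro-batch with the state carried over, and the query would need the Update output mode.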

