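scala - Code to generate an exam assessment report with question IDs as columns and answers as their values
Problem description
I need to write code that generates an exam assessment report (given the answers and attempts of multiple students) with question IDs as columns and the answers as their values. Note that a student (participant) can answer one or more questions within a single assessment and geo tag.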
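/** Input data (assumes a SparkSession named `spark`, as in spark-shell) */
import spark.implicits._
import org.apache.spark.sql.functions.first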
val inputDf = Seq(
(1, "Question1Text", "Yes", "abcde1", 0, List("x1", "y1")),
(2, "Question2Text", "No", "abcde1", 0, List("x1", "y1")),
(3, "Question3Text", "3", "abcde1", 0, List("x1", "y1")),
(1, "Question1Text", "No", "abcde2", 0, List("x2", "y2")),
(2, "Question2Text", "Yes", "abcde2", 0, List("x2", "y2"))
).toDF("Qid", "Question", "AnswerText", "ParticipantID", "Assessment", "GeoTag")
println("Input:")
inputDf.show(false)
My attempt:
inputDf
.groupBy($"ParticipantID")
.pivot("Question")
.agg(first($"ParticipantID"))
.sort($"ParticipantID")
But this is what I get:
Input:
+---+-------------+----------+-------------+----------+--------+
|Qid|Question |AnswerText|ParticipantID|Assessment|GeoTag |
+---+-------------+----------+-------------+----------+--------+
|1 |Question1Text|Yes |abcde1 |0 |[x1, y1]|
|2 |Question2Text|No |abcde1 |0 |[x1, y1]|
|3 |Question3Text|3 |abcde1 |0 |[x1, y1]|
|1 |Question1Text|No |abcde2 |0 |[x2, y2]|
|2 |Question2Text|Yes |abcde2 |0 |[x2, y2]|
+---+-------------+----------+-------------+----------+--------+
Expected:
+-------------+----------+--------+-----+-----+-----+
|ParticipantID|Assessment|GeoTag |Qid_1|Qid_2|Qid_3|
+-------------+----------+--------+-----+-----+-----+
|abcde1 |0 |[x1, y1]|Yes |No |3 |
|abcde2 |0 |[x2, y2]|No |Yes |null |
+-------------+----------+--------+-----+-----+-----+
Actual:
+-------------+-------------+-------------+-------------+
|ParticipantID|Question1Text|Question2Text|Question3Text|
+-------------+-------------+-------------+-------------+
|abcde1 |abcde1 |abcde1 |abcde1 |
|abcde2 |abcde2 |abcde2 |null |
+-------------+-------------+-------------+-------------+
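Solution
Group by every column you want to keep (ParticipantID, Assessment, GeoTag) and aggregate AnswerText instead of ParticipantID: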
inputDf
.groupBy($"ParticipantID", $"Assessment", $"GeoTag")
.pivot("Question")
.agg(first($"AnswerText"))
.sort($"ParticipantID")
.show(false)
You can rename the columns afterwards as needed.
Output:
+-------------+----------+--------+-------------+-------------+-------------+
|ParticipantID|Assessment|GeoTag |Question1Text|Question2Text|Question3Text|
+-------------+----------+--------+-------------+-------------+-------------+
|abcde1 |0 |[x1, y1]|Yes |No |3 |
|abcde2 |0 |[x2, y2]|No |Yes |null |
+-------------+----------+--------+-------------+-------------+-------------+
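To get the Qid_1, Qid_2, Qid_3 headers from the Expected table, one option is to pivot on Qid instead of Question and then prefix the pivoted columns. The following is a minimal sketch, not part of the original answer: the object name QidReport, the helper pivotByQid, and the local SparkSession setup are assumptions made for illustration. Note that first picks an arbitrary row if a participant has several answers for the same question within one group.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

// Hypothetical helper object, introduced only for this example.
object QidReport {
  private val keyCols = Seq("ParticipantID", "Assessment", "GeoTag")

  def pivotByQid(inputDf: DataFrame): DataFrame = {
    // Pivoting on the integer Qid column yields columns named "1", "2", "3", ...
    val pivoted = inputDf
      .groupBy(keyCols.head, keyCols.tail: _*)
      .pivot("Qid")
      .agg(first("AnswerText"))

    // Prefix every pivoted column with "Qid_", leaving the grouping columns as-is.
    pivoted.columns
      .filterNot(keyCols.contains)
      .foldLeft(pivoted)((df, c) => df.withColumnRenamed(c, s"Qid_$c"))
      .sort("ParticipantID")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("qid-report")
      .getOrCreate()
    import spark.implicits._

    val inputDf = Seq(
      (1, "Question1Text", "Yes", "abcde1", 0, List("x1", "y1")),
      (2, "Question2Text", "No", "abcde1", 0, List("x1", "y1")),
      (3, "Question3Text", "3", "abcde1", 0, List("x1", "y1")),
      (1, "Question1Text", "No", "abcde2", 0, List("x2", "y2")),
      (2, "Question2Text", "Yes", "abcde2", 0, List("x2", "y2"))
    ).toDF("Qid", "Question", "AnswerText", "ParticipantID", "Assessment", "GeoTag")

    pivotByQid(inputDf).show(false)
    spark.stop()
  }
}
```

Renaming after the pivot, rather than building a string such as "Qid_1" beforehand, keeps the pivoted columns in numeric Qid order instead of lexicographic order (which would place Qid_10 before Qid_2).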