if-statement - How to perform a nested when/otherwise in PySpark?
Problem description
Hi everyone, I am trying to interpret this Power BI syntax and convert it to PySpark:
if(UCS_Incidents[Intensity]="Very High",
IF(UCS_Incidents[Severity]="Very High","Red",
IF(UCS_Incidents[Severity]="High","Red",
IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),
if(UCS_Incidents[Intensity]="High",
IF(UCS_Incidents[Severity]="Very High","Red",
IF(UCS_Incidents[Severity]="High","Orange",
IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),
if(UCS_Incidents[Intensity]="Medium",
IF(UCS_Incidents[Severity]="Very High","Orange",
IF(UCS_Incidents[Severity]="High","Yellow",
IF(UCS_Incidents[Severity]="Medium","Yellow","Green"))),
if(UCS_Incidents[Intensity]="Low",
IF(UCS_Incidents[Severity]="Very High","Yellow",
IF(UCS_Incidents[Severity]="High","Green",
IF(UCS_Incidents[Severity]="Medium","Green","Green"))),
""))))
This is what I tried:
Intensities = df.withColumn(('Intensities',f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Very High') , "Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'High') , "Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Medium') , "Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Very High') , "Red").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'High') , "Orange").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Medium') , "Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Very High') , "Orange").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'High') , "Yellow").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Medium') , "Yellow")
.otherwise('Green'))))
.otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Very High') , "Yellow").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'High') , "Green").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Medium') , "Green")
.otherwise('Green'))))
).otherwise("")
However, I get this error:
A Tuple Object dosen't have an attribute Otherwise
Any help would be appreciated, thanks.
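Solution
Just to illustrate what @jxc meant: assuming you already have a DataFrame named df, you can write the whole rule as a single SQL CASE expression and pass it to expr: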
from pyspark.sql.functions import expr
Intensities = df.withColumn('Intensities', expr("CASE WHEN Intensity = 'Very High' AND Severity = 'Very High' THEN 'Red' WHEN .... ELSE ... END"))
I used "..." as placeholders for the remaining branches, but I think it makes the approach clear.
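For reference, the error in the question appears to come from the extra opening parenthesis in `df.withColumn((`: it wraps the column name and the first `when` chain in a tuple, so the following `.otherwise(...)` is called on that tuple rather than on a Column. The sketch below is one possible way to produce the full CASE expression without typing every branch by hand. The lookup tables are transcribed from the Power BI formula in the question; the names `COLOURS`, `DEFAULTS` and `build_case_sql` are hypothetical helpers introduced only for this illustration.

```python
# Intensity -> Severity -> colour, transcribed from the Power BI formula above.
COLOURS = {
    "Very High": {"Very High": "Red", "High": "Red", "Medium": "Orange"},
    "High": {"Very High": "Red", "High": "Orange", "Medium": "Orange"},
    "Medium": {"Very High": "Orange", "High": "Yellow", "Medium": "Yellow"},
    "Low": {"Very High": "Yellow", "High": "Green", "Medium": "Green"},
}

# Colour used for each Intensity when Severity matches none of the values above
# (the innermost "else" of each nested IF in the formula).
DEFAULTS = {"Very High": "Yellow", "High": "Yellow", "Medium": "Green", "Low": "Green"}


def build_case_sql():
    """Build one flat CASE WHEN ... END string covering every branch."""
    branches = []
    for intensity, by_severity in COLOURS.items():
        for severity, colour in by_severity.items():
            branches.append(
                f"WHEN Intensity = '{intensity}' AND Severity = '{severity}' "
                f"THEN '{colour}'"
            )
        # Fallback for this Intensity; it must come after the specific branches
        # because CASE returns the first branch whose condition is true.
        branches.append(f"WHEN Intensity = '{intensity}' THEN '{DEFAULTS[intensity]}'")
    # Any other Intensity value maps to the empty string, as in the formula.
    return "CASE " + " ".join(branches) + " ELSE '' END"


case_sql = build_case_sql()

# With PySpark available and df defined as in the answer, this would be used as:
#   from pyspark.sql.functions import expr
#   Intensities = df.withColumn('Intensities', expr(case_sql))
```

Because a CASE expression is evaluated top to bottom, the flat list of branches needs no nesting at all, which also removes the parenthesis bookkeeping that caused the original error.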