How to perform a nested when/else in PySpark?

Problem description

Hi everyone, I'm trying to interpret this Power BI syntax and convert it to PySpark:

 if(UCS_Incidents[Intensity]="Very High",
 IF(UCS_Incidents[Severity]="Very High","Red",
 IF(UCS_Incidents[Severity]="High","Red",
 IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),

 if(UCS_Incidents[Intensity]="High",
 IF(UCS_Incidents[Severity]="Very High","Red",
 IF(UCS_Incidents[Severity]="High","Orange",
 IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),

 if(UCS_Incidents[Intensity]="Medium",
 IF(UCS_Incidents[Severity]="Very High","Orange",
 IF(UCS_Incidents[Severity]="High","Yellow",
 IF(UCS_Incidents[Severity]="Medium","Yellow","Green"))),

 if(UCS_Incidents[Intensity]="Low",
 IF(UCS_Incidents[Severity]="Very High","Yellow",
 IF(UCS_Incidents[Severity]="High","Green",
 IF(UCS_Incidents[Severity]="Medium","Green","Green"))),

 ""))))

This is what I tried:

 Intensities = df.withColumn(('Intensities',f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Very High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Medium') , "Orange")
                        .otherwise('Yellow'))))
                        .otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Very High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'High') , "Orange").
                        otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Medium') , "Orange")
                        .otherwise('Yellow'))))
                        .otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Very High') , "Orange").
                        otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'High') , "Yellow").
                        otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Medium') , "Yellow")
                        .otherwise('Green'))))
                        .otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Very High') , "Yellow").
                        otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'High') , "Green").
                        otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Medium') , "Green")
                        .otherwise('Green'))))

                        ).otherwise("")

However, I got this error:

  A Tuple Object doesn't have an attribute Otherwise

Any help would be greatly appreciated, thanks.

Tags: if-statement, pyspark, case-when

Solution


Just to illustrate what @jxc meant: assuming you already have a DataFrame named df:

from pyspark.sql.functions import expr

Intensities = df.withColumn('Intensities', expr("CASE WHEN Intensity = 'Very High' AND Severity = 'Very High' THEN 'Red' WHEN .... ELSE ... END"))

I used "..." as placeholders, but I think it makes the approach clear.
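
As a fuller sketch of the same idea, the rules from the Power BI formula in the question can be written out as data and turned into either a CASE expression for expr or a flat chain of when(...).when(...).otherwise(...). The helper names (RULES, build_case_sql, build_when_column, add_intensities) are made up for this illustration, and the column names Intensity and Severity are assumed from the question. The error in the question appears to come from the extra opening parenthesis in df.withColumn((, which makes Python build a tuple ('Intensities', <column>) and then call .otherwise on that tuple; a flat chain avoids that nesting altogether.

```python
# Rules transcribed from the Power BI formula in the question:
# intensity -> ([(severity, color), ...], color for any other severity).
# All names in this block are hypothetical helpers, not part of PySpark.
RULES = {
    "Very High": ([("Very High", "Red"), ("High", "Red"), ("Medium", "Orange")], "Yellow"),
    "High": ([("Very High", "Red"), ("High", "Orange"), ("Medium", "Orange")], "Yellow"),
    "Medium": ([("Very High", "Orange"), ("High", "Yellow"), ("Medium", "Yellow")], "Green"),
    "Low": ([("Very High", "Yellow"), ("High", "Green"), ("Medium", "Green")], "Green"),
}


def build_case_sql(rules=RULES):
    """Build a SQL CASE expression string suitable for pyspark.sql.functions.expr.

    Values are inlined without escaping, which is fine for these fixed labels
    but would not be safe for arbitrary input.
    """
    parts = ["CASE"]
    for intensity, (pairs, fallback) in rules.items():
        for severity, color in pairs:
            parts.append(
                f"WHEN Intensity = '{intensity}' AND Severity = '{severity}' THEN '{color}'"
            )
        # Any other severity for this intensity (the innermost DAX "else").
        parts.append(f"WHEN Intensity = '{intensity}' THEN '{fallback}'")
    # Any other intensity (the outermost DAX "else").
    parts.append("ELSE '' END")
    return " ".join(parts)


def build_when_column(rules=RULES):
    """Build the same logic as a flat when().when()...otherwise() chain."""
    from pyspark.sql import functions as f  # imported lazily; needs PySpark installed

    column = None
    for intensity, (pairs, fallback) in rules.items():
        is_intensity = f.col("Intensity") == intensity
        branches = [
            (is_intensity & (f.col("Severity") == severity), color)
            for severity, color in pairs
        ]
        branches.append((is_intensity, fallback))
        for condition, color in branches:
            # The first branch starts the chain; later ones extend it.
            column = f.when(condition, color) if column is None else column.when(condition, color)
    return column.otherwise("")


def add_intensities(df, use_sql=True):
    """Add the 'Intensities' column to df using either variant."""
    from pyspark.sql.functions import expr  # imported lazily; needs PySpark installed

    new_column = expr(build_case_sql()) if use_sql else build_when_column()
    return df.withColumn("Intensities", new_column)
```

With an existing DataFrame df, Intensities = add_intensities(df) follows the expr approach above, while add_intensities(df, use_sql=False) uses the Column API chain. Branches are checked in order, so the per-intensity fallback has to come after the specific severity checks for that intensity.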

