python - PySpark code raises: TypeError: float() argument must be a string or a number

Problem description

I have the following PySpark code, which had been working fine until today:

row_stats = (
    dataframe
    .withColumn("exploded", explode(col("products")))
    .withColumn("score", col("exploded").getItem(target_field))
    .where(col("score").isNotNull())
    .select(mean_(col("score")).alias("mean"), stddev_(col("score")).alias("stddev"))
    .first()
)

mean = 0
std = 0
if row_stats is not None:
    print "row_stats.mean"
    print row_stats.mean
    mean = Decimal(float(row_stats.mean))
    std = Decimal(float(row_stats.stddev))

I get an error on the line mean = Decimal(float(row_stats.mean)):

TypeError: float() argument must be a string or a number

Output of the print statements:

<type 'NoneType'>
None

How do I handle this error properly so that mean and std end up equal to 0?

Tags: python, pyspark

Solution
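
A likely cause is that no rows survive the isNotNull() filter. In that case the aggregate still returns a single Row, so `row_stats is not None` passes, but the `mean` and `stddev` fields inside that Row are None. One possible way to handle this is to check each field rather than the Row itself. The following is a minimal sketch, not a definitive fix: the helper `to_decimal_or_zero` and the `StatsRow` stand-in are hypothetical names introduced here so the example runs without a Spark session.

```python
from collections import namedtuple
from decimal import Decimal


def to_decimal_or_zero(value):
    """Return Decimal(0) when value is None, otherwise Decimal(float(value))."""
    if value is None:
        return Decimal(0)
    return Decimal(float(value))


# Hypothetical stand-in for the Row returned by .first(). It mimics the case
# from the question, where the Row exists but both aggregates are None.
StatsRow = namedtuple("StatsRow", ["mean", "stddev"])
row_stats = StatsRow(mean=None, stddev=None)

mean = Decimal(0)
std = Decimal(0)
if row_stats is not None:
    # Guard each field separately instead of relying on the Row check alone.
    mean = to_decimal_or_zero(row_stats.mean)
    std = to_decimal_or_zero(row_stats.stddev)
```

With the real query, `row_stats` would be the result of `.first()` and the rest of the code would stay the same. Checking for None explicitly, rather than catching TypeError, keeps other conversion errors visible.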
