An error occurred while calling o1964.collectToPython: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0

Problem description

I am trying to convert a Spark DataFrame (the output of a fitted model's transform) to a Pandas DataFrame.

from pyspark.ml.regression import GBTRegressor

gbt = GBTRegressor(featuresCol="features", labelCol="Price", maxIter=10)
gbtModel = gbt.fit(training_data)
predictions_gbt = gbtModel.transform(testing_data)

predictions_gbt.select("features", "Price", "prediction").show()

prediction_gbt_test = gbtModel.transform(finalized_test_data)

prediction_gbt_test.toPandas()

This code produces the following error:

Py4JJavaError: An error occurred while calling o1964.collectToPython.: 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 166.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 166.0 (TID 166, 86f0177ce5fa, executor driver): 
org.apache.spark.SparkException: Failed to execute user defined 
function(GBTRegressionModel$$Lambda$3519/181923952: 

Can anyone help me resolve this "Job aborted" error?

Tags: python-3.x, pandas, machine-learning, pyspark, apache-spark-ml

Solution
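
The traceback quoted in the question is cut off before the underlying exception, so the real cause cannot be read from it. Java stack traces normally report the underlying exceptions on lines beginning with "Caused by:", and the last such line is usually the most specific. The following is only a minimal diagnostic sketch, not a fix: the helper name root_causes is hypothetical, the sample text is a made-up placeholder rather than real Spark output, and it assumes the full error text is available as a string (for example from str(e) on the exception raised by the toPandas() call).

```python
def root_causes(trace_text):
    """Return the messages of all 'Caused by:' lines, outermost first."""
    prefix = "Caused by:"
    causes = []
    for line in trace_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            causes.append(stripped[len(prefix):].strip())
    return causes


# Hypothetical usage with the failing call from the question:
#     try:
#         prediction_gbt_test.toPandas()
#     except Exception as e:
#         print(root_causes(str(e)))

# Made-up placeholder trace, used only to exercise the helper.
sample = (
    "org.apache.spark.SparkException: Job aborted due to stage failure\n"
    "\tat some.frame(Placeholder.java:1)\n"
    "Caused by: org.apache.spark.SparkException: Failed to execute user defined function\n"
    "\tat some.other.frame(Placeholder.java:2)\n"
    "Caused by: java.lang.RuntimeException: <placeholder root cause>\n"
)

causes = root_causes(sample)
print(causes[-1] if causes else "no 'Caused by:' line found")
```

Because the first transform on testing_data succeeds and only the one on finalized_test_data fails, the innermost cause is worth comparing against how finalized_test_data was prepared.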

