Tuning binomial logistic regression parameters in pyspark

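Problem description

I am trying to tune the parameters of a binomial logistic regression model in pyspark, but the result does not change at all between the first set of parameters and the second.

First logistic regression model, with default parameters.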

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# Split the prepared data into training (70%) and test (30%) sets.
train_data, test_data = pipe_df.randomSplit([0.7,0.3])
print("Training Dataset Count: " + str(train_data.count()))
print("Test Dataset Count: " + str(test_data.count()))

# First Logistic regression model without parameters.
lr_model = LogisticRegression(featuresCol='features',labelCol='state')

lr_model = lr_model.fit(train_data)

results = lr_model.transform(test_data)
evaluator = MulticlassClassificationEvaluator(
    labelCol="state", predictionCol="prediction", metricName="accuracy")
# Compute the accuracy on the test predictions before printing it.
accuracy = evaluator.evaluate(results)
print ("Test set accuracy = " + str(accuracy))

Accuracy

Test set accuracy = 0.6401755241345685

Second logistic regression model, with new parameters.

mlr = LogisticRegression(featuresCol='features',labelCol='state', maxIter=60, regParam=0.8, elasticNetParam=0.8, family="multinomial")

lrModel = mlr.fit(train_data)

results2 = lrModel.transform(test_data)
evaluator = MulticlassClassificationEvaluator(labelCol="state", predictionCol="prediction", metricName="accuracy")
accuracy = evaluator.evaluate(results2)
print ("Test set accuracy = " + str(accuracy))

Accuracy

Test set accuracy = 0.6401755241345685

Tags: python, pyspark

Solution
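A possible first diagnostic, sketched under assumptions rather than offered as a confirmed answer: when two differently configured models report exactly the same accuracy, it is worth checking whether both simply predict the most frequent label for nearly every row. In that case accuracy would equal the majority-class share of the test set regardless of the parameters. The helper names below (class_shares, majority_baseline, column_counts) are hypothetical and not part of pyspark; column_counts only assumes that its argument behaves like a Spark DataFrame (groupBy, count, collect).

```python
def class_shares(counts):
    """Turn {label: row count} into {label: fraction of all rows}."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline(counts):
    """Accuracy of a trivial model that always predicts the most frequent label."""
    return max(class_shares(counts).values())


def column_counts(df, col):
    """Collect the value counts of one DataFrame column into a plain dict.

    Assumes df is a Spark DataFrame; groupBy(col).count() yields rows
    with the fields `col` and "count".
    """
    return {row[col]: row["count"] for row in df.groupBy(col).count().collect()}


# Hypothetical usage with the DataFrames from the question
# (assumes test_data, results and results2 already exist):
# label_counts = column_counts(test_data, "state")
# print("Majority-class baseline:", majority_baseline(label_counts))
# print("Model 1 predictions:", column_counts(results, "prediction"))
# print("Model 2 predictions:", column_counts(results2, "prediction"))
```

If the baseline matches the reported accuracy and both prediction columns contain essentially a single value, the features are probably not separating the classes in either model. For the second model, regParam=0.8 with elasticNetParam=0.8 is a strong, mostly L1 penalty that can shrink every coefficient to zero, so inspecting lrModel.coefficientMatrix is a reasonable next step. Trying much smaller regParam values would show whether the regularization is what flattens the predictions.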

