Regression with multiple target variables

Problem description

I have a data frame; these are my predictors.

min_time   max_time cluster_label  Day  Week
6000       9000           2        0    3
7000       9000           1        3    3
3000       5300           3        2    4
5000       6000           2        5    4
..

Using these features, I need to predict 4 values (the target variables y1, y2, y3, y4):

route_count  Delivieres  Distance  TotalTime
18           22          290       3500
22           21          334       5400
19           23          503       3900
20           44          674       4000
21           45          398       6600

How can I do this? Here is what I have tried so far, but I am not sure whether a random forest can output predictions for multiple variables.

from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# X holds the predictor columns, y holds the four target columns
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101)

# "absolute_error" is the current name of the former "mae" criterion
rfg = RandomForestRegressor(n_estimators=100, criterion="absolute_error")
rfg.fit(X_train, y_train)
y_pred = rfg.predict(X_test)
rfg.score(X_test, y_test)  # R^2 on the held-out data

Tags: python, machine-learning, scikit-learn, regression

Solution


MultiOutputRegressor can do this. Just use it as a wrapper around the base regressor; it fits one copy of that regressor per target column.

from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.multioutput import MultiOutputRegressor

# X holds the predictor columns, y holds the four target columns
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101)

# Wrap the forest so that one independent model is trained per target;
# "absolute_error" is the current name of the former "mae" criterion
rfg = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=100, criterion="absolute_error"))
rfg.fit(X_train, y_train)
y_pred = rfg.predict(X_test)   # shape: (n_test_samples, 4)
rfg.score(X_test, y_test)      # R^2 averaged over the four targets
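
Since the snippet above leaves X and y undefined, here is a minimal self-contained sketch of the same wrapper approach. The data is synthetic and purely an assumption (200 random rows, 5 predictors, 4 targets), standing in for the real data frame; the names wrapped, native and mae_per_target are illustrative only. As far as I know, RandomForestRegressor also accepts a two-dimensional y directly, so the sketch fits it without the wrapper as well for comparison.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in data (an assumption, not the asker's data):
# 5 predictor columns and 4 target columns.
rng = np.random.default_rng(101)
X = rng.uniform(0, 1, size=(200, 5))
W = rng.normal(size=(5, 4))
y = X @ W + rng.normal(scale=0.05, size=(200, 4))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101)

# Wrapper approach: one independent forest is fitted per target column.
wrapped = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=50, random_state=0))
wrapped.fit(X_train, y_train)
y_pred = wrapped.predict(X_test)

# One mean absolute error per target instead of a single average.
mae_per_target = mean_absolute_error(y_test, y_pred, multioutput="raw_values")
print("MAE per target:", mae_per_target)
print("Mean R^2 over targets:", wrapped.score(X_test, y_test))

# Native approach: a single forest fitted on the 2-D target array.
native = RandomForestRegressor(n_estimators=50, random_state=0)
native.fit(X_train, y_train)
y_pred_native = native.predict(X_test)
print("Native prediction shape:", y_pred_native.shape)
```

The wrapper trains a separate model for each target, so each one can specialise, while the native forest shares its trees across all targets. Which works better depends on the data, so it may be worth comparing both on a held-out set.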
