Python OLS with categorical label

Problem description

I have a dataset where I am trying to predict the type of car from a number of features. I would like to run an OLS regression to see

import numpy as np
import statsmodels.api as sm

X = features
# y holds the car type as an integer code: 0 = sedan, 1 = minivan, etc.
y = [0,0,1,0,2,....]

X2 = sm.add_constant(np.array(X))  # add an intercept column
est = sm.OLS(np.array(y), X2)
est2 = est.fit()

I don't think this is correct, because nothing tells the model that y is categorical, and I feel the functional form should change. Does anyone have any insight on this?

Tags: statsmodels, categorical-data

Solution


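Ordinary least squares regression assumes a numerical dependent variable, so you cannot use it to predict categorical outcomes. Coding the car types as 0, 1, 2 makes OLS treat them as ordered, evenly spaced quantities, even though the codes are arbitrary labels.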
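One way to see the problem is that an OLS fit on integer codes changes when the same classes are merely renumbered. The sketch below is only an illustration on made-up data: the single feature, the class means, and the helper ols_r2 are all hypothetical and are not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one feature and three car types coded 0, 1, 2.
n = 300
labels = rng.integers(0, 3, size=n)
class_means = np.array([0.0, 2.0, 1.0])  # assumed feature mean per class
x = class_means[labels] + rng.normal(scale=0.3, size=n)


def ols_r2(x, y):
    """Fit y = a + b*x by least squares and return R^2."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


# Fit with the original codes.
r2_original = ols_r2(x, labels.astype(float))

# Same classes, different arbitrary codes: swap the labels 1 and 2.
relabeled = np.array([0, 2, 1])[labels].astype(float)
r2_relabeled = ols_r2(x, relabeled)

print(f"R^2 with original codes:  {r2_original:.2f}")
print(f"R^2 with relabeled codes: {r2_relabeled:.2f}")
```

The two fits describe the same cars and the same feature, yet the quality of the fit depends on which integer each class happened to receive. A model for categorical outcomes should not have this property.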

To predict categorical outcomes with a regression model, you want to use multinomial logistic regression, for example using sklearn.
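A minimal sketch of that approach with scikit-learn follows. The synthetic two-feature data stands in for the real car features and is purely an assumption. The sketch also assumes a recent scikit-learn version, where LogisticRegression with its default lbfgs solver fits a multinomial model when the target has more than two classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical data: three car types (0, 1, 2), two features each,
# drawn around a different center per class.
n_per_class = 100
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
X = np.vstack(
    [c + rng.normal(scale=0.5, size=(n_per_class, 2)) for c in centers]
)
y = np.repeat([0, 1, 2], n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The integer codes are treated as unordered class labels here.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)    # one probability column per class
accuracy = clf.score(X_test, y_test)  # fraction of correct predictions

print("classes:", clf.classes_)
print("test accuracy:", accuracy)
```

Unlike OLS, the model returns a probability for each class, and renumbering the classes only permutes those columns. If coefficient tables and p-values in the statsmodels style are needed, statsmodels also provides a multinomial logit model (MNLogit).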

