Loop for building and printing multiple linear regressions

Problem description

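I have several models to build. I am looking for a way to loop over my list of outcome variables, fit a model for each one, and print each model's summary.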

x = data[["Prod Order Quantity", "Print Type Complexity " ]]
y1 = data["Per Unit Downtime and setup"]
y2 = data["Per Unit Runtime"]
y3 = data["Setup and downtime"]
y4 = data["Total Runtime"]
y5 = data["Total Variable Cost over Runtime"] 

#add constant list
Xc = sm.add_constant(x)
#make linear regression for 1 outcome variable at a time
model = sm.OLS(y1, Xc)
results = model.fit()
print(results.summary())

#loop to do it for each variable in my list 
all_outcomes = [y1,y2,y3,y4,y5]
def all_models(variable_list): 
    for v in all_outcomes:
        model = sm.OLS[v,Xc]
        results = model.fit
        print(results.summary())

all_models(all_outcomes)

Error

TypeError: 'type' object is not subscriptable
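
The same kind of error can be reproduced without statsmodels. The sketch below uses a hypothetical stand-in class, Model (it is not part of statsmodels and only mimics the call pattern), to show the two slips in the loop above: square brackets subscript the class instead of calling it, and fit without parentheses only refers to the method without running it. The exact wording of the TypeError message can differ between Python versions.

```python
class Model:
    """Hypothetical stand-in for a model class such as sm.OLS (illustration only)."""

    def __init__(self, y, X):
        self.y = y
        self.X = X

    def fit(self):
        # A real model would estimate coefficients here.
        return "fitted"


# Square brackets try to index the class itself, which a plain class does not support.
try:
    Model[[1, 2], [3, 4]]
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Parentheses construct an instance, which is what the loop intended.
model = Model([1, 2], [3, 4])

unfitted = model.fit    # no parentheses: just a reference to the bound method
fitted = model.fit()    # parentheses: the method actually runs

print(callable(unfitted), fitted)
```

With the brackets fixed but fit still written without parentheses, the next line of the loop would fail as well, because a bound method has no summary attribute.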

Tags: python, python-3.x, data-science, statsmodels

Solution


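You need sm.OLS(v, Xc), with parentheses rather than square brackets: sm.OLS[v, Xc] tries to subscript the class itself, which is what raises the TypeError. Likewise, model.fit has to be called as model.fit(). Ideally the function should also loop over the argument it is given instead of a global list. Something like the following will work. I first set up example data similar to yours: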

import numpy as np
import pandas as pd
import statsmodels.api as sm

data = pd.DataFrame(np.random.normal(0,1,(100,7)))
data.columns = ["Prod Order Quantity", "Print Type Complexity","Per Unit Downtime and setup","Per Unit Runtime","Setup and downtime","Total Runtime","Total Variable Cost over Runtime"]

Define a function which, in this case, returns a list of the fitted results:

def all_models(variable_list, df):
    # the predictors are the same for every outcome, so build them once
    Xc = sm.add_constant(df[["Prod Order Quantity", "Print Type Complexity"]])

    # store the fitted results
    allresults = []

    for v in variable_list:
        model = sm.OLS(df[v], Xc)
        results = model.fit()
        print(results.summary())
        allresults.append(results)

    return allresults

Run it:

all_outcomes = ["Per Unit Downtime and setup","Per Unit Runtime","Setup and downtime","Total Runtime","Total Variable Cost over Runtime"]

res = all_models(all_outcomes,data)
