Running OLS regressions between the columns of a DataFrame



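I want to define a function that runs an OLS model between each column of a DataFrame and the last column. For example, my DataFrame has 13 columns, so I have to run the OLS regression 12 times, which is far too much to write out by hand.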

import pandas as pd
from sklearn import linear_model

DF = pd.read_excel('data.xlsx')
print(DF)

# Regression model
for columns in DF:
    reg = linear_model.LinearRegression()
    reg.fit(DF[['INCOME']], DF.x)
reg1 = linear_model.LinearRegression()
reg1.fit(DF[['INCOME']], DF.FOOD)
reg2 = linear_model.LinearRegression()
reg2.fit(DF[['INCOME']], DF.SMOKING)
# ...
reg11 = linear_model.LinearRegression()
reg11.fit(DF[['INCOME']], DF.HOTEL)
reg12 = linear_model.LinearRegression()
reg12.fit(DF[['INCOME']], DF.OTHERS)

# Beta coefficients
B1 = reg1.coef_
B2 = reg2.coef_
# ...
B10 = reg10.coef_
B11 = reg11.coef_
B12 = reg12.coef_
print(B1)
print(B2)
# ...
print(B10)
print(B11)
print(B12)

I just want to shorten this code.

You can iterate over the columns and store the results in a dictionary, for example:

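from sklearn import linear_model

# DF is the DataFrame loaded in the question
coefs = {}  # not named `dict`, which would shadow the built-in type
for col in DF.columns.drop('INCOME'):  # skip the predictor column itself
    reg = linear_model.LinearRegression()
    reg.fit(DF[['INCOME']], DF[col])
    coefs[col] = reg.coef_  # one-element array holding the slope
    print(col, coefs[col])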
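
Since the question asks for a function, the same loop can be wrapped in one. What follows is a minimal sketch rather than a definitive implementation: the function name `ols_slopes`, its parameters, and the small synthetic DataFrame are assumptions made for illustration; only the column names `INCOME`, `FOOD`, and `SMOKING` come from the question.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def ols_slopes(frame, predictor):
    """Regress every other column of `frame` on `predictor` (hypothetical helper).

    Returns a Series of slope coefficients indexed by response column name.
    """
    X = frame[[predictor]]  # 2-D input, as scikit-learn expects
    slopes = {}
    for col in frame.columns.drop(predictor):
        model = LinearRegression().fit(X, frame[col])
        slopes[col] = model.coef_[0]  # one predictor, so one slope
    return pd.Series(slopes, name="slope")


# Synthetic stand-in for data.xlsx (made-up numbers, illustration only)
demo = pd.DataFrame({"INCOME": [1.0, 2.0, 3.0, 4.0]})
demo["FOOD"] = 2.0 * demo["INCOME"] + 1.0
demo["SMOKING"] = -0.5 * demo["INCOME"] + 3.0

result = ols_slopes(demo, "INCOME")
print(result)
```

With the real data, the call would presumably be `ols_slopes(DF, 'INCOME')`. Returning a Series keeps each coefficient labelled by its column, which avoids the numbered variables `B1` to `B12`.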

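
As an alternative to an explicit loop, scikit-learn's `LinearRegression` accepts a two-dimensional target, so all response columns can be fitted in one call; with a single predictor and k response columns, `coef_` has shape (k, 1). The sketch below assumes a made-up DataFrame in place of `data.xlsx`, and the names `demo`, `targets`, and `betas` are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for data.xlsx (made-up numbers, illustration only)
demo = pd.DataFrame({"INCOME": [1.0, 2.0, 3.0, 4.0]})
demo["FOOD"] = 2.0 * demo["INCOME"] + 1.0
demo["HOTEL"] = 0.25 * demo["INCOME"]

# Every column except the predictor becomes a response
targets = demo.drop(columns="INCOME")
model = LinearRegression().fit(demo[["INCOME"]], targets)

# coef_ has one row per response column and one column per predictor
betas = pd.Series(model.coef_[:, 0], index=targets.columns, name="slope")
print(betas)
```

Each response is still fitted by its own least-squares regression, so the slopes should match those from the loop. This form requires every response column to be numeric and free of missing values.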