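python - Finding the least-squares linear regression for each row of a dataframe in Python using pandas
Problem description
I have a dataframe (dates are in day/month/year format):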
df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Date0': ['01/01/1999', '01/06/1999', '01/01/1979'], 'V0': [29, 44, 21],
                   'Date1': ['08/01/2000', '07/01/2000', '01/01/2000'], 'V1': [35, 45, 47]})
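I want to interpolate each row's values with a linear regression to find 'V_10', the value on the date 10/08/1999 (10 August 1999 in the day/month/year format used above). For example, for the first row I would get something like: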
Slope 0.01609
Y-intercept 29.00
df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Date0': ['01/01/1999', '01/06/1999', '01/01/1979'], 'V0': [29, 44, 21],
                   'Date1': ['08/01/2000', '07/01/2000', '01/01/2000'], 'V1': [35, 45, 47],
                   'V_10': [32.57, None, None]})  # only the first row was worked out by hand
I hope my calculations are correct.
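The hand calculation for the first row can be checked with a few lines of standard-library Python. This is only a sketch under two assumptions: the dates are read day-first, as stated in the question, and the target 10/08/1999 is therefore 10 August 1999 (the reading that comes closest to the quoted 32.57). With two points per row, the least-squares line is simply the line through both points.

```python
from datetime import date

# Assumption: dates are day/month/year, as stated in the question.
d0, v0 = date(1999, 1, 1), 29
d1, v1 = date(2000, 1, 8), 35
target = date(1999, 8, 10)

# With x measured in days since d0, the line through the two points is
# y = v0 + slope * x, so the intercept is simply v0.
span_days = (d1 - d0).days
slope = (v1 - v0) / span_days
intercept = v0
v_10 = intercept + slope * (target - d0).days

print(span_days, round(slope, 5), round(v_10, 2))
```

Under these assumptions the slope comes out at roughly 0.0161 per day and the interpolated value at roughly 32.56, which is close to the figures quoted above; the small difference in the last digit presumably comes from how the days were counted.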
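What if I want an exponential regression or, worse, a custom function of my own?
Solution
I'm not sure this is what you're after, but for linear interpolation you can do the following: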
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Date0': ['01/01/1999', '01/06/1999', '01/01/1979'], 'V0': [29, 44, 21],
                   'Date1': ['08/01/2000', '07/01/2000', '01/01/2000'], 'V1': [35, 45, 47]})

# The dates are day/month/year, so parse them with an explicit format
# instead of relying on pandas' month-first default.
DATE_FORMAT = '%d/%m/%Y'
df['Date0'] = pd.to_datetime(df['Date0'], format=DATE_FORMAT)
df['Date1'] = pd.to_datetime(df['Date1'], format=DATE_FORMAT)
df['Target'] = pd.to_datetime('10/08/1999', format=DATE_FORMAT)

def regress(xs, ys, newx, reference=pd.to_datetime('1/1/1900'), retype='linear', fit_intercept=True, degree=None):
    # Turn the dates into a numeric feature: days elapsed since the reference date.
    xs = np.array([(x - reference).days for x in xs]).reshape(-1, 1)
    ys = np.array(ys)
    if retype == 'linear':
        lm = LinearRegression(fit_intercept=fit_intercept)
    elif retype == 'polynomial':
        lm = Pipeline([('poly', PolynomialFeatures(degree=degree)),
                       ('linear', LinearRegression(fit_intercept=fit_intercept))])
    else:
        # Fail loudly instead of printing a message and returning None.
        raise ValueError('Need to specify other regression type.')
    lm.fit(xs, ys)
    # Predict the value at the target date, expressed in the same day units.
    return lm.predict(np.array((newx - reference).days).reshape(-1, 1))[0]

# Linear regression example
df['V10'] = df.apply(lambda x: regress([x.Date0, x.Date1], [x.V0, x.V1], x.Target, retype='linear'), axis=1)
# 2nd-degree polynomial regression example
# (with only two points per row a quadratic is underdetermined, so this fit is not unique)
df['V11'] = df.apply(lambda x: regress([x.Date0, x.Date1], [x.V0, x.V1], x.Target, retype='polynomial', degree=2), axis=1)
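For the exponential case raised in the question, one possible approach (a sketch, not part of the answer above) is to fit a straight line to the logarithm of the values and transform the prediction back. The helper names `fit_line` and `predict_transformed` are hypothetical, the code uses only the standard library, and the dates are again assumed to be day/month/year. Passing a different `forward`/`inverse` pair covers any custom function that can be linearized this way.

```python
import math
from datetime import datetime

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_transformed(dates, values, target, forward=math.log, inverse=math.exp):
    """Fit a line to forward(values) against elapsed days, then map the prediction back.

    The defaults (log / exp) give an exponential fit y = a * exp(b * days).
    """
    origin = dates[0]
    days = [(d - origin).days for d in dates]
    slope, intercept = fit_line(days, [forward(v) for v in values])
    return inverse(slope * (target - origin).days + intercept)

# First row of the example data, read as day/month/year (an assumption).
dates = [datetime(1999, 1, 1), datetime(2000, 1, 8)]
target = datetime(1999, 8, 10)

v_exp = predict_transformed(dates, [29, 35], target)
# Identity transforms reduce the helper to plain linear regression.
v_lin = predict_transformed(dates, [29, 35], target, forward=float, inverse=float)
print(round(v_exp, 2), round(v_lin, 2))
```

Because `pd.Timestamp` is a subclass of `datetime`, the same helper could be applied row by row, for example `df.apply(lambda r: predict_transformed([r.Date0, r.Date1], [r.V0, r.V1], r.Target), axis=1)`. For a custom function that cannot be linearized with a transform, a nonlinear least-squares routine such as `scipy.optimize.curve_fit` is a common choice, although with only two points per row any model with more than two parameters is underdetermined.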