python - Syntax error when creating a statsmodels model from a formula
Problem description
I have the following linear regression code:
# building a base model
# INSTANTIATING a model type
lm_practice = smf.ols(formula = """ Open_AAL ~
High_AAL +
Low_AAL +
Close_AAL +
Adj Close_AAL +
Volume_AAL +
Open_SP +
High_SP +
Low_SP +
Close_SP +
Adj Close_SP+
Volume_SP
""",
data = fin)
# telling Python to FIT the data to the blueprint
results = lm_practice.fit()
# printing a summary of the results
print(results.summary())
But it fails with a syntax error:
Traceback (most recent call last):
File "C:\Users\Home\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-24-0a39fdb04edd>", line 17, in <module>
data = fin)
File "C:\Users\Home\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 159, in from_formula
missing=missing)
File "C:\Users\Home\Anaconda3\lib\site-packages\statsmodels\formula\formulatools.py", line 65, in handle_formula_data
NA_action=na_action)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 310, in dmatrices
NA_action, return_type)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 165, in _do_highlevel_design
NA_action)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 70, in _try_incr_builders
NA_action)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\build.py", line 689, in design_matrix_builders
factor_states = _factors_memorize(all_factors, data_iter_maker, eval_env)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\build.py", line 354, in _factors_memorize
which_pass = factor.memorize_passes_needed(state, eval_env)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 474, in memorize_passes_needed
subset_names = [name for name in ast_names(self.code)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 474, in <listcomp>
subset_names = [name for name in ast_names(self.code)
File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 105, in ast_names
for node in ast.walk(ast.parse(code)):
File "C:\Users\Home\Anaconda3\lib\ast.py", line 35, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 1
Adj Close_SP
^
SyntaxError: invalid syntax
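What went wrong, and how can I fix it?
Solution
As the traceback shows, patsy (the formula parser behind statsmodels) passes each term of the formula to ast.parse as Python code. A column name that contains a space, such as Adj Close_SP, is not a valid Python identifier, hence the SyntaxError. You need to replace the spaces in the column names and use the updated names in the formula, for example: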
import statsmodels.formula.api as smf
import pandas as pd
import numpy as np
fin = pd.DataFrame({'Open_AAL':np.random.uniform(0,1,100),
'Adj Close_AAL':np.random.uniform(0,1,100),
'High_AAL':np.random.uniform(0,1,100)})
Open_AAL Adj Close_AAL High_AAL
0 0.260162 0.515144 0.995558
1 0.381395 0.187687 0.106275
2 0.016885 0.381614 0.797739
3 0.772720 0.388308 0.856932
fin.columns = fin.columns.str.replace(" ","_")
fin.columns
Index(['Open_AAL', 'Adj_Close_AAL', 'High_AAL'], dtype='object')
lm_practice = smf.ols("Open_AAL ~ Adj_Close_AAL + High_AAL",data = fin)
results = lm_practice.fit()
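If renaming the columns is not an option, patsy's Q() helper can quote a column name inside the formula so that it is looked up as a string rather than parsed as an identifier. The following is a minimal sketch of that alternative, not part of the original answer; it assumes a small random DataFrame in place of the real fin data, and the seed and column values are purely illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative stand-in for the real `fin` DataFrame (assumption: random data).
rng = np.random.default_rng(0)
fin = pd.DataFrame({
    "Open_AAL": rng.uniform(0, 1, 100),
    "Adj Close_AAL": rng.uniform(0, 1, 100),  # column name still contains a space
    "High_AAL": rng.uniform(0, 1, 100),
})

# Q("...") quotes the column name, so the space never reaches the Python parser.
# Single quotes wrap the formula so the double quotes inside Q() need no escaping.
lm_practice = smf.ols('Open_AAL ~ Q("Adj Close_AAL") + High_AAL', data=fin)
results = lm_practice.fit()

# Intercept plus one coefficient per predictor.
print(results.params)
```

The trade-off is readability: with many columns that contain spaces, renaming them once is usually tidier than wrapping every term in Q().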