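python - Causal impact analysis in Python - p-value seems incorrect
Question
I am running a causal impact analysis in Python, which helps measure the effect of an intervention on a treatment group compared with a control group (A/B test). To get started in Python, I followed https://github.com/jamalsenouci/causalimpact/blob/master/GettingStarted.ipynb
Suppose my data is in the following format:
Treat Period_1 as the treatment and Period_2 as the control.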
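The data sample itself is not reproduced here. As a purely illustrative sketch, a frame in the assumed shape might look like the one below. The dates, the values, and the name `df_example` are made up. The column order follows the CausalImpact convention that the first column is the response and the remaining columns are covariates.

```python
import numpy as np
import pandas as pd

# Hypothetical daily index and values; the real data is not shown in the post.
dates = pd.date_range("2021-01-01", periods=60, freq="D")
rng = np.random.default_rng(0)
control = 15 + rng.normal(0, 1, size=len(dates))
treatment = control + rng.normal(0, 0.5, size=len(dates))

# Response (treatment) first, covariate (control) second, indexed by date.
df_example = pd.DataFrame(
    {"Period_1": treatment, "Period_2": control},
    index=dates,
)
```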
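The following code runs fine:
import pandas as pd
from causalimpact import CausalImpact

# Pre- and post-intervention windows, each given as [start, end] timestamps
pre_period = [pd.to_datetime(date) for date in [start_date, cut_date_1]]
post_period = [pd.to_datetime(date) for date in [cut_date_2, end_date]]
# Fit the model with a 7-day seasonal component
impact = CausalImpact(df_AA.loc[start_date:end_date_AA], pre_period, post_period, model_args={"nseasons": 7})
impact.run()
impact.plot()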
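I get the 2 charts below. Because the confidence interval of the predicted values overlaps the actual values, the movement does not appear to be statistically significant.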
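The visual overlap check can also be made numeric. The sketch below is not part of the causalimpact API. It assumes three equal-length sequences (actual values and the lower and upper bounds of the prediction interval) have already been pulled out of the fitted model, and the numbers are invented for illustration.

```python
def fraction_inside_interval(actual, lower, upper):
    """Share of post-period points whose actual value lies within [lower, upper]."""
    inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(inside) / len(inside)


# Hypothetical post-period values and interval bounds.
actual = [15.2, 14.8, 15.5, 16.9]
lower = [14.0, 14.1, 14.2, 14.0]
upper = [16.0, 16.1, 16.2, 16.0]

share = fraction_inside_interval(actual, lower, upper)
print(f"{share:.0%} of post-period points fall inside the interval")
```

A share close to 100% matches the impression from the chart that the actuals stay within the predicted band.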
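However, I want a definitive answer on whether the movement is statistically significant, and what the p-value between treatment and control is. For that I use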
print(impact.summary())
print(impact.summary("report"))
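The result I get is shown below. It reports a p-value of 0.0 and a statistically significant positive movement, which does not seem right. I also tried different data where the gap between actual and predicted is very large and the predicted CI does not overlap the actuals, and I still get a p-value of 0. It looks as though the p-value is computed incorrectly. Are there any pointers for computing the p-value myself for this causal impact library, or is there a way to fix the library?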
                          Average        Cumulative
Actual                    15             247
Predicted                 15             246
95% CI                    [15, 15]       [244, 249]
Absolute Effect           0              1
95% CI                    [0, 0]         [3, -1]
Relative Effect           0.4%           0.4%
95% CI                    [1.5%, -0.6%]  [1.5%, -0.6%]
P-value                   0.0%
Prob. of Causal Effect    100.0%
None
During the post-intervention period, the response variable had an average value of approx. 15. By contrast, in the
absence of an intervention, we would have expected an average response of 15. The 90% interval of this counterfactual
prediction is [15, 15]. Subtracting this prediction from the observed response yields an estimate of the causal effect
the intervention had on the response variable. This effect is 0 with a 90% interval of [0, 0]. For a discussion of the
significance of this effect, see below.
Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully
interpreted), the response variable had an overall value of 247. By contrast, had the intervention not taken place, we
would have expected a sum of 247. The 90% interval of this prediction is [244, 249]
The above results are given in terms of absolute numbers. In relative terms, the response variable showed an increase
of 0.4%. The 90% interval of this percentage is [1.5%, -0.6%]
This means that the positive effect observed during the intervention period is statistically significant and unlikely
to be due to random fluctuations. It should be noted, however, that the question of whether this increase also bears
substantive significance can only be answered by comparing the absolute effect 0 to the original goal of the underlying
intervention.
None
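As a rough sanity check on the cumulative figures above, one can back out an approximate standard deviation from the reported interval and see what p-value a normal approximation would give. This is only a sketch. It assumes the reported [244, 249] is a symmetric 95% normal interval (the printed bounds are rounded, so the result is approximate), and it does not reproduce what the library computes internally.

```python
import math

# Rounded cumulative figures taken from the summary table.
actual_cum = 247
predicted_cum = 246
ci_low, ci_high = 244, 249

# Assumption: a 95% normal interval spans about 2 * 1.96 standard deviations.
sd = (ci_high - ci_low) / (2 * 1.96)
z = (actual_cum - predicted_cum) / sd

# One-sided upper-tail probability of a standard normal, i.e. 1 - Phi(z).
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
print(f"z = {z:.2f}, approximate one-sided p = {p_one_sided:.2f}")
```

Under these assumptions the p-value comes out far from zero, which agrees with the intervals that include 0 and supports the suspicion that the reported 0.0% is not trustworthy.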
Solution
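One way to compute a p-value by hand, sketched here under stated assumptions, is a tail-area probability over simulated counterfactual outcomes: count how often a simulated cumulative response is at least as extreme as the observed one. The function name `tail_area_p_value` and the simulated draws are hypothetical. In practice the draws would have to come from the fitted model's predictive distribution, which this sketch does not extract from causalimpact.

```python
import random


def tail_area_p_value(simulated_sums, observed_sum):
    """One-sided tail-area probability of the observed cumulative response.

    The observed value is counted as one extra draw on each side, so the
    result can never be exactly 0.
    """
    n = len(simulated_sums)
    upper = sum(s >= observed_sum for s in simulated_sums) + 1
    lower = sum(s <= observed_sum for s in simulated_sums) + 1
    return min(upper, lower) / (n + 1)


# Hypothetical counterfactual draws (mean 246, sd 1.3), made up for illustration.
rng = random.Random(0)
simulated = [rng.gauss(246, 1.3) for _ in range(1000)]

p = tail_area_p_value(simulated, observed_sum=247)
print(f"tail-area p-value: {p:.3f}")
```

With this estimator the smallest possible value is 1 / (n + 1), so a reported p-value of exactly 0 would indicate either rounding in the printout or a different calculation.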