python - Discrete survival function in PyMC3 (custom likelihood)
Problem description
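I am trying to follow along with a marketing tutorial. The tutorial uses a frequentist/MLE approach; I like PyMC3 and decided to use it instead. The tutorial's author uses a survival function, which is
S(t|churn_rate) = (1-churn_rate)**(t-1)
This is in contrast to the geometric distribution, whose probability mass function multiplies the above by one extra factor: P(T=t|churn_rate) = churn_rate*(1-churn_rate)**(t-1).
PyMC3 has the geometric distribution built in, so that part is not my problem. My problem is finding a way to write the survival function as a likelihood.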
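To make the relationship between the two formulas concrete: the survival function above is the probability that churn has not happened before period t, which equals the geometric probability mass summed over period t and all later periods. The sketch below checks this numerically in plain Python. The function names and the value churn_rate = 0.3 are illustrative choices, not taken from the tutorial.

```python
def geometric_pmf(t, churn_rate):
    """P(T = t): churn happens exactly in period t (t = 1, 2, ...)."""
    return churn_rate * (1.0 - churn_rate) ** (t - 1)


def survival(t, churn_rate):
    """S(t) = P(T >= t): the customer has not churned before period t."""
    return (1.0 - churn_rate) ** (t - 1)


churn_rate = 0.3  # illustrative value only

# Sum the pmf over t = 3, 4, ... (truncated far enough that the remainder is negligible)
tail_mass = sum(geometric_pmf(k, churn_rate) for k in range(3, 500))

# The two numbers should agree up to floating-point error
print(tail_mass, survival(3, churn_rate))
```

Read this way, the survival function describes a customer who is known to be still active, while the geometric mass describes a customer whose churn period was observed.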
import arviz as az
import pymc3 as pm
import numpy as np
from pipe import traverse

wins = [1000, 631, 468, 382, 326]
# one entry per count: wins[idx] copies of the period number idx+1
geo = [[idx+1 for i in range(n)] for idx,n in enumerate(wins)]
geo = np.array(list(geo | traverse)) #flattens the array

with pm.Model() as model:
    beta_alpha = pm.Uniform('beta_alpha', 0.0001, 5)
    beta_beta = pm.Uniform('beta_beta', 0.0001, 5)

    churn = pm.Beta('churn',
                    alpha=beta_alpha,
                    beta=beta_beta)
    renewal = pm.Deterministic('renewal', 1-churn)

    # log of the survival function, evaluated elementwise over t
    def log_likelihood(theta, t):
        return (t-1)*np.log(theta)

    # add the custom log-likelihood term to the model
    lik = pm.Potential('like', log_likelihood(theta=renewal, t=geo))
    trace = pm.sample(chains=4)
Unfortunately, the sampler goes haywire...
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (4 chains in 1 job)
NUTS: [churn, beta_beta, beta_alpha]
100.00% [2000/2000 00:02<00:00 Sampling chain 0, 821 divergences]
100.00% [2000/2000 00:03<00:00 Sampling chain 1, 562 divergences]
100.00% [2000/2000 00:02<00:00 Sampling chain 2, 628 divergences]
100.00% [2000/2000 00:03<00:00 Sampling chain 3, 364 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 12 seconds.
There were 822 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.498263496658037, but should be close to 0.8. Try to increase the number of tuning steps.
There were 1385 divergences after tuning. Increase `target_accept` or reparameterize.
There were 2013 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6553072990104106, but should be close to 0.8. Try to increase the number of tuning steps.
There were 2378 divergences after tuning. Increase `target_accept` or reparameterize.
The estimated number of effective samples is smaller than 200 for some parameters.
Previously I had written a likelihood function rather than a log_likelihood function, but the sampler was not happy with that either.
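A few of my suspicions:
- It is not clear to me whether I should use pm.Potential or pm.DensityDist. The SO community seems to think pm.Potential is the better choice.
- I am passing an array called geo to log_likelihood. Maybe it expects a scalar and does not know what to do with an array...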
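On the second suspicion, it can help to work out what the potential adds up to. Assuming PyMC3 sums the elements of an array-valued pm.Potential into the model's log-probability (this is an assumption about its behavior, not something shown in the question), the term reduces to a single multiple of log(renewal). The plain-Python sketch below computes that multiple from wins; the helper name potential_total is hypothetical.

```python
import math

wins = [1000, 631, 468, 382, 326]

# geo holds wins[idx] copies of t = idx + 1, so summing (t - 1) over geo
# is the same as summing idx * wins[idx].
exponent = sum(idx * n for idx, n in enumerate(wins))


def potential_total(renewal):
    """Sum of (t - 1) * log(renewal) over every element of geo."""
    return exponent * math.log(renewal)


print(exponent)
for renewal in (0.5, 0.9, 0.99, 0.999):
    print(renewal, potential_total(renewal))
```

Because this total only increases as renewal approaches 1, the term on its own favors churn values near 0, at the edge of the Beta prior's support. That may be related to the divergences, although nothing in the sampler output confirms it.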
Solution
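One possible direction, offered as a sketch under assumptions rather than a definitive answer: treat the survival function as the likelihood contribution of customers who are still active (right-censored), and the geometric probability mass as the contribution of customers who churned in a known period. The example assumes that wins counts the customers still active in periods 1 to 5, which the question does not state explicitly. The name log_likelihood_censored and the grid search are illustrative, and the code uses plain Python rather than PyMC3.

```python
import math

wins = [1000, 631, 468, 382, 326]  # assumed: customers still active in periods 1..5

# Customers lost between consecutive periods: churned[i] left in period i + 1
churned = [a - b for a, b in zip(wins[:-1], wins[1:])]
still_active = wins[-1]   # never observed to churn (right-censored)
last_period = len(wins)   # they are known to be active in this period


def log_likelihood_censored(churn_rate):
    renewal = 1.0 - churn_rate
    # Observed churn in period t contributes log P(T = t)
    ll = sum(
        n * (math.log(churn_rate) + (t - 1) * math.log(renewal))
        for t, n in enumerate(churned, start=1)
    )
    # Censored customers contribute log S(last_period) = (last_period - 1) * log(renewal)
    ll += still_active * (last_period - 1) * math.log(renewal)
    return ll


# Crude grid search for the churn rate that maximizes the log-likelihood
grid = [i / 10000 for i in range(1, 10000)]
best = max(grid, key=log_likelihood_censored)
print(churned, best)
```

Unlike the survival term alone, this log-likelihood has an interior maximum, because the log(churn_rate) term penalizes churn values near 0. The same expression could in principle be passed to pm.Potential, with pm.math.log in place of math.log, to obtain a posterior for churn instead of a point estimate; that variant is not verified here.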