首页 > 解决方案 > 如何优雅地构建伯努利分布置信度的蒙特卡罗模拟

问题描述

作业问题是,使用 n、p 的列表,对于 n 和 p 的每个组合,使用蒙特卡罗模拟来估计每种生成区间的方法达到的实际置信水平,并检查每种方法是否达到了标称置信水平n 和 p 的值网格

所以对于每一对n,p,我必须生成N个复制来实现所谓的蒙特卡罗模拟。

下面是我的尝试,

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from scipy import stats as st
def ci_prop(
    x,
    level=0.95,
    method="Normal"
):
    """
    Input:
    x : A 1-dimensional NumPy array or compatible sequence type (list, tuple).

    confidence level : float, optional.

    method: str, optional

    Output
    -------
    lwr
    upr
    width

    """
    # check input type
    try:
        x = np.asarray(x)  # or np.array() as instructed.
    except TypeError:
        print("Could not convert x to type ndarray.")

    # check that x is bool or 0/1
    if x.dtype is np.dtype('bool'):
        pass
    elif not np.logical_or(x == 0, x == 1).all():
        raise TypeError("x should be dtype('bool') or all 0's and 1's.")

    # check method
    assert method in ["Normal", "CP", "Jeffrey", "AC"]

    # determine the length
    n = x.size

    # compute estimate
    if method == 'AC':
        z = st.norm.ppf(1 - (1 - level) / 2)
        n = (n + z ** 2)
        est = (np.sum(x) + z ** 2 / 2) / n
    else:
        est = np.mean(x)

    # warn for small sample size with "Normal" method
    if method == 'Normal' and (n * min(est, 1 - est)) < 12:
        warn(Warning(
            "Normal approximation may be incorrect for n * min(p, 1-p) < 12."
        ))

    # compute bounds for Normal and AC methods
    if method in ['Normal', 'AC']:
        se = np.sqrt(est * (1 - est) / n)
        z = st.norm.ppf(1 - (1 - level) / 2)
        lwr, upr = est - z * se, est + z * se
        width = upr - lwr

    # compute bounds for CP method
    if method == 'CP':
        alpha = 1 - level
        s = np.sum(x)
        lwr = beta.ppf(alpha / 2, s, n - s + 1)
        upr = beta.ppf(1 - alpha / 2, s + 1, n - s)
        width = upr - lwr
        
    # compute bounds for Jeffrey method
    if method == 'Jeffrey':
        alpha = 1 - level
        s = np.sum(x)
        lwr = beta.ppf(alpha / 2, s + 0.5, n - s + 0.5)
        upr = beta.ppf(1 - alpha / 2, s + 0.5, n - s + 0.5)
        width = upr - lwr
        
    # prepare return values
    
    return lwr, upr, width

n = np.linspace(40, 60, 6).astype("int")
p = np.linspace(0.7, 0.9, 6)
x = len(n)
y = len(p)
N_rep = 200 # for example, I set N = 200

Data_Set = []
for i in range(x):
    for j in range(y):
        A = np.random.binomial(1, p[j], n[i])
        res = ci_prop(A, 0.8, method = "AC")
        Data_Set.append(res)

Data_Set

我想要结果,对于每种方法(“AC”、“Normal”等),都有一个一维数组,这个数组的每个元素都是 200(例如,让 N = 200)复制的比例包括对应 (n, p) 对的真实均值 (p) 的置信区间

我尝试使用循环但失败了,因为我不知道如何让它生成 200 次。此外,运行嵌套的 for 循环似乎太笨拙且效率低下。我想知道是否有更有效和更优雅的方式来实现这一目标?欢迎任何帮助或提示。

标签: pythonpandasscipystatisticsmontecarlo

解决方案


推荐阅读