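python - How to create a function that tests each variable for normality
Question
I am trying to build a function that iteratively returns i) the Jarque-Bera test statistic, ii) the Jarque-Bera p-value, iii) the slope, intercept, and coefficient of determination of the probplot, and iv) the probplot itself. All of this is meant to be returned for a single variable at a time.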
def normality(c):
    JB_test_stat = ss.jarque_bera(c)[0]
    JB_pval = ss.jarque_bera(c)[1]
    probplot_slope = ss.probplot(c, plot = plt)[1][0]
    probplot_interc = ss.probplot(c, plot = plt)[1][1]
    probplot_r = ss.probplot(c, plot = plt)[1][2]
    return(print("Skewness:",c.skew(),"\nExcess kurtosis:",c.kurt(),"\nJarque-Bera stat:",JB_test_stat," pvalue:", JB_pval,"\nSlope:",probplot_slope,"Intercept:",probplot_interc, "r:",probplot_r,"\n"))
Unfortunately, when I call the function on my dataframe df[numeric_cols], where numeric_cols is a list of column names,
for c in numeric_cols:
    normality(df[c])
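I correctly get all the numeric results from the return statement, but at the bottom there is a single probability plot with every variable drawn on it in a messy way, whereas I expected the numeric results for each variable together with its own probability plot.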
Skewness: 0.1004187952160102 Excess kurtosis: -0.543819517693596 Jarque-Bera stat: 7.593972235734294 pvalue: 0.022438296430201454 Slope: 4.3135147782152465 Intercept: 25.5 r: 0.9947611456706487
Skewness: -0.1560130144763728 Excess kurtosis: -1.2824901951466612 Jarque-Bera stat: 38.56183464454786 pvalue: 4.23061985443951e-09 Slope: 11.492550446207257 Intercept: 19.535714285714285 r: 0.9668502992894236
Skewness: 0.2347601433103727 Excess kurtosis: -1.242639192300385 Jarque-Bera stat: 39.0662449724179 pvalue: 3.287552452491127e-09 Slope: 11.545683807955731 Intercept: 15.714285714285714 r: 0.9647448407831439
Skewness: 0.24353437856100904 Excess kurtosis: -1.1969521906230485 Jarque-Bera stat: 36.98912338336009 pvalue: 9.287822622106034e-09 Slope: 1013.985374629207 Intercept: 1411.4436090225563 r: 0.9682492605786011
Skewness: 2.837876986150242 Excess kurtosis: 9.516628330654008 Jarque-Bera stat: 2675.4455000782764 pvalue: 0.0 Slope: 2.6057664781688454 Intercept: 1.85338345864660578 508647605778
Skewness: 2.406153102778617 Excess kurtosis: 7.002529753885085 Jarque-Bera stat: 1573.6596724989513 pvalue: 0.0 Slope: 1.714847443415902 Intercept: 1.287593984962490614181814 r:
Skewness: 0.9337529310147361 Excess kurtosis: 0.45862734243889847 Jarque-Bera stat: 81.22389376608798 pvalue: 0.0 Slope: 605.3354149443196 Intercept: 717.75 r: 0.9895040
Skewness: -3.030640857636996 Excess kurtosis: 15.686541621050898 Jarque-Bera stat: 6154.761075129672 pvalue: 0.0 Slope: 11.37955609488042 Intercept: 77.82387251904514056 r:
Skewness: 6.398317104228115 Excess kurtosis: 49.10097819497357 Jarque-Bera stat: 56029.69126113364 pvalue: 0.0 Slope: 0.41431397013222515 Intercept: 0.1917293233082707 r: 0.48503363895959983
Skewness: 6.204252341215679 Excess kurtosis: 47.28662289867727 Jarque-Bera stat: 52010.755388690835 pvalue: 0.0 Slope: 0.4947086253584861 Intercept: 0.23496240601503762 r: 0.5050004904368586
Skewness: 2.06633193738682 Excess kurtosis: 5.770784034742405 Jarque-Bera stat: 1098.0175308306793 pvalue: 0.0 Slope: 0.0
Skewness: 2.9189857433086495 Excess kurtosis: 16.837230233306762 Jarque-Bera stat: 6909.724155123523 pvalue: 0.0 Slope: 0.0
Skewness: 1.2633082232077495 Excess kurtosis: 1.5265390704578943 Jarque-Bera stat: 190.6495836394772 pvalue: 0.0 Slope: 2.09821120102269 Intercept: 2.114661650735138282165473513821192
Skewness: 3.091346622737553 Excess kurtosis: 8.530683362863476 Jarque-Bera stat: 2421.371001114453 pvalue: 0.0 Slope: 0.16657862407594715 Intercept: 0.09022556390977444 r: 0.5658043763386988
How can this be fixed? Thanks, everyone.
Answer
Just add a call to plt.figure() inside your function, so that each call to the function opens a new figure.
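As a minimal sketch of that fix, the function below opens a new figure before drawing each probability plot. The helper name probplot_in_new_figure, the synthetic samples, and the use of the non-interactive Agg backend are assumptions made for illustration; they are not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as ss


def probplot_in_new_figure(values, title):
    """Draw one probability plot in its own figure and return the line fit."""
    fig = plt.figure()  # new figure per call, so plots do not pile up on one axes
    slope, intercept, r = ss.probplot(values, plot=plt)[1]
    plt.title(title)
    return fig, slope, intercept, r


# Synthetic stand-in data (hypothetical; not the asker's DataFrame)
rng = np.random.default_rng(0)
samples = {"normal": rng.normal(size=200), "skewed": rng.exponential(size=200)}

figures = {}
for name, values in samples.items():
    fig, slope, intercept, r = probplot_in_new_figure(values, name)
    figures[name] = fig
    print(name, "slope:", slope, "intercept:", intercept, "r:", r)
```

In an interactive session, calling plt.show() at the end of each loop iteration should display each plot right after its printed numbers.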
On a completely separate note, using return(print('stuff')) is redundant. If you really want to print the results, just use print with no return.
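To see why the return is redundant: print() itself returns None, so return(print(...)) hands None back to the caller. The two function names below are hypothetical and exist only to illustrate the point.

```python
def report_with_return(x):
    # print() returns None, so this function returns None as well
    return(print("value:", x))


def report(x):
    # Same visible output, without the misleading return
    print("value:", x)


result = report_with_return(3)
print(result is None)  # True
```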
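It is more Pythonic, and generally better practice, to return the values you are currently printing and then print them outside the function: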
import matplotlib.pyplot as plt
import scipy.stats as ss

def normality(c):
    plt.figure()  # open a new figure so each variable gets its own plot
    JB_test_stat, JB_pval = ss.jarque_bera(c)
    # probplot returns ((osm, osr), (slope, intercept, r)); call it once so the data are drawn once
    probplot_slope, probplot_interc, probplot_r = ss.probplot(c, plot=plt)[1]
    return c.skew(), c.kurt(), JB_test_stat, JB_pval, probplot_slope, probplot_interc, probplot_r

for c in numeric_cols:
    # c is a column name here, so unpack the returned values into plain variables
    skew, kurt, JB_test_stat, JB_pval, probplot_slope, probplot_interc, probplot_r = normality(df[c])
    print("Skewness:", skew,
          "\nExcess kurtosis:", kurt,
          "\nJarque-Bera stat:", JB_test_stat,
          " pvalue:", JB_pval,
          "\nSlope:", probplot_slope,
          "Intercept:", probplot_interc,
          "r:", probplot_r, "\n")
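One practical benefit of returning the values rather than printing them is that they can be collected into a single table. The following is only a sketch under stated assumptions: the helper normality_stats, the dictionary keys, and the synthetic DataFrame are hypothetical, and plotting is left out by calling probplot without a plot argument.

```python
import numpy as np
import pandas as pd
import scipy.stats as ss


def normality_stats(c):
    """Return the normality summary for one pandas Series as a dict (no plotting)."""
    values = c.to_numpy()
    jb_stat, jb_pval = ss.jarque_bera(values)
    slope, intercept, r = ss.probplot(values)[1]  # no plot argument: compute the fit only
    return {"skew": c.skew(), "excess_kurtosis": c.kurt(),
            "jb_stat": jb_stat, "jb_pval": jb_pval,
            "slope": slope, "intercept": intercept, "r": r}


# Synthetic stand-in data (hypothetical; not the asker's DataFrame)
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=100), "b": rng.exponential(size=100)})
numeric_cols = ["a", "b"]

# One row per variable, one column per statistic
summary = pd.DataFrame({c: normality_stats(df[c]) for c in numeric_cols}).T
print(summary)
```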