Use scipy.stats.kstest to check whether randomly generated numbers follow a specified distribution

Problem description

I am trying to generate random numbers from a chosen distribution with specified parameters and then check, using the Kolmogorov-Smirnov test, whether the numbers actually follow that distribution.

import matplotlib.pyplot as plt
import scipy.stats  # needed for the scipy.stats.kstest call further down
from scipy.stats import johnsonsu
values = johnsonsu.rvs(0.4, 1.27, loc = 3.50, scale = 5.97, size = 10000)
plt.hist(values, bins = 25)
plt.show()

[Image: histogram of the generated values, 25 bins]

Dstat, Pvalue = scipy.stats.kstest(values, 'johnsonsu', args = (0.4, 1.27))
print(Dstat)   # 0.48575579351993264
print(Pvalue)  # 0.0

I believe the null hypothesis of the KS test is that the sample data follow the specified distribution (Johnson SU, in this example). So a p-value below 0.05 rejects the null hypothesis, and we would conclude that the data do not follow the distribution? Shouldn't it be the opposite, or am I missing something?

Tags: python, scipy

Solution


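Your reading of the null hypothesis is correct. The problem is that the call in the question passes only the two shape parameters in args, so the sample is tested against a Johnson SU distribution with the default loc=0 and scale=1 rather than the loc=3.50 and scale=5.97 used to generate it. If I pass the full list of distribution parameters (shapes, loc, and scale) to the args parameter, I get the result you expect: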

import scipy.stats as stats

n = 10_000
values = stats.johnsonsu.rvs(0.4, 1.27, loc=3.50, scale=5.97, size=n)

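# N only applies when the first argument is a distribution name or a callable,
# so it is omitted here because values is already an array of samples.
print(stats.kstest(values, 'johnsonsu', args=(0.4, 1.27, 3.5, 5.97)))
# KstestResult(statistic=0.007110068990121343, pvalue=0.6928424801510613)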
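
One way to avoid forgetting loc and scale is to freeze the distribution once and reuse it for both sampling and testing. The sketch below is only an illustration of that idea, not the only way to do it. The seed and the variable names (dist, rng, matched, mismatched) are my own assumptions and do not come from the question.

```python
import numpy as np
from scipy import stats

# Freeze the distribution so sampling and testing share the same parameters.
dist = stats.johnsonsu(0.4, 1.27, loc=3.50, scale=5.97)

# Seed chosen arbitrarily (an assumption) so the run is reproducible.
rng = np.random.default_rng(12345)
values = dist.rvs(size=10_000, random_state=rng)

# Test against the frozen distribution's CDF: shapes, loc, and scale all match.
matched = stats.kstest(values, dist.cdf)

# Test with only the shape parameters: loc=0 and scale=1 are used by default,
# which reproduces the mismatch from the question.
mismatched = stats.kstest(values, 'johnsonsu', args=(0.4, 1.27))

print("matched:   ", matched.statistic, matched.pvalue)
print("mismatched:", mismatched.statistic, mismatched.pvalue)
```

The matched test should give a small statistic and, in most runs, a p-value well above 0.05, while the mismatched test should give a large statistic and a p-value near zero. Keep in mind that even with the correct parameters, about 5% of samples will produce a p-value below 0.05 purely by chance.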
