Mean absolute deviation with a numpy ndarray

Problem description

I am working with a 4D numpy array and compute the mean, median and std statistics along its 3rd dimension (axis=2), as follows:

import numpy as np
input_shape = (1, 10, 4)
n_sample = 20
X = np.random.uniform(0, 1, (n_sample,) + input_shape)
X.shape
(20, 1, 10, 4)

I then compute the mean, median and standard deviation this way:

sta_fuc = (np.mean, np.median, np.std)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

so that:

stat.shape
(20, 1, 3, 4)

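holds the mean, median and std values along that dimension.

However, I would like to add the mean absolute deviation (mad) of the columns, so that the statistics become (mean, median, std, mad), but numpy does not seem to provide a function for this. How can I add mad to my statistics?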

Edit

Following the first answer, I used the function defined there, i.e.:

def mad(arr, axis=None, keepdims=True):
    median = np.median(arr, axis=axis, keepdims=True)
    mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
                    axis=axis, keepdims=keepdims)
    return mad

Adding mad to the statistics then raises an error, as shown below:

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-22-dab51665f952> in <module>()
      1 sta_fuc = (np.mean, np.median, np.std, mad)
----> 2 stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

1 frames

<ipython-input-21-84d735c8c516> in mad(arr, axis, keepdims)
      1 def mad(arr, axis=None, keepdims=True):
      2     median = np.median(arr, axis=axis, keepdims=True)
----> 3     mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
      4                     axis=axis, keepdims=keepdims)
      5     return mad

TypeError: 'axis' is an invalid keyword to ufunc 'absolute'

Edit 2

Using the scipy function suggested by @Jussi also produces an error, as follows:

from scipy.stats import median_absolute_deviation as mad

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

TypeError: median_absolute_deviation() got an unexpected keyword argument 'keepdims'
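As the message says, this SciPy function is being called with a `keepdims` argument it does not accept. One possible workaround is a thin wrapper that restores the reduced axis by hand. The sketch below assumes a SciPy release that ships `scipy.stats.median_abs_deviation` (the newer name for this statistic); the wrapper name `mad_scipy` is hypothetical. Note that, as far as I know, the newer function defaults to `scale=1.0`, whereas the older `median_absolute_deviation` applied a normal-consistency scale factor by default, so the numbers may differ between the two.

```python
import numpy as np
from scipy.stats import median_abs_deviation  # assumes a SciPy version providing this name

def mad_scipy(arr, axis=0, keepdims=True):
    # Hypothetical wrapper: compute the MAD along `axis`,
    # then re-insert the reduced axis so the result can be concatenated.
    out = median_abs_deviation(arr, axis=axis)
    return np.expand_dims(out, axis=axis) if keepdims else out

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (20, 1, 10, 4))

sta_fuc = (np.mean, np.median, np.std, mad_scipy)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)
print(stat.shape)
```

Re-inserting the axis with `np.expand_dims` keeps the wrapper independent of whether the installed SciPy version understands `keepdims` itself.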

Tags: python, numpy, multidimensional-array, numpy-ndarray

Solution


I am not aware of a built-in numpy solution. But you can easily implement it on top of numpy functions, using mad = median(abs(a - median(a))).

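def mad(arr, axis=None, keepdims=True):
    # keepdims=True here so the median broadcasts against arr
    median = np.median(arr, axis=axis, keepdims=True)
    # median of the absolute deviations from that median
    mad = np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)
    return mad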
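To check that this slots into the statistics from the question, here is a small self-contained sketch. It uses a seeded generator instead of `np.random.uniform` only to make the run reproducible. Since the formula above is the median absolute deviation, the sketch also includes `mean_abs_dev`, a hypothetical variant (not part of the original answer) for the case where the mean absolute deviation around the mean is what is wanted.

```python
import numpy as np

def mad(arr, axis=None, keepdims=True):
    # Median absolute deviation: median(|a - median(a)|)
    median = np.median(arr, axis=axis, keepdims=True)
    return np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)

def mean_abs_dev(arr, axis=None, keepdims=True):
    # Hypothetical variant: mean absolute deviation around the mean
    mean = np.mean(arr, axis=axis, keepdims=True)
    return np.mean(np.abs(arr - mean), axis=axis, keepdims=keepdims)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (20, 1, 10, 4))

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)
print(stat.shape)  # (20, 1, 4, 4)
```

Swapping `mad` for `mean_abs_dev` in `sta_fuc` gives the same output shape, with the last slice along axis 2 holding the mean-based deviation instead.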
