Getting a histogram of a JaggedArray

Problem description

Hi, I have a ROOT TTree with a somewhat complicated structure. When I use uproot to create an array:

analysis = uproot.open("/b/LJ_data/02Oct2019/FRVZ/FRVZprompt2zd_mH125_mzd01.root")["analysis"]
el_Eratio = analysis.arrays(["el_Eratio"], cache=mycache);
print(el_Eratio)

I get a jagged array:

{b'el_Eratio': <JaggedArray [[0.9679527 0.8814101 0.88584787] [0.34557977 0.22699767 0.9040524 0.0] [] ... [0.94681776] [0.91043043 0.621741 0.85297334 0.9364375] [0.83885396]] at 0x7f39cf32dfd0>}

I am trying to create a simple histogram of this data:

n, bins, patches = plt.hist(el_Eratio, 100, density = True)

But I get this error:

Traceback (most recent call last):
  File "macro.py", line 45, in <module>
    n, bins, patches = plt.hist(el_Eratio, 100, density = True)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2636, in hist
    **({"data": data} if data is not None else {}), **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/__init__.py", line 1589, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 6721, in hist
    xmin = min(xmin, np.nanmin(xi))
TypeError: '<' not supported between instances of 'dict' and 'float'

Do I need to convert the jagged array to a list or a regular array? If so, how do I do that?

Or am I just calling the array incorrectly? I also tried:

n, bins, patches = plt.hist(el_Eratio.values(), 100, density = True)

But I get a similar error:

Traceback (most recent call last):
  File "macro.py", line 45, in <module>
    n, bins, patches = plt.hist(el_Eratio.values(), 100, density = True)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2636, in hist
    **({"data": data} if data is not None else {}), **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/__init__.py", line 1589, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 6721, in hist
    xmin = min(xmin, np.nanmin(xi))
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/numpy/lib/nanfunctions.py", line 298, in nanmin
    res = np.amin(a, axis=axis, out=out, **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2618, in amin
    initial=initial)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: operands could not be broadcast together with shapes (3,) (2,) 

Disclaimer: although I have experience with ROOT and C++, I am new to Python.

Thanks, Sarah

Tags: python, numpy, matplotlib, uproot

Solution


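Because you called analysis.arrays (plural), you got back a Python dictionary. The only array it contains (because you asked for just one, ["el_Eratio"]) is stored under a single key: b"el_Eratio". Note that this key is a bytestring (it starts with b). If you know the encoding, for example "utf-8", you can pass namedecode="utf-8" to the arrays method to get plain string keys instead.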
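The key mismatch can be reproduced without uproot. The sketch below is only an illustration: the dictionary is a hand-built stand-in for what analysis.arrays(["el_Eratio"]) returns, a plain list of NumPy arrays takes the place of the JaggedArray, and the manual decode step is an assumption about what namedecode="utf-8" amounts to, not uproot's own code.

```python
import numpy as np

# Hypothetical stand-in for the dict returned by analysis.arrays(["el_Eratio"]).
# The real value would be a JaggedArray; a list of NumPy arrays is used here.
arrays = {
    b"el_Eratio": [
        np.array([0.9679527, 0.8814101, 0.88584787]),
        np.array([]),
        np.array([0.94681776]),
    ]
}

# The key is a bytestring, so a plain-string lookup does not match it.
has_str_key = "el_Eratio" in arrays      # False
has_bytes_key = b"el_Eratio" in arrays   # True

# Decoding the keys by hand gives a dict indexed by ordinary strings.
decoded = {key.decode("utf-8"): value for key, value in arrays.items()}
el_Eratio = decoded["el_Eratio"]
```

Either way, it is the value stored under the key that should be passed on for plotting, not the dictionary itself.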

After extracting the JaggedArray, you still need to convert it to a flat array so that the histogram function knows how to handle it:

plt.hist(el_Eratio[b'el_Eratio'].flatten())

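By flattening, you are stating explicitly that you want a histogram of the nested contents of the jagged array, rather than of something else, such as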

plt.hist(el_Eratio[b'el_Eratio'].counts)

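which is the number of values in each inner array. This kind of dataset has more structure than a flat list, so you need to decide how to handle that structure before plotting it as a one-dimensional bag of numbers.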
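The two choices above can be sketched with plain NumPy, for readers who want to see the difference without a ROOT file. This is an illustration under assumptions: the list named events is a hypothetical stand-in for the JaggedArray (with values copied from the printout in the question), np.concatenate plays the role of .flatten(), and the per-event lengths play the role of .counts. np.histogram is used so the bin contents can be inspected directly.

```python
import numpy as np

# Hypothetical stand-in for the JaggedArray: one inner array per event.
events = [
    np.array([0.9679527, 0.8814101, 0.88584787]),
    np.array([0.34557977, 0.22699767, 0.9040524, 0.0]),
    np.array([]),
    np.array([0.94681776]),
]

# Choice 1: histogram the nested contents (analogous to .flatten()).
flat = np.concatenate(events)

# Choice 2: histogram how many values each event has (analogous to .counts).
counts = np.array([len(event) for event in events])

# Bin both; the bin ranges here are illustrative choices.
value_hist, value_edges = np.histogram(flat, bins=10, range=(0.0, 1.0))
count_hist, count_edges = np.histogram(counts, bins=5, range=(0, 5))
```

Passing flat to plt.hist(flat, 100, density=True) would then work as the question intended, because flat is an ordinary one-dimensional array.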

