How do I fix this error? It looks like a type error, but I'm pretty sure these arrays are the same type

Problem description

How do I fix this error?

UFuncTypeError: ufunc 'equal' did not contain a loop with signature matching types (dtype('<U1'), dtype('<U1')) -> dtype('bool')

Here is the code:

print(y[0:4], len(y))
print(outs[0:4], len(outs))
corrects = np.equal(outs.astype(str), y.astype(str), casting="safe")

outs and y are both NumPy arrays of the same length and type. The printed output is:

['1' '1' '1' '1'] 140
['1' '1' '1' '1'] 140

The two arrays appear to be the same type, so I'm really at a loss.

Tags: python, numpy

Solution


However, '==' does work:

In [22]: x=np.array(['1','1'])
In [23]: x
Out[23]: array(['1', '1'], dtype='<U1')
In [24]: x==x
Out[24]: array([ True,  True])
In [25]: np.equal(x,x)
Traceback (most recent call last):
  File "<ipython-input-25-ab75613b91f2>", line 1, in <module>
    np.equal(x,x)
UFuncTypeError: ufunc 'equal' did not contain a loop with signature matching types (dtype('<U1'), dtype('<U1')) -> dtype('bool')

The np.equal documentation says it is the same as '==', but I'm sure the interpreter translates x == x into:

In [29]: x.__eq__(x)
Out[29]: array([ True,  True])
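
Applied to the question, a minimal sketch of the workaround might look like this. The sample values for y and outs are made up for illustration; only the use of == in place of np.equal comes from the session above.

```python
import numpy as np

# Hypothetical stand-ins for the question's arrays (values are invented)
y = np.array([1, 1, 0, 1])
outs = np.array(['1', '0', '0', '1'])

# Convert both sides to strings and compare with ==,
# which dispatches to the array's __eq__ rather than calling np.equal directly
corrects = outs.astype(str) == y.astype(str)

print(corrects)         # element-wise boolean result
print(corrects.mean())  # fraction of positions that match
```

Taking the mean of the boolean array gives the fraction of matching entries, which is probably what corrects was meant to feed into.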

np.equal is a ufunc, which adds keywords and methods (such as reduce) and the concept of a signature. I'm still not sure why it fails for strings.

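For strings, __eq__ may delegate to np.char.equal (as noted in another answer), but I'm not sure how to test that. Timing might show a difference. The np.char functions generally run at about list-comprehension speed.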

In [35]: np.char.equal??
Signature: np.char.equal(x1, x2)
Source:   
@array_function_dispatch(_binary_op_dispatcher)
def equal(x1, x2):
    """
    Return (x1 == x2) element-wise.

    Unlike `numpy.equal`, this comparison is performed by first
    stripping whitespace characters from the end of the string.  This
    behavior is provided for backward-compatibility with numarray.
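
The docstring above points at one behavioural difference worth checking before swapping np.char.equal in for ==: trailing whitespace. A small sketch, with made-up inputs a and b:

```python
import numpy as np

# Invented inputs: the second element of `a` carries a trailing space
a = np.array(['1', '1 '])
b = np.array(['1', '1'])

# Plain element-wise comparison: the trailing space counts as a difference
plain = (a == b)

# np.char.equal strips trailing whitespace before comparing
stripped = np.char.equal(a, b)

print(plain)
print(stripped)
```

If the labels might carry stray trailing spaces, the two approaches can disagree, so it is worth deciding which behaviour is wanted.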

Timing tests

Make a large string array, and a matching integer one:

In [36]: X = np.repeat(x, 100000)
In [37]: X.shape
Out[37]: (200000,)
In [39]: A = np.arange(x.shape[0])
In [40]: timeit A==A
636 ns ± 5.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [41]: timeit X==X
2.07 ms ± 359 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

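So string comparison is definitely slower, although np.char.equal is slower still. Note, though, that A as built above has only x.shape[0] = 2 elements while X has 200000, so the two timings are not measured on arrays of the same size.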

In [42]: timeit np.char.equal(X,X)
5.09 ms ± 1.53 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

And the timing for a plain Python list of strings, using a list comprehension:

In [43]: Xl = X.tolist()
In [44]: timeit [i==j for i,j in zip(Xl, Xl)]
13.6 ms ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
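
For anyone who wants to rerun these comparisons outside IPython, here is a rough sketch using the standard timeit module. Here the integer array is sized from X so that both arrays have 200000 elements, which is an assumption about what a like-for-like comparison should be. The names candidates and results and the repeat count are also my own choices, and the numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

x = np.array(['1', '1'])
X = np.repeat(x, 100000)      # 200000 one-character strings
A = np.arange(X.shape[0])     # 200000 integers (assumed: same length as X)
Xl = X.tolist()               # plain Python list of str

# Each candidate is a zero-argument callable so timeit can call it directly
candidates = {
    "int ==": lambda: A == A,
    "str ==": lambda: X == X,
    "np.char.equal": lambda: np.char.equal(X, X),
    "list comprehension": lambda: [i == j for i, j in zip(Xl, Xl)],
}

repeats = 10  # arbitrary; raise it for steadier numbers
results = {}
for name, fn in candidates.items():
    # Average seconds per call over `repeats` calls
    results[name] = timeit.timeit(fn, number=repeats) / repeats
    print(f"{name:20s} {results[name] * 1e3:8.3f} ms per call")
```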
