首页 > 解决方案 > scipy超稀疏矩阵乘法超级慢

问题描述

SO上有一些讨论稀疏矩阵乘法性能的帖子,但他们似乎没有在这里回答我的问题。

这是基准代码,

# First, construct a space matrix
In [1]: from scipy.sparse import dok_matrix
In [2]: M = dok_matrix((100, (1<<32)-1), dtype=np.float32)
In [3]: rows, cols = M.shape
In [5]: js = np.random.randint(0, (1<<32)-1, size=100)
In [6]: for i in range(rows):
   ...:     for j in js:
   ...:         M[i,j] = 1.0
   ...:

# Check out sparsity
In [7]: M.shape
Out[7]: (100, 4294967295)
In [8]: M.count_nonzero()
Out[8]: 10000

# Test csr dot performance, 36.3 seconds
In [9]: csr = M.tocsr()
In [10]: %time csr.dot(csr.T)
CPU times: user 36.3 s, sys: 1min 1s, total: 1min 37s
Wall time: 1min 46s
Out[10]:
<100x100 sparse matrix of type '<class 'numpy.float32'>'
    with 10000 stored elements in Compressed Sparse Row format>

以上csr.dot花费了 36.3 秒,恕我直言,这相当长。为了加快速度,我编写了一个简单的for循环点函数,如下所示,

def lil_matmul_transposeB(A, B):
    rows_a, cols_a = A.shape
    rows_b, cols_b = B.shape
    assert cols_a == cols_b
    C = np.zeros((rows_a, rows_b))

    for ra in range(rows_a):
        cols_a = A.rows[ra]
        data_a = A.data[ra]
        for i, ca in enumerate(cols_a):
            xa = data_a[i]
            for rb in range(rows_b):
                cols_b = B.rows[rb]
                data_b = B.data[rb]
                pos = bs(cols_b, ca)
                if pos!=-1:
                    C[ra,rb] += data_b[pos] * xa
    return C

# Test dot performance in LiL format,
In [25]: lil = M.tolil()
In [26]: %time A = F.lil_matmul_transposeB(lil, lil)
CPU times: user 1.26 s, sys: 2.07 ms, total: 1.26 s
Wall time: 1.26 s

上述功能仅需 1.26s,比内置的csr.dot.

所以我想知道我在这里做稀疏矩阵乘法是否犯了一些错误?

标签: numpysparse-matrix

解决方案


即使稀疏性很小,那个非常大的第二维也会产生问题。

In [12]: Mr = M.tocsr()
In [20]: Mr
Out[20]: 
<100x4294967295 sparse matrix of type '<class 'numpy.float32'>'
    with 10000 stored elements in Compressed Sparse Row format>

Transpose 只是将csrinto 转换为csc,而不更改数组。indptr两者都只是(101,)。

In [21]: Mr.T
Out[21]: 
<4294967295x100 sparse matrix of type '<class 'numpy.float32'>'
    with 10000 stored elements in Compressed Sparse Column format>

但是当我这样做Mr@Mr.T时,当它尝试将其转换Mr.T为 `csr. 也就是说,乘法需要相同的格式:

In [22]: Mr.T.tocsr()
Traceback (most recent call last):
  File "<ipython-input-22-a376906f557e>", line 1, in <module>
    Mr.T.tocsr()
  File "/usr/local/lib/python3.8/dist-packages/scipy/sparse/csc.py", line 138, in tocsr
    indptr = np.empty(M + 1, dtype=idx_dtype)
MemoryError: Unable to allocate 32.0 GiB for an array with shape (4294967296,) and data type int64

它试图制作一个长度为indptr(4294967296,) 的矩阵。在我有限的 RAM 机器上产生错误。在你的身上,它一定是遇到某种内存管理/交换任务,这会减慢它的速度。

因此,即使nnz它很小,也是极端维度让这个速度变慢。


推荐阅读