Efficiently finding non-zero RGB pixels in Python

Problem description

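I have RGB images stored as NumPy ndarrays of shape (width, height, rgb), for example (1370, 5120, 3). I want an efficient way to get the x, y coordinates of the pixels whose RGB value is non-zero. Rather than iterating over every pixel with for loops, I am looking for a vectorized implementation that uses NumPy methods such as any or nonzero, because I have more than 30k images. Could someone help by providing some example implementations?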

Tags: python, numpy

Solution


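Looking for black pixels is the same as looking for pixels with r + g + b == 0, because the channel values are never negative. The non-black pixels are then the ones whose sum is not zero.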

import numpy as np

# Make fake data for testing
# 0th axis: x
# 1st axis: y
# 2nd axis: r, g, and b

size = (1370, 5120, 3)
# allocate new memory for the array (slow)
# fill the new array with random integers (vectorized C loop, medium speed)
# the upper bound is exclusive, so 256 covers the full uint8 range 0..255
img = np.random.randint(0, 256, size=size, dtype=np.uint8)

# allocate new memory for the s array         (slow)
s = np.sum(img, axis=2)    # sum              (vectorized c loop, fastest)
                           # sum on last axis is fastest because that adds
                           # up numbers next to each other in the memory.
# allocate new memory for the non_black array (slow)
non_black = (s != 0)       # comparison       (vectorized c loop, fast)
# allocate new memory for the x, y arrays     (slow)
x, y = np.where(non_black) # scan & convert   (c loop, slower)

print(x, y)
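
As a quick sanity check, the same steps can be wrapped in a helper and run on a tiny image where the answer is obvious. This is only a minimal sketch: the function name nonzero_pixel_coords is made up for illustration and is not part of the original answer, and it assumes an array of shape (width, height, 3) with a non-negative dtype such as uint8.

```python
import numpy as np

def nonzero_pixel_coords(img):
    """Return (x, y) index arrays of pixels whose r + g + b is non-zero.

    Hypothetical helper; assumes img has shape (width, height, 3)
    and a non-negative dtype such as uint8.
    """
    s = np.sum(img, axis=2)   # per-pixel channel sum, shape (width, height)
    return np.where(s != 0)   # indices along axis 0 and axis 1

# 2 x 3 test image, all black except two pixels
tiny = np.zeros((2, 3, 3), dtype=np.uint8)
tiny[0, 1] = (10, 0, 0)
tiny[1, 2] = (0, 0, 255)

x, y = nonzero_pixel_coords(tiny)
print(x, y)  # [0 1] [1 2]
```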

Allocating memory for the img array is slow, but loading the image from disk is far slower than the allocation.

If you don't do much computation on the images, then the compression format of the images and the speed of the drive (for example a fast solid-state drive) will be the major factors in the overall speed. If computation and I/O take roughly the same amount of time, use one thread to load the data from disk while another thread does the computation.
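
The load-while-computing idea could look roughly like the sketch below. It is not a tuned pipeline: load_image is a hypothetical stand-in that fabricates small arrays instead of reading files, and count_non_black and process_all are illustrative names. The overlap only pays off if the real loader releases the GIL while it waits, which blocking file reads generally do.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def load_image(index):
    """Hypothetical loader; replace the body with real disk I/O."""
    img = np.zeros((4, 5, 3), dtype=np.uint8)
    img[index % 4, index % 5] = (1, 2, 3)   # one non-black pixel per image
    return img

def count_non_black(img):
    """Number of pixels whose channel sum is non-zero."""
    return int(np.count_nonzero(np.sum(img, axis=2)))

def process_all(n_images):
    counts = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load_image, 0)              # start the first load
        for i in range(n_images):
            img = pending.result()                        # wait for image i
            if i + 1 < n_images:
                pending = pool.submit(load_image, i + 1)  # prefetch image i + 1
            counts.append(count_non_black(img))           # compute meanwhile
    return counts

counts = process_all(6)
print(counts)  # [1, 1, 1, 1, 1, 1]
```

Only one image is prefetched at a time here, which keeps memory use bounded at two images.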

Transforming the img array directly is often faster than working with the coordinates, provided you have enough memory to hold the img array. Transforming the img in place is faster still because it avoids allocating new memory for the result.
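
As a small illustration of an in-place transformation, a boolean mask can be used to overwrite pixels without ever materializing coordinates. The operation chosen here (painting every non-black pixel white) is just an assumed example; the original answer does not say which transformation is needed.

```python
import numpy as np

img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (10, 20, 30)
img[1, 2] = (0, 0, 5)

non_black = np.sum(img, axis=2) != 0   # boolean mask, shape (2, 3)

# Boolean-mask assignment writes into img itself: every non-black pixel
# becomes white, and no coordinate arrays or copy of img are created.
img[non_black] = 255

print(img[0, 1].tolist(), img[1, 2].tolist(), img[0, 0].tolist())
# [255, 255, 255] [255, 255, 255] [0, 0, 0]
```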

Coordinates use less memory than the image if the img contains lots of black pixels, which may well be the case for a mostly black black-and-white image. Try a sparse matrix if you have lots of black pixels and memory is an issue.
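
To get a feel for when coordinates are the smaller representation, the byte counts can be compared directly. The sketch below uses a made-up 100 x 200 image in which 1% of the pixels are non-black; the exact index size depends on the platform, so the break-even fraction is computed rather than assumed.

```python
import numpy as np

img = np.zeros((100, 200, 3), dtype=np.uint8)
img[:10, :20] = 255                  # 200 of 20000 pixels are non-black (1%)

x, y = np.nonzero(np.sum(img, axis=2))

coord_bytes = x.nbytes + y.nbytes    # two index arrays, one entry per hit
print(img.nbytes, coord_bytes)       # image bytes vs coordinate bytes

# Break-even fraction of non-black pixels: 3 bytes per pixel in the image
# versus two indices per non-black pixel in the coordinate arrays.
per_hit = x.itemsize + y.itemsize
print(3 / per_hit)                   # 0.1875 when each index takes 8 bytes
```

With 8-byte indices, coordinates stop being smaller once more than roughly 19% of the pixels are non-black.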

Comparison is slower than sum. The comparison can probably be replaced with a conversion from int to boolean.
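
A sketch of that replacement is shown below. Casting the sum array to bool gives the same mask as the comparison, and np.nonzero accepts the integer array directly, so the mask can be skipped altogether. Whether either variant is actually faster on your data is something to measure (for example with timeit) rather than assume.

```python
import numpy as np

img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (10, 0, 0)
img[1, 2] = (0, 0, 255)

s = np.sum(img, axis=2)

mask_cmp = (s != 0)          # explicit comparison
mask_cast = s.astype(bool)   # int -> bool: zero is False, non-zero is True

# np.nonzero works on the integer array itself, no boolean mask needed.
x, y = np.nonzero(s)

print(np.array_equal(mask_cmp, mask_cast))  # True
print(x, y)  # [0 1] [1 2]
```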

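The np.all(image_arr != [0, 0, 0], axis=-1) method requires 3 comparisons and 1 logical_and per pixel, and its intermediate comparison result has as many elements as the image (one boolean per channel). Note also that np.all selects pixels in which every channel is non-zero; np.any is the variant that matches "non-black". The method here needs 1 sum and 1 comparison per pixel, and its two intermediate arrays (the sum and the mask) have one element per pixel rather than one per channel, although the sum array uses a wider integer dtype than uint8 by default.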
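
The choice between all and any matters whenever a pixel has some channels at zero and others not. The small example below, built on a made-up 1 x 3 image, shows that the sum-based mask agrees with np.any and not with np.all.

```python
import numpy as np

img = np.zeros((1, 3, 3), dtype=np.uint8)
img[0, 1] = (10, 0, 0)     # non-black, but two channels are zero
img[0, 2] = (1, 2, 3)      # every channel non-zero

all_nonzero = np.all(img != 0, axis=-1)   # every channel non-zero
any_nonzero = np.any(img != 0, axis=-1)   # at least one channel non-zero
sum_nonzero = np.sum(img, axis=2) != 0    # the sum-based approach

print(all_nonzero.tolist())  # [[False, False, True]]
print(any_nonzero.tolist())  # [[False, True, True]]
print(np.array_equal(any_nonzero, sum_nonzero))  # True
```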

But all of this optimization of the computation is pointless if the I/O takes an eternity compared to the computation time.

