Numpy: an efficient way to extract ranges of consecutive numbers

Question

Suppose one has a numpy array of floats where all values are in [0, 1], e.g.

import numpy as np

arr = np.array([
    [0.1,  0.1,  0.1,  0.4,  0.91, 0.81, 0.84], # channel 1 
    [0.81, 0.79, 0.85, 0.1,  0.2,  0.61, 0.91], # channel 2 
    [0.3,  0.1,  0.24, 0.87, 0.62, 1,    0   ], # channel 3
    #...
])

and one wants to convert it to a binary array. This can easily be done with:

def binary_mask(arr, cutoff=0.5):
  return (arr > cutoff).astype(int)

m = binary_mask(arr)

# array([[0, 0, 0, 0, 1, 1, 1],
#    [1, 1, 1, 0, 0, 1, 1],
#    [0, 0, 0, 1, 1, 1, 0]])

The indices of all the 1s can then be obtained with:

for channel in m:
  print(channel.nonzero())

# (array([4, 5, 6]),)
# (array([0, 1, 2, 5, 6]),)
# (array([3, 4, 5]),)

What is an efficient way to get the runs of consecutive indices as inclusive [start, end] pairs?

For example:

[ 
    [[4,6]], 
    [[0,2], [5,6]], 
    [[3,5]]
]

A naive approach might be:

def consecutive_integers(arr):

    # indexes of nonzero
    nz = arr.nonzero()[0]

    # storage of "runs"
    runs = []

    # error handle all zero array
    if not len(nz):
        return [[]]

    # run starts with current value
    run_start = nz[0]
    for i in range(len(nz)-1):

        # if run is broken
        if nz[i]+1 != nz[i+1]:
            # store run
            runs.append([run_start, nz[i]])

            # start next run at next number (only when the current run broke)
            run_start = nz[i+1]

    # last run ends with last value
    runs.append([run_start, nz[-1]])

    return runs



print(m)
for runs in [consecutive_integers(c) for c in m]:
    print(runs)



# [[0 0 0 0 1 1 1]
#  [1 1 1 0 0 1 1]
#  [0 0 0 1 1 1 0]]
# 
# [[4, 6]]
# [[0, 2], [5, 6]]
# [[3, 5]]

Tags: python, numpy

Answer


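I would take a look at this answer: https://stackoverflow.com/a/7353335/1141389

It uses np.split to cut the array of nonzero indices wherever two neighbouring indices differ by more than 1.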
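
As a rough illustration of the np.split idea, here is a minimal sketch applied to the binary mask from the question. The function name consecutive_runs is hypothetical (it is not taken from the linked answer), and the sketch assumes each channel is a 1-D array. Unlike the naive version in the question, it returns an empty list rather than [[]] for an all-zero channel.

```python
import numpy as np

def consecutive_runs(channel):
    """Return inclusive [start, end] index pairs for each run of nonzero values.

    Hypothetical helper for illustration; assumes `channel` is 1-D.
    """
    nz = np.flatnonzero(channel)           # indices of the nonzero entries
    if nz.size == 0:
        return []                          # no runs in an all-zero channel
    # Positions in nz where the next index is not the previous index + 1
    breaks = np.flatnonzero(np.diff(nz) != 1) + 1
    groups = np.split(nz, breaks)          # one sub-array per consecutive run
    return [[int(g[0]), int(g[-1])] for g in groups]

m = np.array([
    [0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 0],
])

result = [consecutive_runs(c) for c in m]
print(result)
# [[[4, 6]], [[0, 2], [5, 6]], [[3, 5]]]
```

The Python-level loop here runs once per run rather than once per index, so the per-element work (np.flatnonzero, np.diff, np.split) stays inside numpy.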

