How to split an array by indices so that the resulting sub-arrays include the split point

Problem description

I have a 2D array of values and a 1D array of indices at which I would like to split the 2D array, such that each resulting sub-array includes the row at the split point.

I know I can use the numpy.split function to split by indices and I know I can use stride_tricks to split an array for creating consecutive overlapping subset-views.

But it seems that stride_tricks only applies if we want to split an array into equal-sized sub-arrays.
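
For illustration, here is a minimal sketch of the equal-sized overlapping windows that stride tricks produce. It assumes NumPy 1.20 or later, where sliding_window_view is available; the names values and windows are only illustrative and not part of my actual data:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

values = np.arange(6)

# Every window has the same length (3), and consecutive windows overlap
# by two elements. The window length cannot vary from one window to the next.
windows = sliding_window_view(values, 3)

print(windows.shape)
print(windows)
```

Each window has a fixed length, which is why this does not directly cover splitting at arbitrary indices.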

As a minimal example, I can do the following:

>>> import numpy as np
>>> array = np.random.randint(0,10, (10,2))
>>> indices = np.array([2,3,8])
>>> array
array([[8, 1],
       [1, 0],
       [2, 0],
       [8, 8],
       [1, 6],
       [7, 8],
       [4, 4],
       [9, 4],
       [6, 7],
       [6, 4]])

>>> split_array = np.split(array, indices, axis=0)
>>> split_array
[array([[8, 1],
        [1, 0]]), 

 array([[2, 0]]), 

 array([[8, 8],
        [1, 6],
        [7, 8],
        [4, 4],
        [9, 4]]), 

 array([[6, 7],
        [6, 4]])]

What I'm looking for, though, is an option on the split function, something like include_split_point=True, that would give a result like this:

[array([[8, 1],
        [1, 0],
        [2, 0]]), 

 array([[2, 0],
        [8, 8]]), 

 array([[8, 8],
        [1, 6],
        [7, 8],
        [4, 4],
        [9, 4],
        [6, 7]]), 

 array([[6, 7],
        [6, 4]])]

Tags: python, arrays, numpy, split, stride

Solution


Create a new array in which the rows at the split indices are repeated

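# Repeat count per row: 1 for ordinary rows, 2 for rows at a split index
new_indices = np.zeros(array.shape[0], dtype=int)
new_indices[indices] = 1
new_indices += 1
# Duplicate each split row so both neighbouring sub-arrays can include it
new_array = np.repeat(array, new_indices, axis=0)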

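Update the indices to account for the changed array (each earlier split point adds one duplicated row, and the cut goes just after the first copy)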

indices = indices + np.arange(1, len(indices)+1)

Split as usual using the updated indices

np.split(new_array, indices, axis = 0)

Output:

[array([[8, 1],
        [1, 0],
        [2, 0]]), 
 array([[2, 0],
        [8, 8]]), 
 array([[8, 8],
        [1, 6],
        [7, 8],
        [4, 4],
        [9, 4],
        [6, 7]]), 
 array([[6, 7],
        [6, 4]])]
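
The steps above can be wrapped in a small helper. The sketch below is only one way to package them: the function names split_including_points and split_including_points_views are hypothetical, and both assume that indices is sorted, free of duplicates, and within bounds. The second function is an alternative that is not part of the answer above; it uses plain slicing, which returns overlapping views and avoids copying the data.

```python
import numpy as np


def split_including_points(array, indices):
    """Split along axis 0, repeating each split row in both neighbours.

    Hypothetical helper; assumes `indices` is sorted, unique and in range.
    """
    indices = np.asarray(indices)
    # Repeat count per row: 2 for split rows, 1 for all others.
    repeats = np.ones(array.shape[0], dtype=int)
    repeats[indices] = 2
    repeated = np.repeat(array, repeats, axis=0)
    # Each earlier duplicate shifts later rows by one; the extra +1 places
    # the cut just after the first copy of the split row.
    shifted = indices + np.arange(1, len(indices) + 1)
    return np.split(repeated, shifted, axis=0)


def split_including_points_views(array, indices):
    """Alternative sketch using slices, so each piece is a view (no copy)."""
    indices = np.asarray(indices)
    starts = np.concatenate(([0], indices))
    stops = np.concatenate((indices + 1, [len(array)]))
    return [array[a:b] for a, b in zip(starts, stops)]


# The array from the question, written out so the example is reproducible.
array = np.array([[8, 1], [1, 0], [2, 0], [8, 8], [1, 6],
                  [7, 8], [4, 4], [9, 4], [6, 7], [6, 4]])
indices = np.array([2, 3, 8])

parts = split_including_points(array, indices)
view_parts = split_including_points_views(array, indices)
```

Both calls should produce the same four sub-arrays as the output shown above. The slicing version may be preferable for large arrays, since np.repeat allocates a new, larger array.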
