How to remove NaN from numpy subarrays

Problem description

I have the following numpy array:

array([['0.0', '0.0'],
       ['3.0', '0.0'],
       ['3.5', '35000.0'],
       ['4.0', '70000.0'],
       ['4.2', 'nan'],
       ['4.5', '117000.0'],
       ['5.0', '165000.0'],
       ['5.2', 'nan'],
       ['5.5', '225000.0'],
       ['6.0', '285000.0'],
       ['6.2', 'nan'],
       ['6.5', '372000.0'],
       ['7.0', '459000.0'],
       ['7.5', '580000.0'],
       ['8.0', '701000.0'],
       ['8.1', 'nan'],
       ['8.5', '832000.0'],
       ['8.8', 'nan'],
       ['9.0', '964000.0'],
       ['9.5', '1127000.0'],
       ['33.0', 'nan'],
       ['35.0', 'nan']], dtype='<U12')

I want to remove every subarray (row) that contains a nan value.

The expected output is:

array([['0.0', '0.0'],
       ['3.0', '0.0'],
       ['3.5', '35000.0'],
       ['4.0', '70000.0'],
       ['4.5', '117000.0'],
       ['5.0', '165000.0'],
       ['5.5', '225000.0'],
       ['6.0', '285000.0'],
       ['6.5', '372000.0'],
       ['7.0', '459000.0'],
       ['7.5', '580000.0'],
       ['8.0', '701000.0'],
       ['8.5', '832000.0'],
       ['9.0', '964000.0'],
       ['9.5', '1127000.0']], dtype='<U12')

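I ended up trying np.isnan(array), but I got the error ufunc 'isnan' not supported for the input types. One idea I had while writing this was to split the array into two arrays, get the nan indices, apply the filter to both arrays, and merge them back. Any help is appreciated.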

Tags: python, numpy

Solution


TL;DR

a = a.astype(float); filtered = a[~np.isnan(a[:, 1])]

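np.isnan only works on numeric dtypes, which is why it fails on an array of strings ('<U12'). Assuming you want your numpy array as floats rather than strings: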

import numpy as np


# generate similar data (random, so the printed values vary between runs)
a = np.random.randint(low=0, high=20, size=(5, 2)).astype(str)
a[[0, 2, 3], 1] = 'nan'
print(a)
# [['15' 'nan']
#  ['17' '9']
#  ['15' 'nan']
#  ['5' 'nan']
#  ['14' '14']]

# convert to float first
a = a.astype(float)

# filter by np.nan
filtered = a[~np.isnan(a[:, 1])]

print(filtered)
# [[17.  9.]
#  [14. 14.]]
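
If the result should stay as strings, as in the expected output of the question, one possible variation is to use the float conversion only to build a row mask and then index the original string array with it. The following is a minimal sketch under that assumption; the sample data is a shortened copy of the array from the question, and the names mask and filtered are only illustrative.

```python
import numpy as np

# Shortened copy of the string array from the question
a = np.array([['0.0', '0.0'],
              ['3.5', '35000.0'],
              ['4.2', 'nan'],
              ['4.5', '117000.0'],
              ['35.0', 'nan']])

# Parse a temporary float copy only to build the row mask;
# any(axis=1) flags a row if either of its columns is NaN
mask = ~np.isnan(a.astype(float)).any(axis=1)

# Index the original array, so the result keeps its string dtype
filtered = a[mask]
print(filtered)
```

Because the mask checks every column, this also drops rows where the first column is nan. If the missing values are always spelled exactly 'nan', a plain string comparison such as a[a[:, 1] != 'nan'] would avoid the conversion altogether.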
