Get the indices of rows to drop where either of two columns is zero

Problem description

These are my columns:

'CD Block_Code','Total Population Female','Illiterate Female','Total/Rural/Urban'

I want to drop the rows where the total female population is zero or the number of illiterate females is zero.

Code

df_cleaned = df.copy(deep=True)

entry_to_remove = []

for index, col in df.iterrows():
    if col['Total Population Female'] == '0' or col['Illiterate Female'] == '0':
        entry_to_remove.append(index)

    print("entry_to_remove: {}".format(len(entry_to_remove)))

df_cleaned.drop(entry_to_remove, axis=0, inplace=True)

df_cleaned.head(3)

When I run the last piece of code, it gives me zero rows, although only 634 rows actually contain zeros.

So there will be 4 clusters, and I want to get the data for each of the 4 clusters separately and analyze it further.

Tags: python, dataframe, filter

Solution


A simpler approach is boolean indexing with two conditions:

df[(df['Illiterate Female']!=0) & (df['Total Population Female']!=0)]

Example:

>>> df
   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
2              0                  1                        0
3              0                  0                        1

>>> df[(df['Illiterate Female']!=0) & (df['Total Population Female']!=0)]
   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
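
If you need the index labels of the rows to drop (as the question's title asks) rather than the filtered frame, the same idea can be inverted. The following is a minimal sketch that rebuilds the example frame above; the names `cols` and `has_zero` are introduced here purely for illustration.

```python
import pandas as pd

# Sample data mirroring the example above.
df = pd.DataFrame({
    "CD Block_Code": [0, 0, 0, 0],
    "Illiterate Female": [1, 1, 1, 0],
    "Total Population Female": [1, 1, 0, 1],
})

cols = ["Illiterate Female", "Total Population Female"]

# True for every row where at least one of the two columns is zero.
has_zero = (df[cols] == 0).any(axis=1)

# Index labels of the rows to remove.
entry_to_remove = df.index[has_zero]

# Drop those rows; the original df is left unchanged.
df_cleaned = df.drop(entry_to_remove)
```

This avoids the explicit `iterrows()` loop while still producing a list of labels that can be inspected or counted before dropping.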

You can also filter on the underlying NumPy array, which may be faster for large DataFrames but is admittedly less readable:

df[(df[['Illiterate Female','Total Population Female']].values != 0).all(1)]

   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
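
One thing worth checking: the loop in the question compares against the string '0', while the filters above compare against the integer 0. If the two do not match the column's actual dtype, the comparison will silently never match. The sketch below assumes, hypothetically, that the counts were read in as strings, and converts them with `pd.to_numeric` before filtering; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical data in which the counts were loaded as strings.
df = pd.DataFrame({
    "CD Block_Code": ["1", "2", "3", "4"],
    "Total Population Female": ["120", "0", "85", "40"],
    "Illiterate Female": ["30", "0", "0", "12"],
})

cols = ["Total Population Female", "Illiterate Female"]

# Convert both columns to numbers; unparseable values become NaN.
numeric = df[cols].apply(pd.to_numeric, errors="coerce")

# Keep rows where both columns are non-zero.
# Note: NaN != 0 evaluates to True, so rows with NaN are kept here.
keep = (numeric != 0).all(axis=1)
df_cleaned = df[keep]
```

Running `df.dtypes` on the real data first shows whether this conversion is needed at all.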
