Pandas: remove groups in which every element has the same value in a column

Problem description

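I need to do a groupby on a df and then, within each group, check whether every element in the group has the same value in column A. If so, I want to remove that group: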

 df['cluster_id'] = df.groupby(['B', 'C', 'D'])['B'].transform('size')

 df = df.loc[
        df['cluster_id'] > 1 &
        df['cluster_id'] == df['cluster_id'] &
        df['A'] != df['A']]

But I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I would like to know how to fix it.

Tags: python-3.x, pandas, dataframe, pandas-groupby

Solution


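I think the parentheses are missing. In Python, & binds more tightly than comparison operators such as >, == and !=, so each comparison has to be wrapped in ():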

df = df[(df['cluster_id'] > 1) & (df['cluster_id'] == df['cluster_id']) & (df['A'] != df['A'])]
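
To illustrate why the parentheses matter, here is a minimal sketch on a small, made-up Series (the name `s` and its values are assumptions for illustration, not taken from the question). Without parentheses the expression is parsed as a chained comparison, which forces pandas to convert a whole Series to a single boolean and raises the ValueError shown above.

```python
import pandas as pd

# Hypothetical data, only for illustration.
s = pd.Series([1, 2, 3])

# Without parentheses, & binds tighter than > and ==, so this line is parsed as
# the chained comparison  s > (1 & s) == s,  which calls bool() on a Series.
try:
    mask = s > 1 & s == s
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison is evaluated first and & then combines
# the resulting boolean Series element-wise.
mask = (s > 1) & (s == s)

print(raised)
print(mask.tolist())
```

With the parentheses in place, `mask` is an ordinary boolean Series that can be used for indexing.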

Also, the second condition does not seem to be necessary:

df = df[(df['cluster_id'] > 1) & (df['A'] != df['A'])]

A new column is not needed either; you can compare against the Series directly:

cluster_id = df.groupby(['B', 'C', 'D'])['B'].transform('size')

df = df[(cluster_id > 1) & (cluster_id == cluster_id) & (df['A'] != df['A'])]

df = df[(cluster_id > 1) & (df['A'] != df['A'])]
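
Note that `df['A'] != df['A']` compares a column with itself, so it is True only for rows where A is NaN. The filters above therefore fix the ValueError but may not express the goal stated in the question. One possible way to drop groups whose A values are all identical is to count the distinct A values per group, sketched below. The sample data and the names `n_unique` and `result` are assumptions for illustration; only the column names A, B, C, D come from the question.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': ['x', 'x', 'x', 'x', 'y'],
    'C': [0, 0, 1, 1, 2],
    'D': [0, 0, 1, 1, 2],
})

# Number of distinct A values in each (B, C, D) group, broadcast back to every row.
n_unique = df.groupby(['B', 'C', 'D'])['A'].transform('nunique')

# Keep only rows that belong to groups with more than one distinct A value.
# A single-row group has at most one distinct value, so it is dropped as well.
result = df[n_unique > 1]

print(result)
```

In this sample, the group (x, 0, 0) is dropped because both of its rows have A equal to 1, the single-row group (y, 2, 2) is dropped too, and the group (x, 1, 1) is kept. By default `nunique` ignores NaN, so a group with one value plus NaN would count as constant under this sketch.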
