Getting unique values in Pandas when the values are lists

Problem description

I have a DataFrame in which one column contains lists.

In [64]: df[~df['packet/net/sourceRoute'].isnull()]['packet/net/sourceRoute']                       
Out[64]: 
2177              [fd00::2]
2178              [fd00::2]
2182              [fd00::2]
3860     [fd00::2, fd00::3]
3861     [fd00::2, fd00::3]
                ...        
21329             [fd00::8]
21331    [fd00::7, fd00::8]
21354             [fd00::8]
21355             [fd00::8]
21358             [fd00::8]
Name: packet/net/sourceRoute, Length: 105, dtype: object

I want to get the unique values of the column packet/net/sourceRoute. However, when I apply the unique() method, I get this error.

In [70]: df['packet/net/sourceRoute'].unique()                                                      
---------------------------------------------------------------------------

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.unique()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable._unique()

TypeError: unhashable type: 'list'

Dropping duplicates fails as well.

In [73]: df[~df['packet/net/sourceRoute'].isnull()]['packet/net/sourceRoute'].drop_duplicates()     
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: unhashable type: 'list'

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-73-fef53b94129b> in <module>
----> 1 df[~df['packet/net/sourceRoute'].isnull()]['packet/net/sourceRoute'].drop_duplicates()


SystemError: <built-in function duplicated_object> returned a result with an error set

Any ideas?

Thanks!

Tags: python, pandas, duplicates, unique

Solution
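
Both errors come from the same place: unique() and drop_duplicates() hash each value, and Python lists are not hashable. One possible workaround, sketched here rather than taken from the original thread, is to convert each list to a tuple first, since tuples are hashable. The small DataFrame below is a hypothetical stand-in for the asker's data, and the variable names are illustrative. The last line assumes pandas 0.25 or newer, where Series.explode() is available.

```python
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame (values are assumptions).
df = pd.DataFrame({
    "packet/net/sourceRoute": [
        ["fd00::2"],
        ["fd00::2"],
        None,
        ["fd00::2", "fd00::3"],
        ["fd00::2", "fd00::3"],
        ["fd00::8"],
        ["fd00::7", "fd00::8"],
    ]
})

# Drop missing rows, as the question does with ~isnull().
routes = df["packet/net/sourceRoute"].dropna()

# Lists are unhashable; tuples are hashable, so convert each route.
route_tuples = routes.apply(tuple)

# Unique routes, in order of first appearance.
unique_routes = route_tuples.unique()

# Same idea with drop_duplicates(), which keeps the original index labels.
deduplicated = route_tuples.drop_duplicates()

# If the goal is the unique individual addresses rather than unique routes,
# flatten the lists first (requires pandas >= 0.25 for explode()).
unique_addresses = routes.explode().unique()
```

If the rows of the original DataFrame are needed rather than just the values, the index of deduplicated can be passed to df.loc to select them.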

