Keep only keys without None values in a dictionary built from a pandas groupby

Problem description

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1,1,1,2,2,3,3,3,3,4,4,5,5],
... 'b': [0,1,1,0,1,0,0,1,4,1,0,3,0],
... 'v': [2,4,3,7,6,5,9,3,2,4,5,2,3]})
>>> df
    a  b  v
0   1  0  2
1   1  1  4
2   1  1  3
3   2  0  7
4   2  1  6
5   3  0  5
6   3  0  9
7   3  1  3
8   3  4  2
9   4  1  4
10  4  0  5
11  5  3  2
12  5  0  3

>>> df.groupby(by =['a', 'b']).v.apply(list).unstack().to_dict('index')
{1: {0: [2], 1: [4, 3], 3: None, 4: None}, 2: {0: [7], 1: [6], 3: None, 4: 
None}, 3: {0: [5, 9], 1: [3], 3: None, 4: [2]}, 4: {0: [5], 1: [4], 3: None, 4: 
None}, 5: {0: [3], 1: None, 3: [2], 4: None}}

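How can I avoid keys with None values in the output dictionary? As it stands, my dictionary ends up 20 times larger than it would be with only the keys I need.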

Tags: python, pandas

Solution


d = df.groupby(by =['a', 'b']).v.apply(list).unstack().to_dict('index')
d = {k: {kk: vv for kk, vv in v.items() if vv is not None} for k, v in d.items()}

# d == {1: {0: [2], 1: [4, 3]}, 2: {0: [7], 1: [6]}, 3: {0: [5, 9], 1: [3], 4: [2]}, 4: {0: [5], 1: [4]}, 5: {0: [3], 3: [2]}}

You can also do this in a single line by replacing d in the second line with your df chain.
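As an alternative sketch (not part of the original answer), the nested dictionary can be built straight from the grouped Series, skipping unstack() altogether. Because unstack() is what introduces a placeholder for every missing (a, b) combination, this approach never creates those entries in the first place and does not depend on which placeholder value is used. The variable name grouped is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5],
                   'b': [0, 1, 1, 0, 1, 0, 0, 1, 4, 1, 0, 3, 0],
                   'v': [2, 4, 3, 7, 6, 5, 9, 3, 2, 4, 5, 2, 3]})

# One list of v values per existing (a, b) pair; the result is a Series
# indexed by a two-level MultiIndex, with no entries for absent pairs.
grouped = df.groupby(['a', 'b'])['v'].apply(list)

# Fold the (a, b) -> list mapping into a nested dict {a: {b: list}}.
d = {}
for (a, b), values in grouped.items():
    d.setdefault(a, {})[b] = values

print(d)
```

This avoids materializing the wide intermediate table, which may matter when most (a, b) combinations are absent, as in the question.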

