首页 > 解决方案 > 如何根据 2 个自定义列表对数据集进行排序?

问题描述

我想根据 2 个自定义词典对 df 进行排序:

custom_dict = {'HC': 0, 'AMG HC': 1, 'S': 2, 'AMG S': 3,
               'HCA':4, 'AMG HCA':5, 'MUP':6, 'AMG MUP':7}
custom_dict2 = {'Offline': 0, 'Online': 1, 'Independent':2}
df4 = df4.sort_values(by=['category','segment'], key=lambda x: x.map(custom_dict, custom_dict2))

我执行这段代码:

df4 = (df4.set_index(['category', 'segment'], append=True)
   .unstack()
   .swaplevel(axis=1)
   .sort_index(level=0, axis=1, ascending=False)
   .reindex([ytd1, ytd2, 'Evolution'], level=1, axis=1)
   .reset_index('category')
#   .sort_index(level=-1, axis=0)
    )

但它不会打印预期的输出。我想要以下顺序:离线的 HC,离线的 AMG HC,在线的 HC,在线的 AMG HC,独立的 HC,独立的 AMG HC,在线的 S,在线的 AMF 等。

你能帮我怎么做吗?

数据集样本:

df4 = {'category': {0: 'HC', 1: 'AMG HC', 2: 'S', 3: 'AMG MUP', 4: 'MUP', 5: 'S', 6: 'AMG S', 7: 'HCA'}, 'segment': {0: 'Offline', 1: 'Offline', 2: 'Offline', 3: 'Offline', 4: 'Online', 5: 'Online', 6: 'Offline', 7: 'Independent'}, 'ytd2020': {0: '101142', 1: '38541', 2: '55653', 3: '19561', 4: '84921', 5: '99301', 6: '80212', 7: '95731'}, 'ytd2021': {0: '105726', 1: '39463', 2: '57537', 3: '21402', 4: '90310', 5: '97283', 6: '87011', 7: '119289'}, 'Evolution': {0: '4.5%', 1: '2.4%', 2: '2.1%', 3: '4.3%', 4: '8.2%', 5: '-2.0%', 6: '12.4%', 7: '24.6%'}}

标签: pythonpandas

解决方案


您可以将数据设置为有序类别。一旦类别到位,有几个优点,更小的内存消耗和更快的排序:

df4['category'] = pd.Categorical(df4['category'],
                                 categories=list(custom_dict),
                                 ordered=True)
df4['segment'] = pd.Categorical(df4['segment'],
                                categories=list(custom_dict2),
                                ordered=True)

df4 = df4.sort_values(by=['category','segment'])

注意。您不需要此解决方案的字典,具有所需顺序的类别列表就足够了

示例输出(来自随机输入):

   category      segment
11       HC      Offline
14       HC      Offline
1        HC       Online
5        HC  Independent
16       HC  Independent
19   AMG HC      Offline
15   AMG HC       Online
3         S      Offline
4         S      Offline
0         S  Independent
12        S  Independent
9     AMG S       Online
10    AMG S  Independent
2       HCA      Offline
6       HCA      Offline
17      HCA  Independent
7   AMG HCA      Offline
13  AMG HCA  Independent
8       MUP       Online
18  AMG MUP  Independent

推荐阅读