Flattened pivot table values should be adjacent by column

Question

After flattening, I get (Completed, A), (Completed, B), (Completed, C), but I want (Completed, A), (Not_Completed, A), (Completed, B), (Not_Completed, B).

import pandas as pd

x = pd.DataFrame({'P': ['A', 'B', 'C'], 'id': ['1100', '1101', '1102'], 'Completed': [10, 4, 8], 'Not_Completed': [6, 2, 2]})


   P    id  Completed  Not_Completed
0  A  1100         10              6
1  B  1101          4              2
2  C  1102          8              2

x = x.sort_values(by='Completed', ascending=False)
x1 = x.pivot_table(index='id', columns=['P'], values=['Completed', 'Not_Completed'], fill_value=0)
flattened = pd.DataFrame(x1.to_records())
flattened

Actual columns:

id (Completed, A), (Completed, B), (Completed, C), (Not_Completed, A), (Not_Completed, B), (Not_Completed, C)

Expected columns:

id (Completed, A), (Not_Completed, A), (Completed, C), (Not_Completed, C), (Completed, B), (Not_Completed, B)

Tags: python, pandas, pivot-table

Answer


You can sort x1's columns on the second level (P) before flattening:

flattened = pd.DataFrame(x1.sort_index(level=1, axis=1)
                           .to_records())
flattened

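This puts Completed and Not_Completed next to each other for each value of P, with the P values in sorted order (A, B, C).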
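As a quick check of that claim, here is a minimal self-contained sketch that rebuilds the question's data and inspects the column order before flattening. The name `sorted_x1` is introduced here only for illustration.

```python
import pandas as pd

# Sample data from the question
x = pd.DataFrame({
    'P': ['A', 'B', 'C'],
    'id': ['1100', '1101', '1102'],
    'Completed': [10, 4, 8],
    'Not_Completed': [6, 2, 2],
})

x1 = x.pivot_table(index='id', columns=['P'],
                   values=['Completed', 'Not_Completed'], fill_value=0)

# Sort the column MultiIndex by level 1 (P) first; the remaining level
# (the value name) is used as a tie-breaker because sort_remaining=True
# is the default.
sorted_x1 = x1.sort_index(level=1, axis=1)
sorted_cols = list(sorted_x1.columns)
print(sorted_cols)
```

The printed tuples should pair each P value's Completed column with its Not_Completed column.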

If you have a dictionary that gives each P value a sort weight (larger values first):

orders = {'A':10, 'B':4, 'C':8}

then you can sort the columns in that order:

cols = sorted(list(x1.columns), 
              key=lambda x: (-orders[x[1]], x[0]) 
             )

flattened = pd.DataFrame(x1[cols].to_records())
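If no such dictionary exists yet, one option is to derive it from the data. The sketch below assumes the intended order is "highest total Completed first", which matches the weights shown above but is not stated in the question. It also flattens the column names by joining the two levels with an underscore, as an alternative to `to_records()`. The names `ordered` and `flat` are hypothetical.

```python
import pandas as pd

# Sample data from the question
x = pd.DataFrame({
    'P': ['A', 'B', 'C'],
    'id': ['1100', '1101', '1102'],
    'Completed': [10, 4, 8],
    'Not_Completed': [6, 2, 2],
})

x1 = x.pivot_table(index='id', columns=['P'],
                   values=['Completed', 'Not_Completed'], fill_value=0)

# Assumption: the sort weight of each P value is its total Completed count.
orders = x.groupby('P')['Completed'].sum().to_dict()

# Sort by descending weight of P, then by value name within each P.
cols = sorted(x1.columns, key=lambda c: (-orders[c[1]], c[0]))
ordered = x1[cols]

# Flatten the two-level column names into plain strings such as 'Completed_A'.
flat = ordered.copy()
flat.columns = [f'{value}_{p}' for value, p in ordered.columns]
flat = flat.reset_index()
print(list(flat.columns))
```

Deriving the weights from the same frame keeps the ordering in step with the data, at the cost of hard-coding one particular ranking rule.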
