python - pandas: sort by the order of appearance
Problem description
Suppose we have a dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.zeros((15, 10)), dtype=int,
                  index=[['a']*5 + ['b']*5 + ['c']*5, list(range(15))])
df.index.names = ['index0', 'index1']
np.random.seed(0)
for i in range(len(df)):
    # put a single 1 in a random column of each row
    df.iloc[i, np.random.randint(10)] = 1
df
               0  1  2  3  4  5  6  7  8  9
index0 index1
a      0       0  0  0  0  0  1  0  0  0  0
       1       1  0  0  0  0  0  0  0  0  0
       2       0  0  0  1  0  0  0  0  0  0
       3       0  0  0  1  0  0  0  0  0  0
       4       0  0  0  0  0  0  0  1  0  0
b      5       0  0  0  0  0  0  0  0  0  1
       6       0  0  0  1  0  0  0  0  0  0
       7       0  0  0  0  0  1  0  0  0  0
       8       0  0  1  0  0  0  0  0  0  0
       9       0  0  0  0  1  0  0  0  0  0
c      10      0  0  0  0  0  0  0  1  0  0
       11      0  0  0  0  0  0  1  0  0  0
       12      0  0  0  0  0  0  0  0  1  0
       13      0  0  0  0  0  0  0  0  1  0
       14      0  1  0  0  0  0  0  0  0  0
How can I first sort the rows within each of the blocks a, b and c by the column in which "1" appears, and then sort the blocks a, b and c themselves by the same criterion?
Expected output:
               0  1  2  3  4  5  6  7  8  9
index0 index1
a      1       1  0  0  0  0  0  0  0  0  0
       2       0  0  0  1  0  0  0  0  0  0
       3       0  0  0  1  0  0  0  0  0  0
       0       0  0  0  0  0  1  0  0  0  0
       4       0  0  0  0  0  0  0  1  0  0
c      14      0  1  0  0  0  0  0  0  0  0
       11      0  0  0  0  0  0  1  0  0  0
       10      0  0  0  0  0  0  0  1  0  0
       12      0  0  0  0  0  0  0  0  1  0
       13      0  0  0  0  0  0  0  0  1  0
b      8       0  0  1  0  0  0  0  0  0  0
       6       0  0  0  1  0  0  0  0  0  0
       9       0  0  0  0  1  0  0  0  0  0
       7       0  0  0  0  0  1  0  0  0  0
       5       0  0  0  0  0  0  0  0  0  1
Edit: The values are not necessarily "1"; in reality they are different text values.
Solution
One approach is to use pandas.DataFrame.groupby with idxmax and sort_values:
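import pandas as pd
# within each index0 block, order the rows by the column that holds their maximum (the 1)
l = (d.loc[d.idxmax(axis=1).sort_values().index] for _, d in df.groupby('index0'))
# order the blocks by their per-column sums, compared as lists, largest first
new_df = pd.concat(sorted(l, key=lambda x: list(x.sum()), reverse=True))
print(new_df)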
Output:
               0  1  2  3  4  5  6  7  8  9
index0 index1
a      1       1  0  0  0  0  0  0  0  0  0
       2       0  0  0  1  0  0  0  0  0  0
       3       0  0  0  1  0  0  0  0  0  0
       0       0  0  0  0  0  1  0  0  0  0
       4       0  0  0  0  0  0  0  1  0  0
c      14      0  1  0  0  0  0  0  0  0  0
       11      0  0  0  0  0  0  1  0  0  0
       10      0  0  0  0  0  0  0  1  0  0
       12      0  0  0  0  0  0  0  0  1  0
       13      0  0  0  0  0  0  0  0  1  0
b      8       0  0  1  0  0  0  0  0  0  0
       6       0  0  0  1  0  0  0  0  0  0
       9       0  0  0  0  1  0  0  0  0  0
       7       0  0  0  0  0  1  0  0  0  0
       5       0  0  0  0  0  0  0  0  0  1
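To see why this ordering comes out as a, c, b, it may help to look at the intermediate values. The sketch below rebuilds the question's frame without random numbers (the `ones` list is simply read off the table in the question) and prints the per-row column found by idxmax and the per-block sums used as sort keys. The names `ones`, `first_col`, `keys` and `block_order` are only illustrative.

```python
import numpy as np
import pandas as pd

# Column of the single 1 in each row, read off the table in the question.
ones = [5, 0, 3, 3, 7, 9, 3, 5, 2, 4, 7, 6, 8, 8, 1]
data = np.zeros((15, 10), dtype=int)
data[np.arange(15), ones] = 1
df = pd.DataFrame(
    data,
    index=pd.MultiIndex.from_arrays(
        [list("aaaaabbbbbccccc"), list(range(15))], names=["index0", "index1"]
    ),
)

# Step 1: idxmax(axis=1) gives, per row, the column label that holds the 1.
first_col = df.idxmax(axis=1)
print(first_col.loc["a"].tolist())  # [5, 0, 3, 3, 7]

# Step 2: the per-block column sums are the keys used to order the blocks.
keys = {name: [int(v) for v in d.sum()] for name, d in df.groupby("index0")}
print(keys["a"])  # [1, 0, 0, 2, 0, 1, 0, 1, 0, 0]

# Step 3: lists compare element by element, so reverse=True puts the block
# with the most 1s in the leftmost columns first.
block_order = sorted(keys, key=keys.get, reverse=True)
print(block_order)  # ['a', 'c', 'b']
```

Block a wins because it is the only block with a 1 in column 0; c beats b because it has a 1 in column 1 and b does not.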
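If the 1s are actually text values and everything else stays the same, try pandas.DataFrame.ne:
tmp = df.ne(0)
# same operation as above, applied to the boolean mask instead of df
l = (d.loc[d.idxmax(axis=1).sort_values().index] for _, d in tmp.groupby('index0'))
new_tmp = pd.concat(sorted(l, key=lambda x: list(x.sum()), reverse=True))
new_df = df.loc[new_tmp.index]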
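For the text-valued case, here is a self-contained sketch of the same idea wrapped in a function. It assumes that 0 is the placeholder for "no value" and that every row holds at least one value. The helper name `sort_by_appearance`, the sample frame `df_text` and the use of a stable sort (so that tied rows keep their original order) are choices made for this illustration, not part of pandas or of the answer above.

```python
import pandas as pd


def sort_by_appearance(df, level="index0"):
    """Hypothetical helper: sort rows inside each block, then the blocks."""
    mask = df.ne(0)  # True wherever a cell holds a value instead of 0
    # Position of the first True in every row (0 if a row has no True at all).
    first_pos = pd.Series(mask.to_numpy().argmax(axis=1), index=df.index)
    blocks = []
    for _, pos in first_pos.groupby(level=level):
        # A stable sort keeps the original order of rows that tie.
        blocks.append(pos.sort_values(kind="stable").index)
    # Compare the per-column counts as lists, largest first, so blocks with
    # values in earlier columns come first.
    blocks.sort(key=lambda idx: list(mask.loc[idx].sum()), reverse=True)
    return pd.concat([df.loc[idx] for idx in blocks])


# Made-up sample data: 0 marks an empty cell, any string is a value.
df_text = pd.DataFrame(
    [[0, 0, "foo"], [0, "bar", 0], ["baz", 0, 0], [0, "qux", 0]],
    index=pd.MultiIndex.from_arrays(
        [["a", "a", "b", "b"], [0, 1, 2, 3]], names=["index0", "index1"]
    ),
)

result = sort_by_appearance(df_text)
print(result.index.tolist())  # [('b', 2), ('b', 3), ('a', 1), ('a', 0)]
```

Working on the boolean mask avoids summing or comparing the strings themselves; the original frame is only reindexed at the end.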