python - How to sort a DataFrame by two columns in Python?
Problem description
I have a DataFrame df (TCP packets) with four columns: server, client, seq, and ack. For example,
server client seq ack
A B 207876062 2372538506
A B 207876089 2372538616
B A 2372538590 207876089
A B 207876062 2372538590
B A 2372538506 207876062
I want to sort by the seq and ack columns in turn, so that the result is:
server client seq ack
A B 207876062 2372538506
B A 2372538506 207876062
A B 207876062 2372538590
B A 2372538590 207876089
A B 207876089 2372538616
Is there a way to sort the rows into this order?
Thanks
Solution
Assuming df is the DataFrame you want to process, I would do it like this:
import pandas as pd

# Step 1: split the dataframe into two sub-dataframes, one per server
df_a = df[df['server'] == 'A']
df_b = df[df['server'] == 'B']
# Step 2: sort each sub-dataframe by the fields seq and ack
df_a = df_a.sort_values(by=['seq', 'ack'])
df_b = df_b.sort_values(by=['seq', 'ack'])
# Step 3: add a sorting key (1, 2, 3, ...) to each sub-dataframe
df_a['sorting_key'] = range(1, df_a.shape[0] + 1)
df_b['sorting_key'] = range(1, df_b.shape[0] + 1)
# Step 4: shift the sorting key of the second dataframe (1.5, 2.5, ...)
# so that each B row lands right after the A row with the same rank
df_b['sorting_key'] = df_b['sorting_key'] + 0.5
# Step 5: concatenate the two dataframes and sort them by the sorting key
df_c = pd.concat([df_a, df_b]).sort_values(by=['sorting_key'])
# Step 6: clean up the result (fresh index, no helper column)
df_c = df_c.reset_index(drop=True).drop(['sorting_key'], axis=1)
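To try the steps above on the sample data from the question, a minimal self-contained sketch could look like the following. The function name `interleave_two_servers` and its `first`/`second` parameters are hypothetical, introduced only for illustration; the logic is the same split, sort, key, and concatenate sequence.

```python
import pandas as pd


def interleave_two_servers(df, first="A", second="B"):
    """Sort each server's rows by (seq, ack), then alternate first/second."""
    df_a = df[df["server"] == first].sort_values(by=["seq", "ack"]).copy()
    df_b = df[df["server"] == second].sort_values(by=["seq", "ack"]).copy()
    # Keys 1, 2, 3, ... for the first server and 1.5, 2.5, ... for the second
    df_a["sorting_key"] = range(1, len(df_a) + 1)
    df_b["sorting_key"] = [k + 0.5 for k in range(1, len(df_b) + 1)]
    out = pd.concat([df_a, df_b]).sort_values(by="sorting_key")
    return out.reset_index(drop=True).drop(columns="sorting_key")


# Sample rows copied from the question
df = pd.DataFrame({
    "server": ["A", "A", "B", "A", "B"],
    "client": ["B", "B", "A", "B", "A"],
    "seq": [207876062, 207876089, 2372538590, 207876062, 2372538506],
    "ack": [2372538506, 2372538616, 207876089, 2372538590, 207876062],
})

result = interleave_two_servers(df)
print(result)
```

On this sample the rows come out in the order requested in the question, alternating A, B, A, B, A.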
Update
If you do not know how many servers there are, just add loops like this:
import pandas as pd

# Step 1: split the dataframe into one sub-dataframe per server
# (unique() keeps first-appearance order; iterating a set() would make the order arbitrary)
sub_df = []
for e in df['server'].unique():
    sub_df.append(df[df['server'] == e])
# Step 2: sort each sub-dataframe by the fields seq and ack and add a sorting key
sub_df_1 = []
for tdf in sub_df:
    tdf = tdf.sort_values(by=['seq', 'ack'])
    tdf['sorting_key'] = range(1, tdf.shape[0] + 1)
    sub_df_1.append(tdf)
# Step 3: shift the sorting key of every dataframe after the first;
# each one gets a slightly larger offset (below 1) so it sorts after the previous ones
sub_df_2 = [sub_df_1[0]]
delta = 0.1
for tdf in sub_df_1[1:]:
    tdf['sorting_key'] = tdf['sorting_key'] + delta
    delta += delta / 10
    sub_df_2.append(tdf)
# Step 4: concatenate the dataframes and sort them by the sorting key
df_c = pd.concat(sub_df_2).sort_values(by=['sorting_key'])
# Step 5: clean up the result (fresh index, no helper column)
df_c = df_c.reset_index(drop=True).drop(['sorting_key'], axis=1)
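The same interleaving can probably be written without explicit loops. The sketch below is an alternative technique, not the code from the answer: it ranks rows within each server using `groupby(...).cumcount()` and then sorts by that rank, breaking ties by the order in which servers first appear. The function name `interleave_by_server` and the helper columns `_rank` and `_server_pos` are hypothetical names chosen for illustration.

```python
import pandas as pd


def interleave_by_server(df):
    """Round-robin the servers after ordering each server's rows by (seq, ack)."""
    ordered = df.sort_values(by=["seq", "ack"])
    # 0, 1, 2, ... within each server, following the (seq, ack) order
    rank = ordered.groupby("server").cumcount()
    # Servers are ordered by first appearance in the original frame
    first_seen = {s: i for i, s in enumerate(df["server"].unique())}
    ordered = ordered.assign(
        _rank=rank,
        _server_pos=ordered["server"].map(first_seen),
    )
    ordered = ordered.sort_values(by=["_rank", "_server_pos"])
    return ordered.drop(columns=["_rank", "_server_pos"]).reset_index(drop=True)


# Sample rows copied from the question, as (server, client, seq, ack)
rows = [
    ("A", "B", 207876062, 2372538506),
    ("A", "B", 207876089, 2372538616),
    ("B", "A", 2372538590, 207876089),
    ("A", "B", 207876062, 2372538590),
    ("B", "A", 2372538506, 207876062),
]
df = pd.DataFrame(rows, columns=["server", "client", "seq", "ack"])

result = interleave_by_server(df)
print(result)
```

Because the rank is an integer and the tie-breaker is a separate column, this version does not depend on fractional offsets staying below 1, which may matter when there are many servers.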
Good luck