Reordering columns in a pandas DataFrame according to an order defined in another DataFrame

Problem description

I have a DataFrame:

df_in = pd.DataFrame([[1,2,3,4,5,6,7,8,9]], columns=["ab","ef","cd","ij","klm","kln","ghw","ghx","klo"])

I have another DataFrame that defines the order:

df_order = pd.DataFrame([["ab","gh"],["cd","ij"],["ef","kl"]], columns=["col1","col2"])

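I want to rearrange the columns of df_in using df_order, as follows.

For each row of df_order, the column named in col1 comes first, followed by all columns whose names start with the string in col2. Then move on to the next row and repeat.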

Expected output:

df_out = pd.DataFrame([[1,7,8,3,4,2,5,6,9]], columns=["ab","ghw","ghx","cd","ij","ef","klm","kln","klo"])

How can I do this?

Tags: python, python-3.x, pandas, python-2.7, dataframe

Solution


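Here is one solution you can try. Note that it looks up each column by its first two characters (x[:2]), so it assumes every key in df_order is exactly two characters long, as in the example data: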

from itertools import chain

# Map each key in df_order (flattened row by row) to a numeric position to sort by later.
order_ = {
    v: idx for idx, v in enumerate(chain.from_iterable(df_order.to_numpy()))
}

df_in.loc[:, sorted(df_in.columns, key=lambda x: order_[x[:2]])]

   ab  ghw  ghx  cd  ij  ef  klm  kln  klo
0   1    7    8   3   4   2    5    6    9
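
If the keys in df_order are not all two characters long, the x[:2] lookup above no longer applies. The following is a minimal sketch of one possible alternative, not part of the original answer: it walks df_order row by row and matches col2 with str.startswith. The helper name reorder_columns is hypothetical, and the sketch assumes that any column matching no row of df_order should simply be dropped from the result.

```python
import pandas as pd

df_in = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8, 9]],
    columns=["ab", "ef", "cd", "ij", "klm", "kln", "ghw", "ghx", "klo"],
)
df_order = pd.DataFrame(
    [["ab", "gh"], ["cd", "ij"], ["ef", "kl"]], columns=["col1", "col2"]
)


def reorder_columns(df, order):
    """Hypothetical helper: order the columns of df row by row from `order`.

    For each row, the column named exactly in col1 comes first, then every
    column whose name starts with the string in col2 (in df's existing order).
    Columns that match nothing are left out (an assumption of this sketch).
    """
    ordered = []
    for exact, prefix in zip(order["col1"], order["col2"]):
        # Exact match on col1, skipped if missing or already placed.
        if exact in df.columns and exact not in ordered:
            ordered.append(exact)
        # Prefix match on col2, keeping the original relative order.
        matches = [
            c for c in df.columns if c.startswith(prefix) and c not in ordered
        ]
        ordered.extend(matches)
    return df[ordered]


df_out = reorder_columns(df_in, df_order)
print(list(df_out.columns))
# ['ab', 'ghw', 'ghx', 'cd', 'ij', 'ef', 'klm', 'kln', 'klo']
```

On the example data this yields the same column order as the expected output. It scans the columns once per row of df_order, which is fine for a handful of columns but slower than the dictionary lookup for very wide frames.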
