Computing the difference between columns that share a base name

Question

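I have a df that compares new and old data. Is there a way to compute the difference between the old and new values? In general I don't want to sort the DataFrame; I only want to compare the root variables whose column names carry the suffix "_old" or "_new".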

df
     apple_old      daily    banana_new    banana_tree   banana_old apple_new
0      5             3           4              2           10        6

for x in df.columns:
    if x.endswith("_old") and x.endswith("_new"):
        x = x.dif()

Expected output (the parentheses are only there for clarity):

df_diff
     apple_diff(old-new)         banana_diff(old-new)       
0      -1       (5-6)                      6   (10-4)              

Tags: python, python-3.x, pandas

Solution


Let's try creating a Multi-Index on the columns and then subtracting new from old.

Setup:

import pandas as pd

df = pd.DataFrame({'apple_old': {0: 5}, 'daily': {0: 3}, 'banana_new': {0: 4},
                   'banana_tree': {0: 2}, 'banana_old': {0: 10},
                   'apple_new': {0: 6}})

# Creation of Multi-Index:
df.columns = df.columns.str.rsplit('_', n=1, expand=True).swaplevel(0, 1)
# Subtract new from old (old - new):
output_df = (df['old'] - df['new']).add_suffix('_diff')
# Display:
print(output_df)
   apple_diff  banana_diff
0          -1            6
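Note that the assignment to df.columns above replaces the column index of df itself. If the original frame needs to stay untouched, a plain loop over the column names is one possible variation. The sketch below is not the approach used in this answer; diff_old_new and its parameters are hypothetical names chosen for illustration.

```python
import pandas as pd


def diff_old_new(frame, old="_old", new="_new"):
    """Hypothetical helper: old minus new for every base name that has both columns."""
    diffs = {}
    for col in frame.columns:
        if not col.endswith(old):
            continue
        base = col[: -len(old)]          # strip the "_old" suffix
        if base + new in frame.columns:  # skip bases without a "_new" partner
            diffs[base + "_diff"] = frame[col] - frame[base + new]
    return pd.DataFrame(diffs, index=frame.index)


df = pd.DataFrame({'apple_old': {0: 5}, 'daily': {0: 3}, 'banana_new': {0: 4},
                   'banana_tree': {0: 2}, 'banana_old': {0: 10},
                   'apple_new': {0: 6}})

result = diff_old_new(df)
print(result)
```

Columns such as daily and banana_tree are skipped because they do not end in _old, and df keeps its original column names.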

str.rsplit with a maximum of one split (n=1) builds the Multi-Index, so column names that contain more than one _ are handled safely:

df.columns = df.columns.str.rsplit('_', n=1, expand=True).swaplevel(0, 1)
    old   NaN    new   tree    old   new
  apple daily banana banana banana apple
0     5     3      4      2     10     6
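To see why n=1 matters, here is a small sketch using a made-up column name, green_apple_old, that contains two underscores. The name is an assumption for illustration and does not appear in the question's data.

```python
import pandas as pd

# Hypothetical column names; "green_apple" is a base name with its own underscore.
cols = pd.Index(["green_apple_old", "green_apple_new", "daily"])

# Splitting once from the right keeps the base name in one piece.
split_once = cols.str.rsplit("_", n=1, expand=True)
print(split_once.nlevels)   # 2 levels: base name and suffix
print(split_once[0])        # ('green_apple', 'old')

# Splitting on every underscore breaks the base name apart.
split_all = cols.str.split("_", expand=True)
print(split_all.nlevels)    # 3 levels
```

With the unrestricted split, the old/new marker no longer sits at a fixed level, so selecting by 'old' or 'new' afterwards would not work as intended.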

Then select each top-level key:

df['old']

   apple  banana
0      5      10

df['new']

   banana  apple
0       4      6

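The subtraction aligns on column labels, so it does not matter that the two selections list their columns in a different order. add_suffix then appends _diff to each column name.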
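The label alignment can be checked in isolation. This minimal sketch rebuilds the two selections by hand (the variable names old and new are only for illustration) and contrasts the label-aligned result with a purely positional subtraction on the underlying arrays.

```python
import pandas as pd

old = pd.DataFrame({"apple": [5], "banana": [10]})
new = pd.DataFrame({"banana": [4], "apple": [6]})  # same labels, different order

# DataFrame arithmetic matches columns by label, not by position.
diff = (old - new).add_suffix("_diff")
print(diff)

# Subtracting the raw arrays ignores the labels and pairs apple with banana.
positional = old.to_numpy() - new.to_numpy()
print(positional)
```

The label-aligned result gives apple_diff = 5 - 6 and banana_diff = 10 - 4, while the positional version computes 5 - 4 and 10 - 6, which is not what the question asks for.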

