首页 > 解决方案 > 比较两个数据帧,并生成输出

问题描述

我有一个名为“df_new”的df,如下所示......

在此处输入图像描述

我有另一个名为“df_old”的df,如下所示......

在此处输入图像描述

(1) 我想检查 df_new 中列出的股票是否在 df_old 中。

如果是,打印股票代码 (SELL) 并显示减去 df_new['Quantity'] - df_old['Quantity']。如果否,则打印股票代码 (BUY) 并显示 df_new['Quantity']。

(2) 我想检查 df_old 中列出的股票是否在 df_new 中。如果是,什么也不做 如果不是,打印股票代码 (SELL) 并显示 df_old['Quantity']。

很抱歉无法提供 df,在 colab 和剪贴板中工作不起作用。

标签: python-3.xpandas

解决方案


我可能不明白你问题的确切意图。正如@RichieV 评论的那样,我df.merge()用来加入并np.where()确定和添加列。

import pandas as pd
import numpy as np
import io

data = '''
Stock Quantity "Buy Price"
AWK 31.0 147.220001
BDX 16.0 281.339996
CDNS 42.0 109.250000
DHR 23.0 203.949997
EXPD 55.0 84.510002
LIN 19.0 245.309998
MAS 81.0 57.180000 
PCAR 54.0 85.000002
RMD 23.0 202.509995
TMO 11.0 413.959991
TSCO 32.0 142.740005
VRSK 24.0 188.710007
WST 17.0 268.809998
'''

df_old = pd.read_csv(io.StringIO(data), sep='\s+')

data2 = '''
Stock Quantity "Buy Price" "Sell Price"
WST 22.0 246.232273 268.809998
CHD 64.0 91.534375 96.330002
RMD 29.0 201.286897 202.509995
TMO 15.0 419.992000 413.959991
LIN 24.0 247.027083 245.309998
BDX 22.0 274.558636 281.339996
CDNS 56.0 106.053929 109.250000
DHR 30.0 199.803000 203.949997
TSCO 42.0 143.4083000 142.740005
AWK 41.0 146.890976 147.220001
'''

df_new = pd.read_csv(io.StringIO(data2), sep='\s+')

# df_new and df_old combine
df = df_new.merge(df_old, on='Stock', suffixes=('_new','_old'), how='outer')
df['Price'] = np.where(~(df['Quantity_new'].isnull())&~(df['Quantity_old'].isnull()), df['Sell Price'], df['Buy Price_new'])
df['diff'] = np.where(~(df['Quantity_new'].isnull())&~(df['Quantity_old'].isnull()), df['Quantity_new'] - df['Quantity_old'], df['Quantity_new'])

# update

df['diff'].loc[df['diff'].isnull()] = df['Quantity_old'].loc[df['Quantity_new'].isnull()]
df = df[['Stock','Quantity_new','Quantity_old', 'diff']]

|    | Stock   |   Quantity_new |   Quantity_old |   diff |
|---:|:--------|---------------:|---------------:|-------:|
|  0 | WST     |             22 |             17 |      5 |
|  1 | CHD     |             64 |            nan |     64 |
|  2 | RMD     |             29 |             23 |      6 |
|  3 | TMO     |             15 |             11 |      4 |
|  4 | LIN     |             24 |             19 |      5 |
|  5 | BDX     |             22 |             16 |      6 |
|  6 | CDNS    |             56 |             42 |     14 |
|  7 | DHR     |             30 |             23 |      7 |
|  8 | TSCO    |             42 |             32 |     10 |
|  9 | AWK     |             41 |             31 |     10 |
| 10 | EXPD    |            nan |             55 |     55 |
| 11 | MAS     |            nan |             81 |     81 |
| 12 | PCAR    |            nan |             54 |     54 |
| 13 | VRSK    |            nan |             24 |     24 |

推荐阅读