How to efficiently map a new variable in pandas

Problem description

Here is my data:

Id  Amount
1   6
2   2
3   0
4   6

What I need is a mapping: if Amount is more than 3, Map is 1; if Amount is less than 3, Map is 0.

Id  Amount   Map
1   6        1
2   2        0
3   0        0
4   6        1

What I did

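a = df[['Id', 'Amount']]
# keep only the rows that should be flagged; copy to avoid SettingWithCopyWarning
a = a[a['Amount'] >= 3].copy()
a['Map'] = 1
a = a[['Id', 'Map']]
df = df.merge(a, on='Id', how='left')
# rows without a match get NaN in 'Map'; fill them with 0 and assign the result back
df['Map'] = df['Map'].fillna(0).astype(int)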

It works, but it is not very configurable and not efficient.

Tags: python, pandas, dataframe

Solution

Convert the boolean mask to integers:

# for better performance, compare on the underlying NumPy array
df['Map'] = (df['Amount'].values >= 3).astype(int)
# pure pandas solution
df['Map'] = (df['Amount'] >= 3).astype(int)
print(df)
   Id  Amount  Map
0   1       6    1
1   2       2    0
2   3       0    0
3   4       6    1
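
Since the question asks for something more configurable, the same mask-to-integer idea can be wrapped in a small helper. This is only a sketch: the function name `add_flag` and its parameters `source`, `target`, and `threshold` are hypothetical and not part of pandas or of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Id": [1, 2, 3, 4], "Amount": [6, 2, 0, 6]})


def add_flag(frame, source="Amount", target="Map", threshold=3):
    """Return a copy of `frame` with a 0/1 column marking `source` >= `threshold`.

    Hypothetical helper for illustration; the names are assumptions.
    """
    out = frame.copy()
    # The comparison yields a boolean Series; astype(int) maps True/False to 1/0
    out[target] = (out[source] >= threshold).astype(int)
    return out


result = add_flag(df)
print(result)
```

Changing the cutoff or the column names then only requires different arguments, for example `add_flag(df, threshold=5)`, and the input frame is left unmodified.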

Performance

#[400000 rows x 3 columns]
df = pd.concat([df] * 100000, ignore_index=True)

In [133]: %timeit df['Map'] = (df['Amount'].values >= 3).astype(int)
2.44 ms ± 97.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [134]: %timeit df['Map'] = (df['Amount'] >= 3).astype(int)
2.6 ms ± 66.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
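
Outside IPython, a comparable measurement can be sketched with the standard-library `timeit` module. The names `base`, `big`, `via_numpy`, and `via_pandas` are assumptions for illustration. Unlike the `%timeit` runs above, this version times only the mask computation and not the column assignment, and the numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Rebuild the 400000-row frame from the four sample rows
base = pd.DataFrame({"Id": [1, 2, 3, 4], "Amount": [6, 2, 0, 6]})
big = pd.concat([base] * 100000, ignore_index=True)


def via_numpy():
    # Compare on the underlying NumPy array
    return (big["Amount"].values >= 3).astype(int)


def via_pandas():
    # Compare on the pandas Series
    return (big["Amount"] >= 3).astype(int)


# Both approaches should produce the same 0/1 values
assert np.array_equal(via_numpy(), via_pandas().to_numpy())

runs = 20
numpy_s = timeit.timeit(via_numpy, number=runs) / runs
pandas_s = timeit.timeit(via_pandas, number=runs) / runs
print(f"numpy:  {numpy_s * 1e3:.2f} ms per call")
print(f"pandas: {pandas_s * 1e3:.2f} ms per call")
```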
