Pandas: filter and create new columns

Problem description

I have a pandas DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame(['Air type:1', 'Space kind:2', 'water', np.nan], columns=['A'])

      A
0   Air type:1
1   Space kind:2
2   water
3   NaN

I want to split the entries in A that contain ":" into two new columns. So I tried combining the split with a .loc filter:

df.loc[(df.A.str.contains(':')) & (~df.A.isnull()), ['B', 'C']] = df.A.str.split(':', expand = True)

The result is not very promising:

     A            B       C
0   Air type:1   NaN    NaN
1   Space kind:2 NaN    NaN
2   water        NaN    NaN
3   NaN          NaN    NaN

Without the filter, it works:

df[['B', 'C']] = df.A.str.split(':', expand = True)

           A           B        C
0   Air type:1      Air type    1
1   Space kind:2    Space kind  2
2   water             water    None
3   NaN                NaN     NaN

The problem is that the water entry is wrongly assigned to the new columns, and I have to fix it manually afterwards.

Why does .loc plus assignment not work?

Ideally, I would like to get:

           A           B        C
0   Air type:1      Air type    1
1   Space kind:2    Space kind  2
2   water              NaN     NaN
3   NaN                NaN     NaN

Tags: python, pandas

Solution


Try masking the split result with a condition via DataFrame.where:

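# True where A contains ":"; na=False treats the missing value as "no match"
c = df['A'].str.contains(":", na=False)
# Alternative: c = df['A'].str.count(":").ge(1)
# Keep the split pieces only on rows where c is True; other rows become NaN
df[['B', 'C']] = df['A'].str.split(":", expand=True).where(c)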

print(df)
              A           B    C
0    Air type:1    Air type    1
1  Space kind:2  Space kind    2
2         water         NaN  NaN
3           NaN         NaN  NaN
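
On the question of why the .loc assignment produced only NaN: as far as I can tell, when the right-hand side of a .loc assignment is a DataFrame, pandas aligns it with the target on both the index and the column labels. The result of str.split(..., expand=True) has columns labelled 0 and 1, not B and C, so nothing lines up and NaN is written everywhere. A minimal sketch of an alternative that relies on that alignment instead of fighting it is below. It assumes the same df as in the question; the names mask, raw, parts and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(['Air type:1', 'Space kind:2', 'water', np.nan], columns=['A'])

# The split result is labelled 0 and 1, which is why assigning it
# to ['B', 'C'] through .loc finds no matching columns.
raw = df['A'].str.split(':', expand=True)
print(list(raw.columns))

# Rows that actually contain ":" (missing values count as no match).
mask = df['A'].str.contains(':', na=False)

# Split only the matching rows, then give the pieces the target names.
parts = df.loc[mask, 'A'].str.split(':', expand=True)
parts.columns = ['B', 'C']

# join aligns on the index, so rows absent from parts get NaN in B and C.
result = df.join(parts)
print(result)
```

Because the join is a left join on the index, the water row and the missing row are kept and simply receive NaN in both new columns, which is the layout asked for in the question.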
