Assigning multiple columns with .loc - Pandas

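Problem description

I can't figure out how to assign multiple columns at once using ".loc".
I would like to do it in a single line.

Example

Input DataFrame: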

    NAME   AGE  NEW_AGE COUNTRY NEW_COUNTRY     _merge
0  LUCAS  80.0      NaN  BRAZIL         NaN  left_only
1  STEVE   NaN     35.0     NaN         USA       both
2    BEN   NaN     25.0              CANADA       both

Desired output DataFrame:

    NAME   AGE  NEW_AGE COUNTRY NEW_COUNTRY     _merge
0  LUCAS  80.0      NaN  BRAZIL         NaN  left_only
1  STEVE  35.0     35.0     USA         USA       both
2    BEN  25.0     25.0  CANADA      CANADA       both

Code

import numpy as np
import pandas as pd

# pd.np was deprecated and later removed from pandas; use numpy directly.
people = pd.DataFrame(
    {'NAME': ['LUCAS', 'STEVE', 'BEN'],
     'AGE': [80, np.nan, np.nan],
     'NEW_AGE': [np.nan, 35, 25],
     'COUNTRY': ['BRAZIL', np.nan, ''],
     'NEW_COUNTRY': [np.nan, 'USA', 'CANADA'],
     '_merge': ['left_only', 'both', 'both']
     })


people.loc[people['_merge'] == 'both', 'AGE'] = people['NEW_AGE']
people.loc[people['_merge'] == 'both', 'COUNTRY'] = people['NEW_COUNTRY']

I tried the following, but it failed.

# Using a single assignment doesn't work
people.loc[people['_merge'] == 'both', ['AGE', 'COUNTRY']] = \
 people[['NEW_AGE', 'NEW_COUNTRY']]

# Using to_numpy because of http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
people.loc[people['_merge'] == 'both', ['AGE', 'COUNTRY']] = \
 people[['NEW_AGE', 'NEW_COUNTRY']].to_numpy()

Does anyone know how to assign multiple columns with a single command?

pandas: 0.24.1

Thanks.

Tags: python, python-3.x, pandas, numpy

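Solution

Use rename with a lambda function so that the source columns get the same names as the columns being assigned to; .loc aligns a DataFrame on the right-hand side by index and column labels: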

f = lambda x: x.replace('NEW_','')
df = people[['NEW_AGE', 'NEW_COUNTRY']].rename(columns=f)
people.loc[people['_merge'] == 'both', ['AGE', 'COUNTRY']] = df
print(people)
    NAME   AGE  NEW_AGE COUNTRY NEW_COUNTRY     _merge
0  LUCAS  80.0      NaN  BRAZIL         NaN  left_only
1  STEVE  35.0     35.0     USA         USA       both
2    BEN  25.0     25.0  CANADA      CANADA       both
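
For reference, here is a minimal self-contained sketch of the same idea. It uses an explicit mapping instead of the lambda, which may be safer if a column name contains 'NEW_' somewhere other than at the start. The names column_map, mask and source are hypothetical and introduced only for illustration, and the sketch assumes a pandas version where numpy is imported directly rather than through pd.np.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame({
    'NAME': ['LUCAS', 'STEVE', 'BEN'],
    'AGE': [80, np.nan, np.nan],
    'NEW_AGE': [np.nan, 35, 25],
    'COUNTRY': ['BRAZIL', np.nan, ''],
    'NEW_COUNTRY': [np.nan, 'USA', 'CANADA'],
    '_merge': ['left_only', 'both', 'both'],
})

# Hypothetical mapping: each source column -> the target column it overwrites.
column_map = {'NEW_AGE': 'AGE', 'NEW_COUNTRY': 'COUNTRY'}

# Only rows that came from both sides of the merge are updated.
mask = people['_merge'] == 'both'

# Relabel the source columns so their names match the target columns.
source = people[list(column_map)].rename(columns=column_map)

# Single assignment: rows are matched on the index, columns on their labels.
people.loc[mask, list(column_map.values())] = source

print(people[['NAME', 'AGE', 'COUNTRY']])
```

Because the right-hand side is matched on the index, it does not need to be filtered with the mask first; rows outside the mask are simply left untouched.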
