Fill a column with values from other rows in pandas

Question

Can someone help me understand this?

Suppose we have this DataFrame:

import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

Now I want to put each parent's name next to the child's name, like this:

+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | Jane        |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+

So I did this:

df['parent_name'] = None
df['parent_name'] = df['parent_id'].apply(lambda x: df['name'][df['id']==x])

But this is what I got:

+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | NaN         |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+

So it seems to handle only the first item in the name column.
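A quick way to see what is going on is to look at what the lambda returns for a single parent_id. This is a minimal sketch that rebuilds the DataFrame from above; the names picked and no_parent are made up for illustration. It shows that the lambda hands back a whole Series (which keeps its original row label) rather than a single string:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e"],
    "parent_id": [None, None, "a", "b", "a"],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# What the lambda returns for parent_id == "b": a Series, not a string.
picked = df["name"][df["id"] == "b"]
print(type(picked).__name__)   # Series
print(picked.index.tolist())   # [1]  (Jane keeps her original row label)

# For a missing parent the boolean mask is all False, so the result is empty.
x = None
no_parent = df["name"][df["id"] == x]
print(len(no_parent))          # 0
```

Because Bob sits under label 0 and Jane under label 1, the pieces that apply collects probably do not line up in a single column, which would explain why only the rows pointing at the first parent get a name. Treat that as a likely explanation rather than a guarantee; the exact behavior of the assignment may differ between pandas versions.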

To quote Socrates, in the words of Plato: "WTF???"

Tags: python, pandas, dataframe

Solution


We can try mapping each parent_id to the corresponding name, using id as the common key, and store the result in parent_name:

df['parent_name'] = df['parent_id'].map(df.set_index('id')['name'])

  id parent_id   name parent_name
0  a      None    Bob         NaN
1  b      None   Jane         NaN
2  c         a   John         Bob
3  d         b  Patty        Jane
4  e         a    Sam         Bob
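To make the one-liner easier to follow, here is the same idea split into two steps, as a self-contained sketch. The intermediate name id_to_name is an assumption introduced for illustration; it is simply the Series produced by set_index('id')['name']:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e"],
    "parent_id": [None, None, "a", "b", "a"],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# Step 1: build a lookup Series whose index is id and whose values are name.
id_to_name = df.set_index("id")["name"]

# Step 2: look every parent_id up in that index.
# Values with no match (here the None entries) come back as NaN.
df["parent_name"] = df["parent_id"].map(id_to_name)

print(df)
```

Unlike the apply attempt, each lookup here yields a single value per row, so the result lines up with the original rows. This sketch assumes the id values are unique; with duplicate ids the lookup would be ambiguous.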
