Flagging Pandas rows based on the same row and other rows

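Problem description

I want to add a new column that flags the rows of a DataFrame. The flag should follow this logic:

  1. Rows with the same ID belong together and should get the same flag.
  2. The flag is built from four parts, in different combinations: "receive", then "float" or "fixed", then ", pay", then "float" or "fixed".

I think an example makes this clearer. This is the original DataFrame: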

import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

This should be the result after applying the logic above and creating a new column that flags every row:

df["Flag"] = ["Receive fix, pay float", "Receive fix, pay float", "Receive fix, pay fix","Receive fix, pay fix","Receive float, pay float","Receive float, pay float"]

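So my main question is how to iterate over the DataFrame to find the two rows that share an ID, and then use the information from both rows to build the same flag for each of them. Any ideas are much appreciated.

I don't know whether this is heading in the right direction, but here is my attempt. The open problem is still how to get the data from the second row with the same ID.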

df["Flag"] = "???"
for index, row in df.iterrows():
    if row["Leg"] == "receive":
        df.at[index, "Flag"] = row["Leg"] + " " + row["Structure"] + ", pay ?"

Tags: python-3.x, pandas, dataframe

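Solution

First sort by both columns with DataFrame.sort_values, then create a new column, and finally join that column per ID with GroupBy.transform. Sorting Leg in descending order places "receive" before "pay" within each ID, which gives the joined string the required order: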

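# Sort by ID, then by Leg descending so "receive" precedes "pay" within each ID
df = df.sort_values(['ID','Leg'], ascending=[True, False])
# Helper column: "<leg> <structure>" for every row
df['new'] = df["Leg"] + " " + df["Structure"]
# Join the helper strings of each ID and broadcast the result to all rows of that ID
df["Flag"] = df.groupby('ID')['new'].transform(', '.join)
print(df)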
   ID Structure      Leg                      Flag            new
0   2       fix  receive    receive fix, pay float    receive fix
1   2     float      pay    receive fix, pay float      pay float
2   3       fix  receive      receive fix, pay fix    receive fix
3   3       fix      pay      receive fix, pay fix        pay fix
5   7     float  receive  receive float, pay float  receive float
4   7     float      pay  receive float, pay float      pay float
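The expected output in the question starts each flag with a capital letter ("Receive fix, pay float"), while the joined strings above start with a lowercase "receive". A minimal sketch, assuming the DataFrame from the question, that closes this gap with Series.str.capitalize and drops the helper column afterwards (dropping `new` is an addition here, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

# Within each ID, put "receive" before "pay" (descending order on Leg).
df = df.sort_values(['ID', 'Leg'], ascending=[True, False])

# Helper column with one "<leg> <structure>" string per row.
df['new'] = df["Leg"] + " " + df["Structure"]

# Join per ID, then upper-case only the first character of the result.
df["Flag"] = df.groupby('ID')['new'].transform(', '.join).str.capitalize()

# The helper column is no longer needed once the flag exists.
df = df.drop(columns='new')
print(df)
```

Note that str.capitalize also lower-cases every character after the first, which is harmless here because all inputs are already lowercase.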

The same solution with a helper Series, which avoids adding an extra column to the DataFrame:

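# Sort by ID, then by Leg descending so "receive" precedes "pay" within each ID
df = df.sort_values(['ID','Leg'], ascending=[True, False])
# Helper Series (not stored in df): "<leg> <structure>" for every row
s = df["Leg"] + " " + df["Structure"]
# Group the helper Series by ID, join each group, and broadcast back to the rows
df["Flag"] = s.groupby(df['ID']).transform(', '.join)
print(df)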
   ID Structure      Leg                      Flag
0   2       fix  receive    receive fix, pay float
1   2     float      pay    receive fix, pay float
2   3       fix  receive      receive fix, pay fix
3   3       fix      pay      receive fix, pay fix
5   7     float  receive  receive float, pay float
4   7     float      pay  receive float, pay float
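Both variants above rely on sorting to get "receive" in front of "pay". As an alternative sketch, not part of the original answer, the flag can also be built without reordering the rows by pivoting the legs into columns and mapping the result back by ID. This assumes every ID has exactly one "receive" row and one "pay" row (DataFrame.pivot raises an error on duplicates); the names `legs` and `flags` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

# One row per ID, one column per leg, holding that leg's Structure.
# Assumption: exactly one "receive" and one "pay" row per ID.
legs = df.pivot(index="ID", columns="Leg", values="Structure")

# One flag string per ID, built in a fixed "receive ..., pay ..." order.
flags = "Receive " + legs["receive"] + ", pay " + legs["pay"]

# Look up the flag of each row's ID; the original row order is untouched.
df["Flag"] = df["ID"].map(flags)
print(df)
```

Because the order of the parts is fixed when the string is assembled, this version does not depend on how the rows happen to be sorted.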
