
Problem description

I want to add a calculated column whose value depends on the values in each row.

My data looks like this:

df.head(5)

  CountryCode question_code             answer  percentage
0     Austria          b1_a    Very widespread           8
1     Austria          b1_a  Fairly widespread          34
2     Austria          b1_a        Fairly rare          45
3     Austria          b1_a          Very rare           9
4     Austria          b1_a         Don`t know           4

I tried:

def scoring(df):
    df.answer == 'Very widespread':
        return df.percentage*(-2)
    df.answer == 'Fairly widespread':
        return df.percentage*(-1)
    df.answer == 'Fairly rare':
        return df.percentage
    df.answer == 'Very rare':
        return df.percentage*2
    df.answer == 'Don`t know':
        return 0

which produces:

  File "", line 3
    df.answer == 'Very widespread':
                                  ^
SyntaxError: invalid syntax

Any help would be much appreciated.

Tags: python, pandas

Solution

It looks like you forgot to write the ifs and elifs:

def scoring(row):
    if row.answer == "Very widespread":
        return row.percentage*(-2)
    elif row.answer == "Fairly widespread":
        return row.percentage*(-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage*2
    elif row.answer == "Don't know":
        return 0

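Here df has been renamed to row, because apply with axis=1 passes each row to the function rather than the whole frame. With that in place you can do: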

df["scores"] = df.apply(scoring, axis=1)

to get

>>> df

  CountryCode question_code             answer  percentage  scores
0     Austria          b1_a    Very widespread           8     -16
1     Austria          b1_a  Fairly widespread          34     -34
2     Austria          b1_a        Fairly rare          45      45
3     Austria          b1_a          Very rare           9      18
4     Austria          b1_a         Don't know           4       0
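
For anyone who wants to run the row-wise version end to end, here is a minimal sketch. The frame is rebuilt by hand from the five rows shown in the question, and the last answer is assumed to be spelled "Don't know" with a plain apostrophe, as in the output above.

```python
import pandas as pd

# Sample frame rebuilt by hand from the five rows in the question
# (assumption: the last answer uses a plain apostrophe).
df = pd.DataFrame({
    "CountryCode": ["Austria"] * 5,
    "question_code": ["b1_a"] * 5,
    "answer": ["Very widespread", "Fairly widespread", "Fairly rare",
               "Very rare", "Don't know"],
    "percentage": [8, 34, 45, 9, 4],
})


def scoring(row):
    # row is a single row of df, passed in by apply(..., axis=1)
    if row.answer == "Very widespread":
        return row.percentage * (-2)
    elif row.answer == "Fairly widespread":
        return row.percentage * (-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage * 2
    elif row.answer == "Don't know":
        return 0


df["scores"] = df.apply(scoring, axis=1)
print(df)
```

Note that any answer outside the five branches falls through the chain, so scoring returns None for that row.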

But better still, we can build a mapping of multipliers up front and map the answer column through it:

mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}

After mapping the answers with this, the result can be multiplied by the percentages:

df["scores"] = df.answer.map(mapping).mul(df.percentage)

This gives the same result as above.
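
The mapping approach can be sketched as a self-contained script too. The frame is again rebuilt by hand from the question's five rows, and "No opinion" is a hypothetical label added only to show what happens to an answer that is missing from the mapping.

```python
import pandas as pd

# Sample frame rebuilt by hand from the five rows in the question.
df = pd.DataFrame({
    "CountryCode": ["Austria"] * 5,
    "question_code": ["b1_a"] * 5,
    "answer": ["Very widespread", "Fairly widespread", "Fairly rare",
               "Very rare", "Don't know"],
    "percentage": [8, 34, 45, 9, 4],
})

# One multiplier per answer label
mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}

# Look up each answer's multiplier, then multiply element-wise
multipliers = df["answer"].map(mapping)
df["scores"] = multipliers.mul(df["percentage"])
print(df)

# A label absent from the mapping ("No opinion" is hypothetical)
# maps to NaN rather than raising an error.
unknown = pd.Series(["Very widespread", "No opinion"]).map(mapping)
print(unknown)
```

Because the lookup and the multiplication operate on whole columns, this avoids calling a Python function once per row. If unmapped answers should score 0 instead of NaN, one option is to call fillna(0) on the mapped column before multiplying.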

