How to iterate an arbitrary function using vectorisation

Problem description

I would like to iterate a given function over a pandas DataFrame without using a for loop, i.e. using vectorisation.

I have already written a for loop for this function, but I would like something more efficient.

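import numpy as np
import pandas as pd

def f(x, y, operation):
    # 'plus' matches the operation labels used in the sample data below
    if operation == 'plus':
        return x + y
    elif operation == 'power':
        return x ** y
    else:
        raise ValueError('operation can only be power or plus')

df = pd.DataFrame({
    'first_entry': [1, np.nan, np.nan, np.nan, np.nan],
    'operation': [np.nan, 'plus', 'power', 'plus', 'plus'],
    'operand': [np.nan, 3, 2, 4, 1],
})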
first_entry operation operand       expected_result
1           NA        NA            1
NA          plus      3             4 (= 1+3)
NA          power     2             16 (=4**2)
NA          plus      4             20 (=16+4)
NA          plus      1             21 (=20+1)

I want to return pd.Series([1, 4, 16, 20, 21]), i.e. iterate f over the dataframe, feeding each row's result into the next row.

Alternative question: Suppose now

def g(x,y,operation):
    if operation=='relative':
        return x*(1+y)
    elif operation=='absolute':
        return x+y
    else:
        raise ValueError('operation can only be relative or absolute')

Can I write a function with list comprehension to give the expected result?

first_entry operation operand       expected_result
1           NA        NA            1
NA          relative  3             4 (= 1*(3+1))
NA          absolute  2             6 (= 4+2)
NA          relative  4             30 (= 6*(4+1))
NA          absolute  1             31 (= 30+1)

Tags: python, pandas, vectorization

Solution


I don't directly see the relationship between a, b and c, but could you use the pandas apply function (apply / applymap)?

At a very high level, something like this:

def f(row):
    if row["type"] == "add":
        return row["a"] + row["b"]
    elif row["type"] == "power":
        return row["a"] ** row["b"]

df["res"] = df.apply(f, axis=1)

This assumes your columns are named "a", "b" and "type".
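
Note that a row-wise apply treats every row independently, whereas the expected result in the question feeds each row's output into the next row. A minimal sketch of that running calculation follows, using the question's own column names and itertools.accumulate from the standard library (the initial argument needs Python 3.8+). The helper name iterate is hypothetical, and f is assumed to use the labels 'plus' and 'power' as in the sample data. This is not true vectorisation, since each step depends on the previous result; it only moves the loop out of user code.

```python
from itertools import accumulate

import numpy as np
import pandas as pd


def f(x, y, operation):
    # Assumption: labels match the sample data ('plus' / 'power').
    if operation == 'plus':
        return x + y
    elif operation == 'power':
        return x ** y
    raise ValueError("operation can only be 'plus' or 'power'")


df = pd.DataFrame({
    'first_entry': [1, np.nan, np.nan, np.nan, np.nan],
    'operation': [np.nan, 'plus', 'power', 'plus', 'plus'],
    'operand': [np.nan, 3, 2, 4, 1],
})


def iterate(func, frame):
    """Hypothetical helper: fold func over the rows, keeping every intermediate value."""
    # The first row only supplies the starting value.
    start = frame['first_entry'].iloc[0]
    # Each remaining row contributes an (operation, operand) pair.
    steps = zip(frame['operation'].iloc[1:], frame['operand'].iloc[1:])
    running = accumulate(
        steps,
        lambda acc, step: func(acc, step[1], step[0]),
        initial=start,
    )
    return pd.Series(list(running), index=frame.index)


df['result'] = iterate(f, df)
print(df['result'].tolist())
```

The same helper should work for the second function in the question: pass g instead of f, with a frame whose operation column holds 'relative' and 'absolute'. The values come back as floats because the NaN entries force the numeric columns to a float dtype.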

