How to iterate over rows using vectorization

Problem description

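This is my code for creating a new column named "Re" in my DataFrame list12, using a function I defined called findindex. Its argument B is the index of every row that satisfies the condition (list12['FPI']==6) & (list12['Firm Name']==x[i]). The head rows of file and file3 are shown below.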

      CUSIP  Firm Name  Date Record  FPI  forecasting period(days)  Median
0  00846U10          A   1999-12-16    0                      1825   20.50
1  00846U10          A   1999-12-16    6                        46    0.23
2  00846U10          A   1999-12-16    7                       136    0.30
3  00846U10          A   1999-12-16    8                       228    0.29
4  00846U10          A   1999-12-16    1                       320    1.11

Global Company Key  Data Date   Fiscal Year Fiscal Quarter  Ticker Symbol   CUSIP   Common Shares Outstanding   Earnings Per Share (Basic) - Including Extraordinary Items  Earnings Per Share (Basic) - Excluding Extraordinary Items  Earnings Per Share (Basic) - Including Extraordinary Items.1
156794  126554  1998/01/31  1998    1   A   00846U101   NaN 0.42    0.42    0.42
156795  126554  1998/04/30  1998    2   A   00846U101   NaN 0.24    0.24    0.67
156796  126554  1998/07/31  1998    3   A   00846U101   NaN 0.14    0.14    0.81
156797  126554  1998/10/31  1998    4   A   00846U101   NaN -0.13   -0.13   0.68
156798  126554  1999/01/31  1999    1   A   00846U101   NaN 0.19    0.19    0.19

import pandas as pd

# file and file3 are the two DataFrames whose head rows are shown above
x = ['A','AA','AABA','AAL','AAP','AAPL','ABBV','ABC','ABMD','ABT']
list12 = file.loc[file['Firm Name'].isin(x)]
def findindex(list12,A,B):
    diff = pd.to_datetime(file3[file3['Ticker Symbol']==A]['Data Date']) - pd.to_datetime(list12.loc[B,'Date Record'])
    if len(diff[(diff < pd.to_timedelta(0))])>0:
        indexmax = (diff[(diff < pd.to_timedelta(0))].idxmax())
        #list11.loc[index,'N'] = file3.loc[indexmax,'Data Date']
        list12.loc[B,'Re']= list12.loc[B,'Median']+file3.loc[indexmax-1,'Earnings Per Share (Basic) - Including Extraordinary Items']+file3.loc[indexmax-2,'Earnings Per Share (Basic) - Including Extraordinary Items']+file3.loc[indexmax-3,'Earnings Per Share (Basic) - Including Extraordinary Items']
    return list12.loc[B,'Re']

list12['Re']=""
for i in range(len(x)):
    list12.loc[(list12['FPI']==6) & (list12['Firm Name']==x[i]), 'Re']=findindex(list12,x[i],(list12['FPI']==6) & (list12['Firm Name']==x[i]))
list12    

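I am trying to vectorize the code and avoid the for loop. The problem is that when I run the code, it returns an empty "Re" column. I assume I am not iterating over the rows correctly, but I don't know how to fix it. Any help would be appreciated.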
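
One likely cause of the empty column, sketched here on hypothetical toy data rather than the real file and file3: findindex receives a boolean mask as B, so list12.loc[B,'Date Record'] is a Series, not a single date. Subtracting two Series aligns them on index labels. When the labels do not overlap, every difference is NaT, the "diff < 0" filter selects nothing, and the function returns the untouched empty strings.

```python
import pandas as pd

# Hypothetical toy data: the two Series have disjoint index labels,
# like rows taken from two different DataFrames.
data_dates = pd.to_datetime(
    pd.Series(["1998/01/31", "1998/04/30"], index=[156794, 156795])
)
record_dates = pd.to_datetime(
    pd.Series(["1999-12-16", "1999-12-16"], index=[1, 7])
)

# Series - Series aligns on index labels, not on position,
# so every label in the union of the two indexes ends up NaT.
diff = data_dates - record_dates
n_negative = int((diff < pd.Timedelta(0)).sum())  # comparisons with NaT are False

# Subtracting a single Timestamp broadcasts it, so no alignment happens.
diff_scalar = data_dates - record_dates.iloc[0]

print(diff)
print(n_negative)
print(diff_scalar)
```

If this is what happens in the real data, the if branch in findindex never runs, which would match the empty "Re" column.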

Tags: pandas

Solution
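
A possible vectorized approach, offered as a sketch under stated assumptions rather than a confirmed answer: use pd.merge_asof to match each FPI == 6 row to the latest file3 row for the same ticker dated strictly before 'Date Record', then add the EPS of the three quarters preceding that row. The function name add_re, the helper columns prev3 and orig_index, and the toy frames are hypothetical. The sketch assumes 'Firm Name' holds the same values as 'Ticker Symbol' (as the question's code does), that the index of the forecast frame is unique, and that neither date column has missing values. It takes the three preceding rows within each ticker, whereas the original code used index labels indexmax-1 to indexmax-3 regardless of ticker.

```python
import pandas as pd

EPS = "Earnings Per Share (Basic) - Including Extraordinary Items"

def add_re(forecasts: pd.DataFrame, actuals: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `forecasts` with 'Re' filled for rows where FPI == 6."""
    # Right side: one row per reported quarter, with parsed dates.
    right = actuals[["Ticker Symbol", "Data Date", EPS]].copy()
    right["Data Date"] = pd.to_datetime(right["Data Date"])
    right = right.sort_values(["Ticker Symbol", "Data Date"])
    # Sum of EPS over the three quarters before each row, per ticker.
    g = right.groupby("Ticker Symbol")[EPS]
    right["prev3"] = g.shift(1) + g.shift(2) + g.shift(3)
    right = right.sort_values("Data Date")  # merge_asof needs sorted keys

    # Left side: only the FPI == 6 rows, remembering their index labels.
    cols = ["Firm Name", "Date Record", "Median"]
    left = forecasts.loc[forecasts["FPI"] == 6, cols].copy()
    left["Date Record"] = pd.to_datetime(left["Date Record"])
    left = left.rename_axis("orig_index").reset_index()
    left = left.sort_values("Date Record")

    # For each forecast, take the latest quarter strictly before its date.
    matched = pd.merge_asof(
        left, right,
        left_on="Date Record", right_on="Data Date",
        left_by="Firm Name", right_by="Ticker Symbol",
        direction="backward", allow_exact_matches=False,
    )

    out = forecasts.copy()
    out["Re"] = float("nan")
    out.loc[matched["orig_index"].to_numpy(), "Re"] = (
        matched["Median"] + matched["prev3"]
    ).to_numpy()
    return out

# Toy frames built from the head rows shown in the question.
forecasts = pd.DataFrame({
    "Firm Name": ["A"] * 5,
    "Date Record": ["1999-12-16"] * 5,
    "FPI": [0, 6, 7, 8, 1],
    "Median": [20.50, 0.23, 0.30, 0.29, 1.11],
})
actuals = pd.DataFrame({
    "Ticker Symbol": ["A"] * 5,
    "Data Date": ["1998/01/31", "1998/04/30", "1998/07/31",
                  "1998/10/31", "1999/01/31"],
    EPS: [0.42, 0.24, 0.14, -0.13, 0.19],
}, index=range(156794, 156799))

result = add_re(forecasts, actuals)
print(result)
```

On the toy rows, the FPI == 6 forecast matches the 1999/01/31 quarter, so its "Re" should be 0.23 + (-0.13 + 0.14 + 0.24), about 0.48. All other rows stay NaN. Working on a copy also avoids the SettingWithCopyWarning that can occur when assigning into list12, which is a slice of file.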

