首页 > 解决方案 > 如何对列表列表重复一系列过程和计算?

问题描述

我想计算 20 个季度的公司列表的财务指数。每个季度,公司名单都有点不同,所以我有 20 个名单要处理。但是,我不太确定是否需要编写一个循环或函数来迭代所有 20 个列表的过程?我以前从未写过任何循环。我将在此示例流程中使用 2020-Q1 数据,因此对于每个季度,流程都应调整为该季度。

我目前对一个列表的代码如下:

> set API key to extract data from financial API

sf.set_api_key('poepcJpfyS4oMRuSEEytWkErpDmdsa0C')

> get company lists of 20 quarter

SPlist = pd.read_csv('Company_List.csv')

SPlist 如下所示,我已将每一列转换为一个列表,因此我得到了所有季度的 20 个列表。


    3/31/20   12/31/19   9/30/19   6/30/19   3/31/19   12/31/18
0   A         A          A         A         A          A
1   AAL       AAL        AAL       AAL       AAL        AAL 
2   AAP       AAP        AAP       AAP       AAP        AAP 
3   AAPL      AAPL       AAPL      AAPL      AAPL       AAPL    
4   ABBV      ABBV       ABBV      ABBV      ABBV       ABBV
... ...       ...        ...       ...       ...        ... 
500 YUM       YUM        XYL       XYL       XYL        XYL
501 ZBH       ZBH        YUM       YUM       YUM        YUM
502 ZBRA      ZBRA       ZBH       ZBH       ZBH        ZBH
503 ZION      ZION       ZION      ZION      ZION       ZION
504 ZTS       ZTS        ZTS       ZTS       ZTS        ZTS

您可以使用以下代码找到示例列表,它会为您提供最近一个季度的公司列表。

table=pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
tickers = df.Symbol.to_list()

我为这个单一列表运行以下过程:

> Load data from API

sf.set_data_dir('~/simfin_data/')

df = sf.load_income(variant='quarterly', market='us')
dfb = sf.load_balance(variant='quarterly-full', market='us')
dfp = sf.load_shareprices(variant='daily', market='us')

> Only show companies within the ticker list

df_income_qtr = df.loc[tickers]
df_balance_qtr = dfb.loc[tickers]
df_price = dfp.loc[tickers]

> Add NOPAT and Invested Capital in order to get ROIC

df_income_qtr['NOPAT'] = df_income_qtr['Operating Income (Loss)'] - df_income_qtr['Income Tax (Expense) Benefit, Net']
df_balance_qtr['Invested Capital'] = df_balance_qtr['Total Noncurrent Liabilities'] + df_balance_qtr['Total Equity']

> Calculate ROIC

income = df_income_qtr[['Fiscal Year','Publish Date','NOPAT']]
balance = df_balance_qtr[['Invested Capital']]
income2=income.reset_index()
balance2 = balance.reset_index()
combine = pd.merge(income,balance,left_index=True, right_index=True)
combine['ROIC'] = combine['NOPAT']/combine['Invested Capital']

> Get quarterly results by publish date

combine['Publish Date'] = pd.to_datetime(combine['Publish Date']).dt.strftime('%Y-%m-%d')
combine['Year_Quarter'] = pd.to_datetime(combine['Publish Date']).dt.to_period('Q').astype(str).str.replace('Q', '-Q')

> Get Q1-2020 ROIC data from the table

Q12020roic = combine[combine['Year_Quarter'] == '2020-Q1']
Q12020roic = Q12020roic[Q12020roic['ROIC'].notnull()]
Q12020roic = Q12020roic.reset_index()

> Get Q1-2020 price data from the table

df_price = df_price.reset_index()
df_price['Market Cap'] = df_price['Adj. Close'] * df_price['Shares Outstanding']
tmp = df_price.groupby(['Ticker', pd.PeriodIndex(df_price['Date'], freq='Q', name='Quarter')]).last().reset_index(level = 0)
individual_df = {str(i): df_price.reset_index(drop=True) for i, df_price in tmp.groupby(['Quarter'])}
Q12020price = individual_df['2020Q1'][['Ticker','Adj. Close','Market Cap']]

> Get a combined table for calculation

Q12020full = Q12020roic.set_index('Ticker').join(Q12020price.set_index('Ticker'))
mktcap = Q12020full['Market Cap'].sum()
Q12020full['Weight'] = Q12020full['Market Cap']/mktcap
qua1, qua2, qua3, qua4, qua5 = np.split(Q12020full[:395].sort_values(by = 'ROIC', ascending = False), 5)
index1 = qua1['Weight'].sum()
index2 = qua2['Weight'].sum()
index3 = qua3['Weight'].sum()
index4 = qua4['Weight'].sum()
index5 = qua5['Weight'].sum()

> This Is the final output I want to get for each quarter:

index1,index2,index3,index4,index5
>(0.33722854216331616,
 0.18763248199902477,
 0.20345912394591914,
 0.13369812569957581,
 0.1317699556564322)

这是一个复杂的过程,我花了一些时间才完成了四分之一。现在我正在努力重复这个过程 20 个季度。可能很难直接迭代 20 个季度,因为您需要更改几个变量的季度,但是有没有办法将此代码的某些部分简化为循环或其他内容?

标签: pythonpandasfunctionloopsdataframe

解决方案


如果数据的结构没有改变,我建议将代码包装成函数,然后将这些函数用于您必须执行这些任务的每一家公司。显然,您必须遍历您希望计算其财务指数的公司池。

我假设当你说“公司名单有点不同”时,我们不是在谈论结构,而是在谈论长度。


推荐阅读