Why does Python print only one dataset in my algorithm?

Problem description

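I'm trying to build trading software, using code from an online YouTuber. In the get_data_from_yahoo() function I collect the data for all the S&P 500 companies. When I run that code, it prints "Already have" followed by the given ticker, which is fine. But when I print the data in the function below, compile_data() (spelled complied_data() in the code), it prints only one ticker, ZTS. Does anyone have an idea why?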

import bs4 as bs
import datetime as dt
import os
import pandas as pd
from pandas_datareader import data as pdr    
import pickle
import requests
import fix_yahoo_finance as yf


def save_sp500_tickers():
    resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text.replace('.', '-')
        ticker = ticker[:-1]
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)

    print(tickers)

    return tickers


save_sp500_tickers()

def get_data_from_yahoo(reload_sp500=False):

    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)

    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    start = dt.datetime(2019, 6, 8)
    end = dt.datetime.now()

    for ticker in tickers:
        print(ticker)
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = pdr.get_data_yahoo(ticker, start, end)
            df.reset_index(inplace=True)
            df.set_index("Date", inplace=True)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))


save_sp500_tickers()
get_data_from_yahoo()

def complied_data():
    with open("sp500tickers.pickle","rb") as f:
        tickers = pickle.load(f)

    main_df = pd.DataFrame()

    for count, ticker in enumerate(tickers):
        df = pd.read_csv('stock_dfs/{}.csv'.format(ticker))
        df.set_index('Date', inplace=True)

        df.rename(columns = {'Adj Close':ticker}, inplace=True)
        df.drop(['Open', 'High', 'Low','Close','Volume'], 1, inplace=True)

    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df, how='outer')

    if count % 10 == 0:
        print(count)

    print(main_df.head())
    main_df.to_csv('sp500_joined_closes.csv')

complied_data()

When I run this code, this is what it prints:

MMM
Already have MMM
ABT
Already have ABT
ABBV
Already have ABBV
ABMD
Already have ABMD
ACN
Already have ACN
ATVI
Already have ATVI
ADBE
Already have ADBE
AMD
Already have AMD
AAP
Already have AAP
AES
Already have AES
AMG
Already have AMG
AFL
Already have AFL
A
Already have A
APD
Already have APD
AKAM
Already have AKAM
ALK
Already have ALK
ALB
Already have ALB

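It then goes on to say that it already has all 500 companies (I'm not showing the whole output because the list is very long). But when I run the compile_data() function, it prints the data for only one ticker: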

ZTS
Date                 
2019-01-02  83.945038
2019-01-03  81.043526
2019-01-04  84.223267
2019-01-07  84.730026
2019-01-08  85.991997

Tags: python, pandas, trading, pandas-datareader

Solution


The problem is in the for loop of complied_data.

The if-else block and the if block should both be inside the for loop:

for count, ticker in enumerate(tickers):
    df = pd.read_csv('stock_dfs/{}.csv'.format(ticker))
    df.set_index('Date', inplace=True)

    df.rename(columns = {'Adj Close':ticker}, inplace=True)
    df.drop(['Open', 'High', 'Low','Close','Volume'], axis=1, inplace=True)
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df, how='outer')

    if count % 10 == 0:
        print(count)

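Otherwise, they are evaluated only once, after the loop has finished, when df holds just the last element processed. That is why only ZTS, the last ticker in the list, ends up in main_df.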
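To make the effect of the indentation concrete, here is a minimal sketch that reproduces both behaviours without touching any CSV files. The tickers AAA, BBB and CCC, their values, and the two helper function names are made up for illustration; only the loop structure mirrors complied_data.

```python
import pandas as pd

# Hypothetical stand-in data: three tiny one-column frames sharing a Date index.
dates = pd.Index(["2019-06-10", "2019-06-11"], name="Date")
frames = {
    "AAA": pd.DataFrame({"AAA": [1.0, 2.0]}, index=dates),
    "BBB": pd.DataFrame({"BBB": [3.0, 4.0]}, index=dates),
    "CCC": pd.DataFrame({"CCC": [5.0, 6.0]}, index=dates),
}


def join_outside_loop(frames):
    """Mirrors the question: the join is dedented out of the for loop."""
    main_df = pd.DataFrame()
    for ticker, df in frames.items():
        pass  # df is reassigned on every pass, but nothing is joined here
    # This runs once, after the loop, so only the last df is ever used.
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df, how="outer")
    return main_df


def join_inside_loop(frames):
    """Mirrors the answer: the join runs on every iteration."""
    main_df = pd.DataFrame()
    for ticker, df in frames.items():
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df, how="outer")
    return main_df


print(list(join_outside_loop(frames).columns))  # ['CCC']
print(list(join_inside_loop(frames).columns))   # ['AAA', 'BBB', 'CCC']
```

The first function keeps only the last frame, just as the question's output shows only ZTS; the second accumulates one column per ticker.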

Here is the output after changing to the indentation above:

(... omitted counting from 0)
470
480
490
500
                   MMM        ABT       ABBV        ABMD  ...         YUM         ZBH       ZION         ZTS
Date                                                      ...
2019-06-10  165.332672  80.643486  74.704918  272.429993  ...  107.794380  121.242027  43.187107  109.920105
2019-06-11  165.941788  80.494644  75.889320  262.029999  ...  106.722885  120.016762  43.758469  109.860268
2019-06-12  166.040024  81.318237  76.277657  254.539993  ...  108.082100  120.225945  43.512192  111.136780
2019-06-13  165.882843  81.655624  76.646561  255.529999  ...  108.121788  119.329407  44.063854  109.730621
2019-06-14  163.760803  81.586166  76.394157  250.960007  ...  108.925407  116.998398  44.211620  110.488556

[5 rows x 505 columns]
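As a side note, the same combined table can probably also be built without a running join, by collecting one renamed 'Adj Close' column per ticker and concatenating them at the end. This is only a sketch of an alternative, not what the original code or answer does; the function name combine_closes and the sample frames are hypothetical, and the frames are assumed to have the same 'Adj Close' column as the CSV files above.

```python
import pandas as pd


def combine_closes(frames):
    """Build one wide frame with a column of adjusted closes per ticker.

    frames: dict mapping ticker -> DataFrame that has an 'Adj Close' column.
    """
    # Take just the 'Adj Close' Series from each frame and name it after the ticker.
    columns = [df["Adj Close"].rename(ticker) for ticker, df in frames.items()]
    # axis=1 places the Series side by side; join="outer" keeps every date.
    return pd.concat(columns, axis=1, join="outer")


# Hypothetical sample data with partly overlapping dates.
frames = {
    "AAA": pd.DataFrame(
        {"Adj Close": [1.0, 2.0], "Volume": [10, 20]},
        index=pd.Index(["2019-06-10", "2019-06-11"], name="Date"),
    ),
    "BBB": pd.DataFrame(
        {"Adj Close": [3.0, 4.0], "Volume": [30, 40]},
        index=pd.Index(["2019-06-11", "2019-06-12"], name="Date"),
    ),
}

combined = combine_closes(frames)
print(combined)
```

Because nothing is joined conditionally inside the loop, there is no if-else block whose indentation could drift, and dates missing for a ticker simply appear as NaN.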
