Adding columns to a Pandas DataFrame from a class

Problem description

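I want to improve my OOP skills, so I wrote a script that pulls stock data and runs some simple statistics. I can run and call each function in the Evaluation class individually (see the commented-out lines below), but I run into a problem when I loop over a list of tickers and try to append the statistics to the initial DataFrame.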

import datetime as d
import pandas as pd
import pandas_datareader.data as web
import numpy as np

start = d.datetime(2019, 1, 1)
end = d.datetime(2020, 4, 17)

class Security(object):
    def _init__(self, ticker, data_platform, start_date, end_date):
        self.ticker = ticker
        self.data_platform = data_platform
        self.start_date = start_date
        self.end_date = end_date

    def fetch_stock_data(self, ticker, data_platform, start_date, end_date):
        df = web.DataReader(ticker, data_platform, start_date, end_date)
        return df


class Evaluation(Security):

    def __init__(self, ticker, data_platform, start_date, end_date):
        self.df = Security.fetch_stock_data(
            self, ticker, data_platform, start_date, end_date)

    def simple_moving_average(self, period):
        df = self.df
        df['SMA-{}'.format(period)] = df['Adj Close'].rolling(period).mean()
        return df['SMA-{}'.format(period)]

    def exp_moving_average(self, period):
        df = self.df
        df['EMA_{}'.format(period)] = df['Adj Close'].ewm(span=period).mean()
        return df['EMA_{}'.format(period)]

    def rsi(self, period):
        df = self.df
        delta = df['Adj Close'].diff()

        up = delta * 0
        down = up.copy()

        up[delta > 0] = delta[delta > 0]
        down[delta < 0] = -delta[delta < 0]

        up[up.index[period - 1]] = np.mean(up[:period])
        up = up.drop(up.index[:(period - 1)])

        down[down.index[period - 1]] = np.mean(down[:period])
        down = down.drop(down.index[:(period - 1)])

        rs = up.ewm(span=period - 1).mean() / down.ewm(span=period - 1).mean()

        rsi_calc = 100 - 100 / (1 + rs)
        df['rsi'] = rsi_calc
        return df['rsi']


# pypl = Evaluation('PYPL', 'yahoo', start, end)
# print(pypl.df)
# print(pypl.simple_moving_average(50))
# print(pypl.exp_moving_average(26))
# print(pypl.rsi(14))


tickers = ['PYPL', 'TSLA']

for i in tickers:
    df = Evaluation(i, 'yahoo', start, end)
    df['SMA'] = df.simple_moving_average(50)
    df['EMA'] = df.exp_moving_average(26)
    df['rsi'] = df.rsi(14)
    print(df)

I get a TypeError, which I think has to do with how I reference the Evaluation class.

TypeError: 'Evaluation' object does not support item assignment

Tags: python, pandas, oop

Solution


You are confusing methods of your object with methods of the DataFrame. In your example, df is an Evaluation object, not a DataFrame.

>>> e = Evaluation('PYPL', 'yahoo', start, end)
>>> type(e)
<class '__main__.Evaluation'>

>>> type(e.df)
<class 'pandas.core.frame.DataFrame'>

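The line df['SMA'] = df.simple_moving_average(50) fails because you cannot add a column to the object: Evaluation does not define __setitem__, so it does not support item assignment. You need to use df.df['SMA'] = df.simple_moving_average(50) instead.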
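A minimal sketch of that difference, assuming a hypothetical Holder class and made-up prices in place of the Evaluation class and downloaded data: subscript assignment on the wrapper raises the same kind of TypeError, while the same assignment on the wrapped DataFrame adds the column.

```python
import pandas as pd


class Holder:
    """Hypothetical stand-in for Evaluation: it only wraps a DataFrame in .df."""

    def __init__(self, df):
        self.df = df


# Made-up prices, for illustration only
h = Holder(pd.DataFrame({'Adj Close': [10.0, 11.0, 12.0, 13.0]}))
sma = h.df['Adj Close'].rolling(2).mean()

try:
    h['SMA'] = sma  # Holder defines no __setitem__, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

h.df['SMA'] = sma  # the wrapped DataFrame does support column assignment
print(list(h.df.columns))
```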

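As NomadMonad pointed out, using df as the variable name for an Evaluation object is confusing, so it is better to give it a different name. However, eval is a built-in function in Python, so e is a better choice.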

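In addition, you should change your class design, for several reasons:

  • In Python 3 you do not need to inherit from object.

  • Security's __init__ method has only one leading underscore instead of two, so Python never calls it as the constructor.

  • You do not want Evaluation to inherit from Security. Instead, pass a Security object to Evaluation's __init__ method.

  • You do not want to call a method that scrapes a website when you instantiate an object. The call to pandas_datareader should be a separate method.

  • If the values are set in the __init__ method, you do not need to pass them to the other methods as arguments. You can access them through self.

  • You do not need to modify the underlying DataFrame in the evaluation methods. Instead, return the value each method computes.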

import datetime
import pandas as pd
import numpy as np
import pandas_datareader.data as web 


class Security:
    def __init__(self, ticker, data_platform, start_date, end_date):
        self.ticker = ticker
        self.data_platform = data_platform
        self.start_date = start_date
        self.end_date = end_date
        self.df = None

    def fetch_stock_data(self):
        self.df = web.DataReader(self.ticker, self.data_platform, self.start_date, self.end_date)


class Evaluation:

    def __init__(self, security):
        self.security = security

    def simple_moving_average(self, period):
        df = self.security.df
        return df['Adj Close'].rolling(period).mean()

    def exp_moving_average(self, period):
        df = self.security.df
        return df['Adj Close'].ewm(span=period).mean()

    def rsi(self, period):
        df = self.security.df
        delta = df['Adj Close'].diff()
        up = delta * 0
        down = up.copy()
        up[delta > 0] = delta[delta > 0]
        down[delta < 0] = -delta[delta < 0]
        up[up.index[period - 1]] = np.mean(up[:period])
        up = up.drop(up.index[:(period - 1)])
        down[down.index[period - 1]] = np.mean(down[:period])
        down = down.drop(down.index[:(period - 1)])
        rs = up.ewm(span=period - 1).mean() / down.ewm(span=period - 1).mean()
        return 100 - 100 / (1 + rs)


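start = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 4, 17)

tickers = ['PYPL', 'TSLA']

for ticker in tickers:
    s = Security(ticker, 'yahoo', start, end)
    s.fetch_stock_data()  # download explicitly instead of inside __init__
    e = Evaluation(security=s)

    # Add each statistic as a column on the DataFrame, not on the object
    s.df['SMA'] = e.simple_moving_average(50)
    s.df['EMA'] = e.exp_moving_average(26)
    s.df['rsi'] = e.rsi(14)
    print(s.df)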
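
The code above needs network access to download prices. As a rough offline sketch of the same composition pattern, the example below swaps Security for a hypothetical FakeSecurity that generates random-walk prices. The placeholder tickers AAA and BBB, the synthetic data, and the window sizes 10 and 5 are all assumptions made for illustration, and rsi is left out for brevity.

```python
import numpy as np
import pandas as pd


class FakeSecurity:
    """Hypothetical offline stand-in for Security; it never touches the network."""

    def __init__(self, ticker, n_days=60, seed=0):
        self.ticker = ticker
        self.n_days = n_days
        self.seed = seed
        self.df = None

    def fetch_stock_data(self):
        # Synthetic random-walk prices, NOT real market data
        rng = np.random.default_rng(self.seed)
        prices = 100 + rng.normal(0, 1, self.n_days).cumsum()
        index = pd.bdate_range('2019-01-01', periods=self.n_days)
        self.df = pd.DataFrame({'Adj Close': prices}, index=index)


class Evaluation:
    def __init__(self, security):
        self.security = security

    def simple_moving_average(self, period):
        return self.security.df['Adj Close'].rolling(period).mean()

    def exp_moving_average(self, period):
        return self.security.df['Adj Close'].ewm(span=period).mean()


results = {}
for seed, ticker in enumerate(['AAA', 'BBB']):  # placeholder ticker names
    s = FakeSecurity(ticker, seed=seed)
    s.fetch_stock_data()
    e = Evaluation(s)
    # The caller, not Evaluation, decides which columns to add
    s.df['SMA'] = e.simple_moving_average(10)
    s.df['EMA'] = e.exp_moving_average(5)
    results[ticker] = s.df

print(results['AAA'].tail())
```

Keeping one DataFrame per ticker in a dict is only one option; the frames could also be concatenated afterwards if a single table is preferred.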
