How do I find the low of each trading day for a stock?

Problem description

I have a CSV containing minute-by-minute stock data for Microsoft. I am trying to find the low of each trading day. The code looks like this:

import pandas as pd

ticker='MSFT'
df = pd.read_csv('/Volumes/Seagate Portable/S&P 500 List/{}.txt'.format(ticker))
df.columns = ['Extra', 'Dates', 'Open', 'High', 'Low', 'Close', 'Volume']
df.Dates = pd.to_datetime(df.Dates)
df.set_index(df.Dates, inplace=True)
df.drop(['Extra', 'High', 'Volume', 'Dates', 'Open'], axis=1, inplace=True)
df = df.between_time('9:30', '16:00')
df['Low'] = df.Low.groupby(by=[df.index.day]).min()
df

The output is:

                     Low    Close
Dates       
2020-01-02 09:30:00 NaN 158.610
2020-01-02 09:31:00 NaN 158.380
2020-01-02 09:32:00 NaN 158.620
2020-01-02 09:33:00 NaN 158.692
2020-01-02 09:34:00 NaN 158.910
... ... ...
2020-12-18 15:56:00 NaN 218.700
2020-12-18 15:57:00 NaN 218.540
2020-12-18 15:58:00 NaN 218.710
2020-12-18 15:59:00 NaN 218.150
2020-12-18 16:00:00 NaN 218.500

The problem is that the Low column is filled with NaN values, which I think is because I misused groupby. I also tried:

ticker='MSFT'
df = pd.read_csv('/Volumes/Seagate Portable/S&P 500 List/{}.txt'.format(ticker))
df.columns = ['Extra', 'Dates', 'Open', 'High', 'Low', 'Close', 'Volume']
df.Dates = pd.to_datetime(df.Dates)
df.set_index(df.Dates, inplace=True)
df.drop(['Extra', 'High', 'Volume', 'Dates', 'Open'], axis=1, inplace=True)
df = df.between_time('9:30', '16:00')
df = df.groupby(by=[df.index.day]).min()
df

The output of this is:

         Low    Close
Dates       
1   150.8200    150.9800
2   150.3600    150.8400
3   152.1900    152.2800
4   165.6200    165.7000
5   165.6900    165.8200
6   156.0000    156.0700
7   157.3200    157.3500
8   157.9491    158.0000
9   150.0000    150.2700
10  152.5800    152.7950
11  151.1500    151.1930
12  138.5800    138.7600
13  140.7300    140.8700
14  161.7200    161.7500
15  162.5700    162.6300
16  135.0000    135.3300
17  135.0000    135.3400
18  135.0200    135.2600
19  139.0000    139.1300
20  135.8600    136.5900
21  166.1102    166.2100
22  165.6800    165.6900
23  132.5200    132.7100
24  141.2700    141.6481
25  144.4400    144.8102
26  148.3700    149.7000
27  149.2000    149.2700
28  152.0000    153.8152
29  165.6900    165.7952
30  150.0100    152.7200
31  156.5600    157.0450

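The problem here is that it takes the minimum of the Close column as well as the Low column. There are also only 31 rows in total, even though there should be many more, since this dataset covers all of 2020. I assume I am grouping incorrectly: I looked at the closing prices for each of the first 31 days, and these values cannot possibly be the lows for those days. So the question is: how can I find the low of each day without affecting the Close column, while avoiding the problems above?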

Tags: python, pandas, dataframe, datetime, indexing

Solution


Try this:

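# Calendar dates present in the index, in chronological order
unique_dates = sorted(set(df.index.date))

# Minimum of the Low column over the rows that belong to each date
min_values_daily = [df.loc[df.index.date == date, 'Low'].min() for date in unique_dates]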

Finally, create a new DataFrame:

low_data = pd.DataFrame({
     'date': unique_dates,
     'low': min_values_daily
})
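
As an alternative, both symptoms in the question can likely be traced to the grouping key. `df.index.day` is the day of the month (1 to 31), so every "2nd" of the year lands in one group, which would explain the 31 rows. Assigning that result back to `df['Low']` aligns on the index, and integer day labels never match the timestamps, which would explain the NaN values. Below is a minimal sketch of grouping by the full calendar date instead. The small frame is synthetic data standing in for the MSFT file, and the column name `DayLow` is a hypothetical choice, not something from the question.

```python
import pandas as pd

# Synthetic minute bars (not the asker's data): two trading days in
# different months that share the same day-of-month, the 2nd.
idx = pd.DatetimeIndex([
    "2020-01-02 09:30", "2020-01-02 09:31", "2020-01-02 09:32",
    "2020-03-02 09:30", "2020-03-02 09:31",
])
df = pd.DataFrame(
    {
        "Low": [158.3, 158.1, 158.4, 165.7, 165.2],
        "Close": [158.6, 158.4, 158.6, 166.0, 165.5],
    },
    index=idx,
)

# normalize() sets every timestamp to midnight, so the key is the full
# calendar date rather than the day of the month.
day = df.index.normalize()

# One row per trading day, computed from the Low column only.
daily_low = df["Low"].groupby(day).min()

# transform() returns a result aligned to the original rows, so the daily
# low can sit next to the untouched Close column.
# "DayLow" is a hypothetical column name chosen for this sketch.
df["DayLow"] = df["Low"].groupby(day).transform("min")

print(daily_low)
print(df)
```

Selecting `df["Low"]` before grouping is what keeps Close out of the aggregation, and `transform` is what avoids the index mismatch seen in the first attempt.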
