How to accept only data starting from the nearest Monday

Problem description

I have the following code:

import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas_datareader import data as web

stock = '^GSPC'
start = datetime.date(2000,1,1)
end = datetime.date.today()
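data = web.DataReader(stock, 'yahoo', start, end)
data.index = pd.to_datetime(data.index, format='%Y-%m-%d')
data['Day Name'] = data.index.day_name()  # weekday_name was removed in pandas 1.0
# NOTE: 'day' must already be a column; this snippet does not create it
data.set_index('day', append=True, inplace=True)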
data.set_index('Day Name', append=True, inplace=True)
data['pct_day']= data['Adj Close'].pct_change()
df = data.groupby(['Day Name']).mean()
df = df.drop(index=['Saturday', 'Sunday'], errors='ignore')  # weekend labels may be absent in market data
df = df.reindex(['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'])

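What I need is that, whatever date is entered, only data from the nearest Monday onward is accepted, and nothing before it. Any help would be greatly appreciated. Thanks in advance.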

Tags: python, pandas, datetime

Solution


You can remove the rows before the first Monday with Series.eq and Series.cumsum, then keep all rows whose cumulative sum is not equal to 0 with Series.ne:

data['Day Name'] = data.index.day_name()  # weekday_name was removed in pandas 1.0

data = data[data['Day Name'].eq('Monday').cumsum().ne(0)].copy()

# as in the question; requires an existing 'day' column
data.set_index('day', append=True, inplace=True)
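
To make the eq / cumsum / ne chain easier to follow, here is a minimal sketch that keeps each intermediate step. The names `toy`, `is_monday`, `seen_monday`, `mask` and `trimmed` are illustrative and not part of the original answer, and the data is a made-up five-day range.

```python
import pandas as pd

# Hypothetical toy data: 2017-04-01 is a Saturday, so the first Monday is 2017-04-03.
rng = pd.date_range('2017-04-01', periods=5)
toy = pd.DataFrame({'a': range(5)}, index=rng)
toy['Day Name'] = toy.index.day_name()

is_monday = toy['Day Name'].eq('Monday')  # True only on Mondays
seen_monday = is_monday.cumsum()          # stays 0 until the first Monday, then >= 1
mask = seen_monday.ne(0)                  # True from the first Monday onward

# Show the three steps side by side, then apply the mask.
print(pd.DataFrame({'is_monday': is_monday, 'cumsum': seen_monday, 'keep': mask}))
trimmed = toy[mask].copy()
print(trimmed)
```

Because the cumulative sum never returns to 0, every row after the first Monday is kept, including later weekends and later Mondays.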

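Another idea is to use Series.idxmax to get the index label of the first Monday and pass it to DataFrame.loc. This requires at least one Monday in the data: if there is none, idxmax returns the first index label and no rows are removed.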

data['Day Name'] = data.index.day_name()  # weekday_name was removed in pandas 1.0

data = data.loc[data['Day Name'].eq('Monday').idxmax():]

# as in the question; requires an existing 'day' column
data.set_index('day', append=True, inplace=True)
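
The two approaches differ when the data contains no Monday at all. The following sketch, on a made-up Tuesday-to-Friday range with illustrative variable names, shows that difference; it is meant as a quick check rather than part of the original answer.

```python
import pandas as pd

# Hypothetical data with no Monday: 2017-04-04 (Tuesday) through 2017-04-07 (Friday).
rng = pd.date_range('2017-04-04', periods=4)
no_monday = pd.DataFrame({'a': range(4)}, index=rng)
is_monday = pd.Series(no_monday.index.day_name(), index=rng).eq('Monday')

# cumsum stays at 0 everywhere, so every row is dropped.
by_cumsum = no_monday[is_monday.cumsum().ne(0)]

# idxmax on an all-False Series returns the first label, so every row is kept.
by_idxmax = no_monday.loc[is_monday.idxmax():]

print(len(by_cumsum), len(by_idxmax))
```

If an empty result is the desired behaviour when no Monday exists, one option is to check `is_monday.any()` before using the idxmax variant.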

Sample:

rng = pd.date_range('2017-04-01', periods=10)
data = pd.DataFrame({'a': range(10)}, index=rng)

data['Day Name'] = data.index.day_name()  # weekday_name was removed in pandas 1.0
data = data[data['Day Name'].eq('Monday').cumsum().ne(0)].copy()
print(data)
            a   Day Name
2017-04-03  2     Monday
2017-04-04  3    Tuesday
2017-04-05  4  Wednesday
2017-04-06  5   Thursday
2017-04-07  6     Friday
2017-04-08  7   Saturday
2017-04-09  8     Sunday
2017-04-10  9     Monday
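
A different way to get the same effect, not taken from the answer above, is to move the start date forward to a Monday before requesting any data. The sketch below uses only the standard library; `next_monday` is a hypothetical helper name, and it assumes "nearest Monday" means the first Monday on or after the given date.

```python
import datetime


def next_monday(d: datetime.date) -> datetime.date:
    """Return d itself if it is a Monday, otherwise the first Monday after d."""
    # date.weekday() is 0 for Monday through 6 for Sunday.
    return d + datetime.timedelta(days=(7 - d.weekday()) % 7)


# 2000-01-01 (the start date from the question) is a Saturday.
start = next_monday(datetime.date(2000, 1, 1))
print(start)
```

The resulting `start` could then be passed to the data request in place of the raw date. If that Monday happens not to be a trading day, the returned data would simply begin on the next available date.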
