Not all dates are captured when filtering by date (Python, pandas)

Problem description

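I am filtering a DataFrame by date to produce two separate versions:

  1. Data for today's date only
  2. Data for the last two years

However, when I filter on the date, some dates that fall within the last two years are missed.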

from datetime import datetime as dt
import pandas as pd
from dateutil.relativedelta import relativedelta

date_format = '%m-%d-%Y'  # desired date format

today = dt.now().strftime(date_format)  # today's date as a string
today = dt.strptime(today, date_format).date()  # parsed back into a date object (time of day dropped)

two_years = today - relativedelta(years=2)  # today's date minus two years; computed while 'today' is still a date
two_years = two_years.strftime(date_format)  # back to a string
today = today.strftime(date_format)  # back to a string

# normalizing the format of the date column to the desired format 
df_data['date'] = pd.to_datetime(df_data['date'], errors='coerce').dt.strftime(date_format)

df_today = df_data[df_data['date'] == today]
df_two_year = df_data[df_data['date'] >= two_years]

The result is:

all dates ['07-17-2020' '07-15-2020' '08-01-2019' '03-25-2015']
today df ['07-17-2020']
two year df ['07-17-2020' '08-01-2019']

The date 07-15-2020 is missing from the two-year frame, even though 08-01-2019 was captured.
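
The pattern in this output is what plain string comparison would produce: in '%m-%d-%Y' format the month comes first, so `>=` orders the values character by character rather than chronologically. Below is a minimal sketch of that effect, assuming the cutoff was '07-17-2018' (two years before the 07-17-2020 run date implied by the output); the variable names are illustrative only.

```python
from datetime import datetime

date_format = '%m-%d-%Y'
cutoff = '07-17-2018'  # assumed: two years before the 07-17-2020 run date
dates = ['07-17-2020', '07-15-2020', '08-01-2019', '03-25-2015']

# String comparison: ordered character by character, month first.
kept_as_strings = [d for d in dates if d >= cutoff]

# Date comparison: parse first, then compare chronologically.
cutoff_date = datetime.strptime(cutoff, date_format)
kept_as_dates = [d for d in dates
                 if datetime.strptime(d, date_format) >= cutoff_date]

print(kept_as_strings)
print(kept_as_dates)
```

With strings, '07-15-2020' loses to '07-17-2018' at the day digits before the year is ever looked at, while '08-01-2019' wins on the month alone.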

Tags: python, pandas, datetime, timedelta, relativedelta

Solution


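Your data type conversions are the problem here. Calling strftime turns the dates into strings, and strings in '%m-%d-%Y' format compare character by character rather than chronologically. Keep the values as date objects instead: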

today = dt.now()  # today's date. Will always result in today's date
two_years = today - relativedelta(years=2)  # date is today's date minus two years. 

Printing two_years at this point gives a full timestamp such as "2018-07-17 18:40:42.704395". You can then reduce it to a date only:

two_years = two_years.strftime(date_format)
two_years = dt.strptime(two_years, date_format).date()
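
For the comparison to be chronological, the 'date' column has to stay a datetime type as well, not only the cutoff. The following is a sketch of one way to do that, not the answerer's exact code: it uses the sample dates from the question, a fixed run date of 2020-07-17 (an assumption, so the result is reproducible), and pd.DateOffset in place of relativedelta.

```python
import pandas as pd

# Sample data copied from the question's output.
df_data = pd.DataFrame(
    {'date': ['07-17-2020', '07-15-2020', '08-01-2019', '03-25-2015']}
)

# Parse once and keep the column as datetime64; unparseable values become NaT.
df_data['date'] = pd.to_datetime(
    df_data['date'], format='%m-%d-%Y', errors='coerce'
)

# Assumed run date, fixed for reproducibility.
# In real use this would be pd.Timestamp.now().normalize().
today = pd.Timestamp('2020-07-17')
two_years = today - pd.DateOffset(years=2)

# Both sides of each comparison are now timestamps, not strings.
df_today = df_data[df_data['date'] == today]
df_two_year = df_data[df_data['date'] >= two_years]

# Format as strings only for display, after filtering.
result = df_two_year['date'].dt.strftime('%m-%d-%Y').tolist()
print(result)
```

Formatting with strftime is left to the very end, so it affects only how the dates are displayed and not how they are compared.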
