OR clause with dates and null values in Python

Problem description

I have the following df:

   date_from    date_to      birth_date    death_date
0  2016-01-10   2019-06-05   2015-02-15    2018-07-25
1  2016-05-11   2020-06-13   2014-03-07    2020-07-11
2  2016-02-23   Nat          2014-03-07    2019-06-08
3  2015-12-08   Nat          2014-03-07    2019-06-08

I am trying to select all rows where date_to > death_date or date_to = Nat.

I tried the following code:

df = df[(df['date_to'] > df['death_date']) | (df[df['DATE_TO'].isnull()])]

But I get the following error message:

'TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]'

And I really don't know how to solve this.

Tags: python, pandas

Solution


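Based on your question: the second operand of | has to be the boolean mask df['date_to'].isnull() itself, not the filtered DataFrame df[df['DATE_TO'].isnull()]. Note also that the column in the frame you show is spelled date_to, not DATE_TO.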

import pandas as pd
# ..... your data frame df ......

# considering that you have the following types

>>> df.dtypes
date_from     datetime64[ns]
date_to       datetime64[ns]
birth_date    datetime64[ns]
death_date    datetime64[ns]
dtype: object  

df = df[(df['date_to'] > df['death_date']) | (df['date_to'].isnull())]

>>> df
   date_from    date_to birth_date death_date
0 2016-01-10 2019-06-05 2015-02-15 2018-07-25
2 2016-02-23        NaT 2014-03-07 2019-06-08
3 2015-12-08        NaT 2014-03-07 2019-06-08
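
As a minimal, self-contained sketch of the filter above, the frame can be rebuilt from the values in the question. The names later_than_death, missing_date_to and result are only illustrative and not part of the original answer. Comparisons against NaT evaluate to False, which is why the isnull() mask is needed to pick up rows 2 and 3.

```python
import pandas as pd

# Rebuild the frame from the question; missing date_to values are NaT.
df = pd.DataFrame({
    "date_from": pd.to_datetime(["2016-01-10", "2016-05-11", "2016-02-23", "2015-12-08"]),
    "date_to": pd.to_datetime(["2019-06-05", "2020-06-13", pd.NaT, pd.NaT]),
    "birth_date": pd.to_datetime(["2015-02-15", "2014-03-07", "2014-03-07", "2014-03-07"]),
    "death_date": pd.to_datetime(["2018-07-25", "2020-07-11", "2019-06-08", "2019-06-08"]),
})

# Each condition is a boolean Series aligned with df's index.
later_than_death = df["date_to"] > df["death_date"]  # NaT compares as False
missing_date_to = df["date_to"].isnull()

# Combine the two masks element-wise with | and index the frame once.
result = df[later_than_death | missing_date_to]
print(result.index.tolist())
```

Keeping each condition in its own variable makes it easy to check that both sides of | are boolean Series before they are combined.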

If the date_to column is not a datetime, you can convert it like this:

df['date_to'] = df['date_to'].replace('Nat', pd.NaT)
df['date_to'] = pd.to_datetime(df['date_to'])
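
As a possible alternative to the replace step, and assuming the column holds date strings plus the literal text 'Nat', pd.to_datetime can be asked to coerce anything it cannot parse into NaT. The frame raw and the name converted below are hypothetical and used only for illustration. Be aware that errors="coerce" also silently turns any other malformed value into NaT, so this is a sketch rather than a drop-in recommendation.

```python
import pandas as pd

# Hypothetical raw column: dates stored as strings, missing values as 'Nat'.
raw = pd.DataFrame({"date_to": ["2019-06-05", "2020-06-13", "Nat", "Nat"]})

# errors="coerce" converts unparseable entries (here 'Nat') to NaT.
converted = pd.to_datetime(raw["date_to"], errors="coerce")

print(converted.isna().tolist())
```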
