Python Dateutil: unable to calculate age from two dates (relativedelta)

Question

I am trying to create a new column in a DataFrame that holds a person's age, computed with dateutil's relativedelta function, using the following code:

df['Age'] = relativedelta(df['Today'], df['DOB']).years

However, I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-99-f87ca88a2e3c> in <module>()
      1 
----> 2 df['Years of Age2'] = relativedelta(df['Today'], df['DOB']).years

C:\anaconda3\lib\site-packages\dateutil\relativedelta.py in __init__(self, dt1, dt2, years, months, days, leapdays, weeks, hours, minutes, seconds, microseconds, year, month, day, weekday, yearday, nlyearday, hour, minute, second, microsecond)
    101                              "ambiguous and not currently supported.")
    102 
--> 103         if dt1 and dt2:
    104             # datetime is a subclass of date. So both must be date
    105             if not (isinstance(dt1, datetime.date) and

C:\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
    953         raise ValueError("The truth value of a {0} is ambiguous. "
    954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 955                          .format(self.__class__.__name__))
    956 
    957     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().



It works outside of a DataFrame, as follows:

DOB = datetime.date(1990,8,25)
Today = datetime.date.today()

relativedelta(Today, DOB).years

Out[2]: 29

========================================================================

So I assume I am doing something wrong when passing the data from the DataFrame to the function?

I can calculate the age a different way with the code below; I just don't understand why the first approach does not work.

df['Years of Age'] = np.round((df['Today'] - df['DOB'])/np.timedelta64(1,'Y'),decimals = 0)

Here is the starting code:


import pandas as pd
import numpy as np
import datetime
from dateutil.relativedelta import relativedelta 


ind = 'Andy Brandy Cindy'

MyDict = {"DOB" : [ (datetime.date(1954,7,5)),
                    (datetime.date(1998,1,27)),
                    (datetime.date(2001,3,15)) ]}

df = pd.DataFrame(data=MyDict,index=ind.split())

df['Today'] = datetime.date.today()

df


        DOB         Today
Andy    1954-07-05  2019-08-30
Brandy  1998-01-27  2019-08-30
Cindy   2001-03-15  2019-08-30

Here is the calculation:

df['Age'] = relativedelta(df['Today'], df['DOB']).years

Tags: python, python-3.x, pandas, dataframe, python-dateutil

Answer


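I don't think relativedelta can accept pandas Series as arguments. The traceback shows that the failure happens inside relativedelta's __init__, at the line "if dt1 and dt2:", where dt1 is the first argument you passed, which in your code is the Series df['Today']. Evaluating a Series in a boolean context makes pandas raise the ValueError, because the truth value of a multi-element Series is ambiguous; the isinstance check against datetime.date on the following lines is never reached. Outside of a DataFrame, as you did yourself, it works because you pass date objects directly rather than Series. So you can use apply to compute the difference between the two dates row by row: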

df['Age'] = df.apply(lambda x: relativedelta(x['Today'], x['DOB']).years, axis=1)
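As a minimal, self-contained sketch of the above: the fixed today date and the helper name age_in_years are assumptions introduced here for reproducibility and are not part of the question. It first shows that a Series cannot be used in a boolean context, then that apply with axis=1 hands scalar dates to relativedelta.

```python
import datetime

import pandas as pd
from dateutil.relativedelta import relativedelta

# Assumption: a fixed "today" (the date shown in the question's output)
# so that the resulting ages are reproducible.
today = datetime.date(2019, 8, 30)

df = pd.DataFrame(
    {"DOB": [datetime.date(1954, 7, 5),
             datetime.date(1998, 1, 27),
             datetime.date(2001, 3, 15)]},
    index=["Andy", "Brandy", "Cindy"],
)
df["Today"] = today

# A multi-element Series has no single truth value, so this raises
# the same ValueError that appears in the traceback.
try:
    bool(df["Today"])
    truth_value_error = None
except ValueError as exc:
    truth_value_error = str(exc)


def age_in_years(row):
    """Hypothetical helper: completed years between two scalar dates."""
    return relativedelta(row["Today"], row["DOB"]).years


# axis=1 calls the helper once per row, so relativedelta only ever
# receives scalar date objects, never a whole Series.
df["Age"] = df.apply(age_in_years, axis=1)
print(truth_value_error)
print(df["Age"].tolist())
```

With the three birth dates from the question and that fixed date, the ages come out as 65, 21 and 18.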

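That said, I think the workaround you found is faster, though probably less exact than relativedelta, since it rounds to the nearest year instead of counting completed years.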
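To make the speed versus exactness trade-off concrete, here is a rough sketch comparing three ways of getting the age. Several things in it are assumptions and not taken from the question: the fixed today date, the use of 365.2425 days per year as a stand-in for np.timedelta64(1,'Y'), and the vectorized variant at the end, which is a different technique from anything above.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

# Assumption: fixed reference date so the comparison is reproducible.
today = pd.Timestamp(2019, 8, 30)
dob = pd.to_datetime(pd.Series(
    ["1954-07-05", "1998-01-27", "2001-03-15"],
    index=["Andy", "Brandy", "Cindy"],
))

# Rounding approach, similar in spirit to the question's workaround.
# Assumption: 365.2425 days per year replaces np.timedelta64(1, 'Y').
rounded = ((today - dob).dt.days / 365.2425).round().astype(int)

# Completed years via relativedelta, one element at a time.
exact = dob.apply(lambda d: relativedelta(today, d).years)

# Vectorized alternative (not from the original answer): subtract the
# years, then take one off if this year's birthday has not happened yet.
before_birthday = (dob.dt.month > today.month) | (
    (dob.dt.month == today.month) & (dob.dt.day > today.day)
)
vectorized = today.year - dob.dt.year - before_birthday.astype(int)

print(pd.DataFrame({"rounded": rounded, "exact": exact, "vectorized": vectorized}))
```

On this data the rounded column gives 22 for Brandy, who is about 21.6 years old, while the other two columns give 21. The vectorized variant can disagree with relativedelta for birthdays on 29 February when the reference date is in a non-leap year, so it should be checked against the age convention you actually need.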

