Date-level groupby aggregation on a datetime column in pandas

Problem description

I have a data frame as shown below. It contains doctor appointment data.

  Doctor     Appointment              Show
  A          2020-01-18 12:00:00      Yes
  A          2020-01-18 12:30:00      Yes
  A          2020-01-18 13:00:00      No
  A          2020-01-18 13:30:00      Yes
  B          2020-01-18 12:00:00      Yes
  B          2020-01-18 12:30:00      Yes
  B          2020-01-18 13:00:00      No
  B          2020-01-18 13:30:00      Yes
  B          2020-01-18 16:00:00      No
  B          2020-01-18 16:30:00      Yes
  A          2020-01-19 12:00:00      Yes
  A          2020-01-19 12:30:00      Yes
  A          2020-01-19 13:00:00      No
  A          2020-01-19 13:30:00      Yes
  A          2020-01-19 14:00:00      Yes
  A          2020-01-19 14:30:00      No
  A          2020-01-19 16:00:00      No
  A          2020-01-19 16:30:00      Yes
  B          2020-01-19 12:00:00      Yes
  B          2020-01-19 12:30:00      Yes
  B          2020-01-19 13:00:00      No
  B          2020-01-19 13:30:00      Yes
  B          2020-01-19 14:00:00      No
  B          2020-01-19 14:30:00      Yes
  B          2020-01-19 15:00:00      No
  B          2020-01-18 15:30:00      Yes

From the data frame above, I want to write a function in pandas that produces the output shown below.

I tried the following:

def Doctor_date_summary(doctor, date):
   Number of slots = df.groupby([doctor, date] ).sum()

Expected output:

Doctor_date_summary(Doctor, date)
If Doctor = A, date = 2020-01-19

Number of slots = 8
Number of show up = 5
show up percentage = 62.5

Here, "Number of show up" is the number of Yes values in the Show column for that doctor on that date, which is 5.

Tags: pandas, pandas-groupby

Solution


You can first create a date column from the Appointment column:

df['day'] = df['Appointment'].dt.floor('D')
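The line above relies on the `.dt` accessor, which only works when `Appointment` already has a datetime dtype. The question does not say how the column was loaded, so the following is a minimal sketch that assumes it arrived as strings and parses it first. The three-row frame is a made-up subset for illustration, not the full table.

```python
import pandas as pd

# Hypothetical three-row subset of the question's data, with Appointment as strings.
df = pd.DataFrame({
    "Doctor": ["A", "A", "B"],
    "Appointment": ["2020-01-18 12:00:00", "2020-01-19 12:30:00", "2020-01-19 13:00:00"],
    "Show": ["Yes", "No", "Yes"],
})

# Parse the strings into datetime64 values so that the .dt accessor is available.
df["Appointment"] = pd.to_datetime(df["Appointment"])

# Truncate each timestamp to midnight of its day, dropping the time of day.
df["day"] = df["Appointment"].dt.floor("D")
```

If the column is already datetime, `pd.to_datetime` leaves the values unchanged, so the extra step is harmless.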

Then you can use boolean indexing:

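import numpy as np

def Doctor_date_summary(Doctor, date):
    # Count the rows for this doctor on this day where the patient showed up
    number_of_show_up = np.sum((df['Doctor']==Doctor) & (df['day']==date) & (df['Show']=='Yes'))
    # Count all rows (slots) for this doctor on this day
    number_of_slots = np.sum((df['Doctor']==Doctor) & (df['day']==date))

    return number_of_show_up, number_of_slots, 100*number_of_show_up/number_of_slots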

Finally:

number_of_show_up, number_of_slots, percentage = Doctor_date_summary('A', '2020-01-19')

print("Number of slots = {}".format(number_of_slots))
print("Number of show up = {}".format(number_of_show_up))
print("show up percentage = {:.1f}".format(percentage))

Number of slots = 8
Number of show up = 5
show up percentage = 62.5
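Since the question asks about groupby, the same numbers can also be computed for every doctor and day at once. This is only a sketch of one possible alternative, not the method used in the answer above. The column names `showed`, `slots`, `show_ups` and `show_up_pct` are made up for illustration, and the sample frame holds doctor A's eight slots on 2020-01-19 from the question plus two of doctor B's.

```python
import pandas as pd

# Doctor A's eight slots on 2020-01-19 from the question, plus two of B's slots.
df = pd.DataFrame({
    "Doctor": ["A"] * 8 + ["B"] * 2,
    "Appointment": pd.to_datetime([
        "2020-01-19 12:00", "2020-01-19 12:30", "2020-01-19 13:00", "2020-01-19 13:30",
        "2020-01-19 14:00", "2020-01-19 14:30", "2020-01-19 16:00", "2020-01-19 16:30",
        "2020-01-19 12:00", "2020-01-19 12:30",
    ]),
    "Show": ["Yes", "Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes"],
})

# Day-level key: each timestamp truncated to midnight.
df["day"] = df["Appointment"].dt.normalize()
# Boolean helper column (hypothetical name) so that sum() counts the "Yes" rows.
df["showed"] = df["Show"].eq("Yes")

# One row per (Doctor, day): total slots and number of show-ups.
summary = df.groupby(["Doctor", "day"])["showed"].agg(slots="size", show_ups="sum")
summary["show_up_pct"] = 100 * summary["show_ups"] / summary["slots"]

# Look up a single doctor and date in the summary table.
row = summary.loc[("A", pd.Timestamp("2020-01-19"))]
print("Number of slots = {:.0f}".format(row["slots"]))
print("Number of show up = {:.0f}".format(row["show_ups"]))
print("show up percentage = {:.1f}".format(row["show_up_pct"]))
```

Compared with the boolean-indexing function, this builds the whole summary table in one pass, which may be preferable when many doctor and date combinations are needed.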
