How to split a total number of days across individual years in Python, so that the total is not attributed to one particular year

Question

I am currently analyzing the response delay for requests made to each department. The data looks like this:

Department     RequestDate     ResponseDate 
Electronics    2019-05-01      2019-09-19
Babyshop       2018-08-02      2019-09-30
Grocery        2016-01-01      2018-01-01
Pharmacy       2015-03-01      2018-03-01

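What I am trying to do is split the total number of days across the calendar years they fall in. The expected output looks like this: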

Department     RequestDate     ResponseDate   2015  2016  2017  2018  2019    TotalDays
Electronics    2019-05-01      2019-09-19      0      0    0     0     149     149
Babyshop       2018-08-02      2019-09-30      0      0    0     152   272     424
Grocery        2016-01-01      2018-01-01      0      365  365   1     0       731
Pharmacy       2015-03-01      2018-03-01      306    365  365   60    0       1096

My current workflow is in Excel, and it is tedious. Is there a way to do this with Python functions instead?

Tags: python, function, dataframe, datetime

Answer


I have tried to cover every edge case in this solution. As for using Department as the index, I think you can sort that out yourself.

import calendar as cd
import pandas as pd
df = pd.DataFrame(columns=['RequestDate','ResponseDate'])
df.RequestDate = [pd.Timestamp('2019-05-01'), pd.Timestamp('2018-08-02'), pd.Timestamp('2016-01-01'),pd.Timestamp('2015-03-01')]
df.ResponseDate = [pd.Timestamp('2019-09-19'), pd.Timestamp('2019-09-30'), pd.Timestamp('2018-01-01'),pd.Timestamp('2018-03-01')]


# Add 1 because the sample data appears to count the ResponseDate day itself
# toward the number of days in each year.
df['TotalDays'] = (df.ResponseDate - df.RequestDate).dt.days + 1
year_min = df['RequestDate'].min().year
year_max = df['ResponseDate'].max().year
years = list(range(year_min, year_max + 1))


for i in years:
    df[i]=0
df.columns=['RequestDate','ResponseDate', 'TotalDays', *years]
l=[]


for i in range(len(years)-1):
    z=[]
    for item, row in df.iterrows():
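        # Days of this row that fall in years[i]: overlap of the inclusive range
        # RequestDate..ResponseDate with the calendar year. Adding one day to
        # ResponseDate keeps ranges that start and end in the same year correct.
        row[years[i]] = (min(row['ResponseDate'] + pd.Timedelta(days=1), pd.Timestamp(f'{years[i]+1}-01-01')) - max(row['RequestDate'], pd.Timestamp(f'{years[i]}-01-01'))).days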
        # Clamp to the valid range for this year: 0 up to 365 (366 in a leap year).
        days_in_year = 366 if cd.isleap(years[i]) else 365
        row[years[i]] = max(0, min(row[years[i]], days_in_year))

        z.append(row[years[i]])
    l.append(z)


for i in range(len(years)-1):
    df[years[i]]=l[i]
df[years[-1]]=df['TotalDays']-df.iloc[:, 3:-1].sum(axis=1)
df=df[['RequestDate','ResponseDate',*years,'TotalDays']]
df
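
The per-year rule used in the loop can be checked on a single date range in isolation. The following is a minimal sketch using only the standard library. `days_per_year` is a hypothetical helper, not part of the answer above, and it assumes that both the request day and the response day are counted, as TotalDays does.

```python
from datetime import date, timedelta


def days_per_year(start, end):
    """Split the inclusive range start..end into a {year: day_count} dict.

    Hypothetical helper for illustration; both endpoints are counted.
    """
    if end < start:
        raise ValueError("end must not be before start")
    stop = end + timedelta(days=1)  # exclusive upper bound of the range
    counts = {}
    for year in range(start.year, end.year + 1):
        # Overlap of [start, stop) with [Jan 1 of year, Jan 1 of year + 1).
        lo = max(start, date(year, 1, 1))
        hi = min(stop, date(year + 1, 1, 1))
        counts[year] = (hi - lo).days
    return counts


# Pharmacy row from the question.
print(days_per_year(date(2015, 3, 1), date(2018, 3, 1)))
# → {2015: 306, 2016: 366, 2017: 365, 2018: 60}

# A range that starts and ends inside one year.
print(days_per_year(date(2016, 3, 1), date(2016, 5, 1)))
# → {2016: 62}
```

Because 2016 is a leap year, a fully covered 2016 comes out as 366 days under this rule.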

There may be a better answer, but I can't think of one. Does this work for all of your cases?
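
For larger frames, the same overlap calculation can be done column-wise instead of row by row with `iterrows`. This is a sketch under the same assumption that the response day is counted. The variable names (`stop`, `year_start`, `next_year_start`) are illustrative, and the Department column is added here only to mirror the question's table.

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Electronics", "Babyshop", "Grocery", "Pharmacy"],
    "RequestDate": pd.to_datetime(
        ["2019-05-01", "2018-08-02", "2016-01-01", "2015-03-01"]),
    "ResponseDate": pd.to_datetime(
        ["2019-09-19", "2019-09-30", "2018-01-01", "2018-03-01"]),
})

# Exclusive end of each range, so that the response day itself is counted.
stop = df["ResponseDate"] + pd.Timedelta(days=1)

first_year = int(df["RequestDate"].dt.year.min())
last_year = int(df["ResponseDate"].dt.year.max())
years = list(range(first_year, last_year + 1))

for year in years:
    year_start = pd.Timestamp(year=year, month=1, day=1)
    next_year_start = pd.Timestamp(year=year + 1, month=1, day=1)
    # Later of RequestDate and Jan 1; earlier of stop and the next Jan 1.
    lo = df["RequestDate"].where(df["RequestDate"] > year_start, year_start)
    hi = stop.where(stop < next_year_start, next_year_start)
    # A negative overlap means the range does not touch this year at all.
    df[year] = (hi - lo).dt.days.clip(lower=0)

df["TotalDays"] = df[years].sum(axis=1)
print(df)
```

Computing TotalDays as the sum of the year columns keeps the two consistent by construction, so no year has to be derived as a remainder.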

