Creating a dataframe of dates and times in Python

Problem description

I want to create a dataframe containing all the dates of next month, in the following format:

    StartDate            EndDate          StartTime           EndTime  
0   2021-05-01           2021-05-01       01:00               09:00
1   2021-05-01           2021-05-01       09:00               17:00
2   2021-05-01           2021-05-02       17:00               01:00
3   2021-05-02           2021-05-02       01:00               09:00
4   2021-05-02           2021-05-02       09:00               17:00
5   2021-05-02           2021-05-03       17:00               01:00
6   2021-05-03           2021-05-03       01:00               09:00
7   2021-05-03           2021-05-03       09:00               17:00
8   2021-05-03           2021-05-04       17:00               01:00

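The code needs to generate a dataframe covering every date of the month (for example, for May: 2021-05-01 to 2021-05-31).

For example, if I run the code in February, the dataframe should cover March.

StartTime and EndTime follow the pattern shown in the dataframe above.

Here is my code: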

import pandas as pd
import numpy as np
import datetime

## Check current year and next month

now = datetime.datetime.now()
current_year = now.year
current_month = now.month
next_month = current_month + 1

## Check how many days in the next month

class Solution(object):
   def numberOfDays(self, y, m):
      leap = 0
      if y% 400 == 0:
         leap = 1
      elif y % 100 == 0:
         leap = 0
      elif y% 4 == 0:
         leap = 1
      if m==2:
         return 28 + leap
      list = [1,3,5,7,8,10,12]
      if m in list:
         return 31
      return 30
ob1 = Solution()
num_days_in_month = ob1.numberOfDays(current_year, next_month)
next_date = str(current_year) + '-' + str(next_month) + '-01'

# create an array with the number of the days in month
rng = pd.date_range(next_date, periods=num_days_in_month, freq='D')
df = pd.DataFrame({ 'StartDate': rng, 'EndDate' : rng})

The output of df is:

    StartDate    EndDate StartTime EndTime
0  2015-05-01 2015-05-01                  
1  2015-05-02 2015-05-02                  
2  2015-05-03 2015-05-03                  
3  2015-05-04 2015-05-04                  
4  2015-05-05 2015-05-05                  
5  2015-05-06 2015-05-06                  
6  2015-05-07 2015-05-07                  
7  2015-05-08 2015-05-08                  
8  2015-05-09 2015-05-09                  
9  2015-05-10 2015-05-10                  
10 2015-05-11 2015-05-11                  
11 2015-05-12 2015-05-12                  
12 2015-05-13 2015-05-13                  
13 2015-05-14 2015-05-14                  
14 2015-05-15 2015-05-15                  
15 2015-05-16 2015-05-16                  
16 2015-05-17 2015-05-17                  
17 2015-05-18 2015-05-18                  
18 2015-05-19 2015-05-19                  
19 2015-05-20 2015-05-20                  
20 2015-05-21 2015-05-21                  
21 2015-05-22 2015-05-22                  
22 2015-05-23 2015-05-23                  
23 2015-05-24 2015-05-24                  
24 2015-05-25 2015-05-25                  
25 2015-05-26 2015-05-26                  
26 2015-05-27 2015-05-27                  
27 2015-05-28 2015-05-28                  
28 2015-05-29 2015-05-29                  
29 2015-05-30 2015-05-30                  
30 2015-05-31 2015-05-31                  

I am trying to get the format I specified above.

Tags: python, dataframe

Solution

I used an itertools recipe (pairwise), and I also obtained num_days_in_month by using calendar.monthrange.
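
For reference, calendar.monthrange(year, month) returns a pair (weekday of the first day, number of days in the month), so element [1] is the day count, and it already accounts for leap years. The sketch below is only an illustration, with a hypothetical helper name next_month_info that does not appear in the original code. It assumes the year should roll over when the current month is December, since monthrange does not accept month 13.

```python
import calendar
import datetime

def next_month_info(today):
    """Return (year, month, number_of_days) for the month after `today`."""
    # Roll over to January of the following year when `today` is in December
    year = today.year + (1 if today.month == 12 else 0)
    month = today.month % 12 + 1
    # monthrange returns (weekday of the 1st, days in month); keep the second item
    days = calendar.monthrange(year, month)[1]
    return year, month, days

print(next_month_info(datetime.date(2021, 4, 15)))   # (2021, 5, 31)
print(next_month_info(datetime.date(2023, 12, 31)))  # (2024, 1, 31)
print(next_month_info(datetime.date(2024, 1, 10)))   # (2024, 2, 29)
```

This replaces the hand-written leap-year check from the question with a standard-library call.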

import pandas as pd
import numpy as np
import datetime
import calendar
import itertools

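now = datetime.datetime.now()
current_year = now.year
current_month = now.month
# Wrap to January of the following year when the code runs in December
next_month = current_month % 12 + 1
next_month_year = current_year + (1 if current_month == 12 else 0)
# monthrange returns (weekday of first day, number of days in the month)
num_days_in_month = calendar.monthrange(next_month_year, next_month)[1]
next_date = f"{next_month_year}-{next_month:02d}-01"
rng = pd.date_range(next_date, periods=num_days_in_month, freq='D')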

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)

dates = itertools.chain.from_iterable(itertools.repeat(date, 3) for date in rng)
times = itertools.cycle(['01:00', '09:00', '17:00'])

paired_dates = pairwise(dates)
paired_times = pairwise(times)

df = pd.DataFrame([(start_date, end_date, start_time, end_time) for (start_date, end_date), (start_time, end_time) in zip(paired_dates, paired_times)],
               columns =['StartDate', 'EndDate', 'StartTime', 'EndTime'])

Output:

     StartDate     EndDate  StartTime  EndTime
0   2021-05-01  2021-05-01      01:00    09:00
1   2021-05-01  2021-05-01      09:00    17:00
2   2021-05-01  2021-05-02      17:00    01:00
3   2021-05-02  2021-05-02      01:00    09:00
4   2021-05-02  2021-05-02      09:00    17:00
5   2021-05-02  2021-05-03      17:00    01:00
6   2021-05-03  2021-05-03      01:00    09:00
7   2021-05-03  2021-05-03      09:00    17:00
8   2021-05-03  2021-05-04      17:00    01:00
9   2021-05-04  2021-05-04      01:00    09:00
..         ...         ...        ...      ...
81  2021-05-28  2021-05-28      01:00    09:00
82  2021-05-28  2021-05-28      09:00    17:00
83  2021-05-28  2021-05-29      17:00    01:00
84  2021-05-29  2021-05-29      01:00    09:00
85  2021-05-29  2021-05-29      09:00    17:00
86  2021-05-29  2021-05-30      17:00    01:00
87  2021-05-30  2021-05-30      01:00    09:00
88  2021-05-30  2021-05-30      09:00    17:00
89  2021-05-30  2021-05-31      17:00    01:00
90  2021-05-31  2021-05-31      01:00    09:00
91  2021-05-31  2021-05-31      09:00    17:00
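
Note that the output above ends at index 91: pairwise yields one pair fewer than the number of repeated dates, so the 17:00 to 01:00 shift of the last day is not produced. If that row is wanted, one possible alternative (not the approach used above, just a sketch) is to generate the shift start times directly as timestamps spaced 8 hours apart and derive the end of each shift by adding 8 hours. The function name next_month_shifts and its optional today parameter are hypothetical, and the columns here are formatted as strings rather than kept as timestamps.

```python
import datetime
import pandas as pd

def next_month_shifts(today=None):
    """Build 8-hour shifts (01:00, 09:00, 17:00) for every day of next month."""
    today = pd.Timestamp(today if today is not None else datetime.date.today())
    # First day of the month after `today`; Period arithmetic handles December
    first_day = (today.to_period('M') + 1).to_timestamp()
    n_days = first_day.days_in_month
    # Three shifts per day, the first one starting at 01:00 on day 1
    starts = pd.date_range(first_day + pd.Timedelta(hours=1),
                           periods=3 * n_days,
                           freq=pd.Timedelta(hours=8))
    ends = starts + pd.Timedelta(hours=8)
    return pd.DataFrame({
        'StartDate': starts.strftime('%Y-%m-%d'),
        'EndDate': ends.strftime('%Y-%m-%d'),
        'StartTime': starts.strftime('%H:%M'),
        'EndTime': ends.strftime('%H:%M'),
    })

df = next_month_shifts(datetime.date(2021, 4, 15))
print(df.head(3))
print(df.tail(1))
```

Called with a date in April 2021, this gives 93 rows (3 shifts for each of the 31 days of May), and the last row runs from 2021-05-31 17:00 to 2021-06-01 01:00.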
