Convert day of year to datetime

Question

I have a data file containing the year, day of year (DOY), hour, and minute, as shown below:

            BuoyID  Year  Hour  Min       DOY   POS_DOY     Lat     Lon     Ts
0  300234065718160  2019     7    0  216.2920  216.2920  58.559 -23.914  14.61
1  300234065718160  2019     9    0  216.3750  216.3750  58.563 -23.905  14.60
2  300234065718160  2019    10    0  216.4170  216.4170  58.564 -23.903  14.60
3  300234065718160  2019    11    0  216.4580  216.4580  58.563 -23.906  14.60
4  300234065718160  2019    12    0  216.5000  216.5000  58.561 -23.910  14.60

To build my datetime, I used:

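import datetime
import pandas as pd

# Parse year + day of year (e.g. 2019216) into a date
dt_raw = pd.to_datetime(df_buoy['Year'] * 1000 + df_buoy['DOY'], format='%Y%j')

# Attach the hour and minute of the first row to its date
dt_buoy = [d.date() for d in dt_raw]
date = datetime.datetime.combine(dt_buoy[0], datetime.time(df_buoy.Hour[0], df_buoy.Min[0]))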

My problem arises when the hour is a float rather than an int. For example:

            BuoyID  Year   Hour  Min      DOY  POS_DOY       Lat       Lon      BP    Ts
0  300234061876910  2014  23.33    0  226.972  226.972  71.93081 -141.0792  1016.9 -0.01
1  300234061876910  2014  23.50    0  226.979  226.979  71.93020 -141.0826  1016.8  3.36
2  300234061876910  2014  23.67    0  226.986  226.986  71.92968 -141.0856  1016.8  3.28
3  300234061876910  2014  23.83    0  226.993  226.993  71.92934 -141.0876  1016.8  3.22
4  300234061876910  2014   0.00    0  227.000  227.000  71.92904 -141.0894  1016.8  3.18

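What I tried was to convert the hour to a str, take the first two characters to get the whole hour, then subtract that from Hour and multiply by 60 to get the minutes.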

int_hour = [(int(str(i)[0:2])) for i in df_buoy.Hour]  
minutes = map(lambda x, y: (x - y)*60, df_buoy.Hour, int_hour)

But of course, if the hour begins with "0.", Python complains:

ValueError: invalid literal for int() with base 10: '0.'  

My question is: does anyone know a simple way to convert year, DOY, hour (int or float), and minute to a datetime?

Tags: python, pandas, datetime

Answer


Use to_timedelta to convert the Hour column to a timedelta and add it to the datetime. This works for both integer and float hours:

df['d'] = (pd.to_datetime(df['Year'] * 1000 + df['DOY'], format='%Y%j') +
           pd.to_timedelta(df['Hour'], unit='h'))

print (df)
            BuoyID  Year  Hour  Min      DOY  POS_DOY     Lat     Lon     Ts  \
0  300234065718160  2019     7    0  216.292  216.292  58.559 -23.914  14.61   
1  300234065718160  2019     9    0  216.375  216.375  58.563 -23.905  14.60   
2  300234065718160  2019    10    0  216.417  216.417  58.564 -23.903  14.60   
3  300234065718160  2019    11    0  216.458  216.458  58.563 -23.906  14.60   
4  300234065718160  2019    12    0  216.500  216.500  58.561 -23.910  14.60   

                    d  
0 2019-08-04 07:00:00  
1 2019-08-04 09:00:00  
2 2019-08-04 10:00:00  
3 2019-08-04 11:00:00  
4 2019-08-04 12:00:00  
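
The statement above leaves the Min column unused and relies on '%Y%j' accepting the Year * 1000 + DOY value even though DOY carries a fractional part in the sample. A minimal sketch of a variant that sidesteps both points follows. It assumes the time of day should come from Hour and Min, so the fractional part of DOY is simply dropped; the helper name to_datetime_from_doy is hypothetical, and the three rows are copied from the question's samples.

```python
import pandas as pd

# Three rows copied from the question's samples (other columns omitted).
df = pd.DataFrame({
    "Year": [2019, 2014, 2014],
    "DOY": [216.292, 226.979, 227.000],
    "Hour": [7.0, 23.5, 0.0],
    "Min": [0, 0, 0],
})


def to_datetime_from_doy(frame):
    """Hypothetical helper: combine Year, DOY, Hour and Min into timestamps."""
    # January 1st of each row's year.
    year_start = pd.to_datetime(frame["Year"].astype(str), format="%Y")
    # Whole day of year. Assumption: the fraction of DOY duplicates the
    # time of day already given by Hour and Min, so it is discarded.
    whole_doy = frame["DOY"].astype(int)
    return (
        year_start
        + pd.to_timedelta(whole_doy - 1, unit="D")   # DOY 1 is January 1st
        + pd.to_timedelta(frame["Hour"], unit="h")   # int or float hours
        + pd.to_timedelta(frame["Min"], unit="min")
    )


df["d"] = to_datetime_from_doy(df)
print(df)
```

Building the date from January 1st plus a day offset avoids string parsing of the day of year, so it does not matter whether DOY is stored as an int or a float.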

The same statement, applied to the sample with float hours:

df['d'] = (pd.to_datetime(df['Year'] * 1000 + df['DOY'], format='%Y%j') +
           pd.to_timedelta(df['Hour'], unit='h'))

print (df)
            BuoyID  Year   Hour  Min      DOY  POS_DOY       Lat       Lon  \
0  300234061876910  2014  23.33    0  226.972  226.972  71.93081 -141.0792   
1  300234061876910  2014  23.50    0  226.979  226.979  71.93020 -141.0826   
2  300234061876910  2014  23.67    0  226.986  226.986  71.92968 -141.0856   
3  300234061876910  2014  23.83    0  226.993  226.993  71.92934 -141.0876   
4  300234061876910  2014   0.00    0  227.000  227.000  71.92904 -141.0894   

       BP    Ts                   d  
0  1016.9 -0.01 2014-08-14 23:19:48  
1  1016.8  3.36 2014-08-14 23:30:00  
2  1016.8  3.28 2014-08-14 23:40:12  
3  1016.8  3.22 2014-08-14 23:49:48  
4  1016.8  3.18 2014-08-15 00:00:00  
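
As a side note on the attempt in the question: slicing str(hour)[0:2] fails for any float hour below 10, because the second character is then the decimal point. The sketch below reproduces the error and shows a numeric split using math.modf from the standard library. The helper name split_hour is hypothetical, and this is only an illustration of the splitting step, not a replacement for the to_timedelta approach.

```python
import math


def split_hour(hour):
    """Hypothetical helper: split a decimal hour into (whole_hours, minutes)."""
    fraction, whole = math.modf(hour)
    return int(whole), fraction * 60


# str(0.0) is '0.0', so the first two characters are '0.',
# which int() cannot parse. This is the error reported in the question.
try:
    int(str(0.0)[0:2])
except ValueError as exc:
    print(exc)

# The numeric split works for ints and floats alike.
for hour in [23.33, 23.5, 0.0, 7]:
    print(hour, split_hour(hour))
```

The minutes come back as floats and can carry small floating-point error (23.33 hours gives roughly 19.8 minutes), so round them if whole minutes are needed.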
