首页 > 解决方案 > 为什么它会更改我的日期时间列上的日/月位置

问题描述

我有以下数据:

Date;Time;GHI;DNI;DIF;flagR;SE;SA;TEMP;AP;RH;WS;WD;PWAT
01.01.1994;00:07;0;0;0;0;-41.92;-19.43;14.3;1004.4;93.4;0.3;189;17.7
01.01.1994;00:22;0;0;0;0;-40.65;-23.70;14.3;1004.4;93.6;0.1;186;17.8
01.01.1994;00:37;0;0;0;0;-39.14;-27.75;14.3;1004.3;93.7;0.0;10;18.0
01.01.1994;00:52;0;0;0;0;-37.43;-31.57;14.3;1004.1;93.7;0.2;4;18.2
...
31.08.2021;23:07;0;0;0;0;-75.77;-5.76;11.5;1004.0;71.8;0.8;261;9.0
31.08.2021;23:22;0;0;0;0;-74.99;-20.16;11.5;1003.8;71.2;1.1;237;8.3
31.08.2021;23:37;0;0;0;0;-73.45;-32.59;11.6;1003.7;70.5;1.5;224;7.5
31.08.2021;23:52;0;0;0;0;-71.33;-42.67;11.6;1003.5;69.9;2.0;217;6.6

我对其进行排序并为每年制作一个新的 csv 文件。但是,在每个新的 csv 文件中,我的 Date_Time 列会更改同一 csv 文件上的日/月顺序。

file = pd.read_csv('datta.csv', 
                    sep = ';', 
                    skiprows = 56,
                    parse_dates = [['Date','Time']])


years_pre = pd.to_datetime(file['Date_Time']).dt.year

#* Create an excel file time (15 min) for each year.
for year_XX in range(pd.to_datetime(file['Date_Time']).dt.year.min(), 
                     pd.to_datetime(file['Date_Time']).dt.year.max()+1):
    print(f"Creating file (15 min) for the year: {year_XX}")
    df_subset_15 = file[years_pre == year_XX]
    df_subset_15['Date_Time'] = df_subset_15['Date_Time'
                                    ].dt.strftime('%d/%m/%Y %H:%M')
    df_subset_15.to_csv(f"data_(15 min)_year_{year_XX}.csv")

输出:(-例如:1994)

2970,31/01/1994 22:37,0,0,0,0,-48.94,13.69,16.9,1001.2,85.9,1.7,251,15.6
2971,31/01/1994 22:52,0,0,0,0,-49.6,8.3,16.9,1001.2,86.2,1.7,250,15.6
2972,31/01/1994 23:07,0,0,0,0,-49.94,2.79,16.9,1001.3,86.4,1.7,249,15.6
2973,31/01/1994 23:22,0,0,0,0,-49.94,-2.77,16.9,1001.3,86.6,1.7,246,15.6
2974,31/01/1994 23:37,0,0,0,0,-49.6,-8.29,16.9,1001.3,86.9,1.7,244,15.7
2975,31/01/1994 23:52,0,0,0,0,-48.94,-13.68,16.9,1001.3,87.1,1.7,241,15.7
2976,02/01/1994 00:07,0,0,0,0,-48.25,-18.95,16.9,1001.3,87.4,1.7,239,15.7
2977,02/01/1994 00:22,0,0,0,0,-46.99,-23.92,17.0,1001.1,87.2,1.8,234,15.7
2978,02/01/1994 00:37,0,0,0,0,-45.46,-28.6,17.0,1001.0,86.9,1.9,229,15.7
2979,02/01/1994 00:52,0,0,0,0,-43.68,-32.98,17.1,1000.8,86.7,2.1,225,15.8
2980,02/01/1994 01:07,0,0,0,0,-41.7,-37.03,17.2,1000.6,86.5,2.2,222,15.8
2981,02/01/1994 01:22,0,0,0,0,-39.52,-40.79,17.2,1000.5,86.7,2.3,218,15.7
2982,02/01/1994 01:37,0,0,0,0,-37.19,-44.25,17.2,1000.3,86.9,2.4,215,15.7

您可以看到它是如何从 ' %d/%m /%Y %H:%M' 更改为 ' %m/%d /%Y %H:%M' 的再次改变。

标签: pythonpandasdataframedatetime

解决方案


dayfirst=True这是转换日期时间的必要添加read_csv

file = pd.read_csv('datta.csv', 
                    sep = ';', 
                    skiprows = 56,
                    dayfirst=True,
                    parse_dates = [['Date','Time']])

不同之处在于日期时间少于 13 天:

temp="""Date;Time
04.07.1994;00:07
01.01.1994;00:22
31.01.1994;00:37"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", dayfirst=True,parse_dates = [['Date','Time']])
    
print (df)
            Date_Time
0 1994-07-04 00:07:00
1 1994-01-01 00:22:00
2 1994-01-31 00:37:00

temp="""Date;Time
04.07.1994;00:07
01.01.1994;00:22
31.01.1994;00:37"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", parse_dates = [['Date','Time']])
    
print (df)
            Date_Time
0 1994-04-07 00:07:00
1 1994-01-01 00:22:00
2 1994-01-31 00:37:00

推荐阅读