首页 > 解决方案 > 从两个文件中按升序显示时间

问题描述

我有两个带有变量的文件及其各自的时间。我想在一个数据框中获取输出,其中变量和时间按升序排列。

evt_gts               evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O

下一个文件是:

timestamp            variable   
01-07-2019 13:25:03 RefSpd  
01-07-2019 13:25:10 EffRealized 
01-07-2019 13:25:30 ABHPosition 
01-07-2019 13:25:35 LinkVolt    
01-07-2019 13:25:36 BCPress 
01-07-2019 23:18:00 speed   
01-07-2019 23:18:05 temperature 
01-07-2019 23:31:00 speed   
01-07-2019 23:31:04 temperature 
01-07-2019 23:43:00 speed   
01-07-2019 23:43:05 temperature 

预期输出为:

timestamp     variable
01-07-2019 13:25:03 RefSpd  
01-07-2019 13:25:10 EffRealized 
01-07-2019 13:25:30 ABHPosition 
01-07-2019 13:25:35 LinkVolt    
01-07-2019 13:25:36 BCPress 
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
01-07-2019 23:18:00 speed   
01-07-2019 23:18:05 temperature 
01-07-2019 23:31:00 speed   
01-07-2019 23:31:04 temperature 
01-07-2019 23:43:00 speed   
01-07-2019 23:43:05 temperature 
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O

标签: pythonpandasdataframedatetime

解决方案


首先必须在两个 s 中设置相同的列名,DataFrame以便与 正确对齐rename,然后concat最后按timestamp列排序DataFrame.sort_values

df11 = df1.rename(columns={'evt_gts':'timestamp','evt_id':'variable'})
df = pd.concat([df11, df2], ignore_index=True).sort_values('timestamp')

如果两个 DataFrame 中的顺序/列数相同:

df1.columns = df2.columns
df = pd.concat([df1, df2], ignore_index=True).sort_values('timestamp')

print (df)
              timestamp     variable
5   01-07-2019 13:25:03       RefSpd
6   01-07-2019 13:25:10  EffRealized
7   01-07-2019 13:25:30  ABHPosition
8   01-07-2019 13:25:35     LinkVolt
9   01-07-2019 13:25:36      BCPress
0   01-07-2019 16:42:00      976162O
1   01-07-2019 16:42:30      976162O
10  01-07-2019 23:18:00        speed
11  01-07-2019 23:18:05  temperature
12  01-07-2019 23:31:00        speed
13  01-07-2019 23:31:04  temperature
14  01-07-2019 23:43:00        speed
15  01-07-2019 23:43:05  temperature
2   04-07-2019 15:03:20      976162O
3   04-07-2019 15:03:25      976162O
4   05-07-2019 10:20:00      976162O

编辑:

如果两个文件中的分隔符空格,那么解决方案会有所改变 - 想法是将列转换为datetimesread_csv并通过参数省略标题header=Nonekiprows=1

import pandas as pd
from io import StringIO

temp="""evt_gts evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename1.csv'
df1 = pd.read_csv(StringIO(temp), sep="\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df1)
                  0_1        2
0 2019-01-07 16:42:00  976162O
1 2019-01-07 16:42:30  976162O
2 2019-04-07 15:03:20  976162O
3 2019-04-07 15:03:25  976162O
4 2019-05-07 10:20:00  976162O

temp="""timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 13:25:10 EffRealized
01-07-2019 13:25:30 ABHPosition
01-07-2019 13:25:35 LinkVolt
01-07-2019 13:25:36 BCPress
01-07-2019 23:18:00 speed
01-07-2019 23:18:05 temperature
01-07-2019 23:31:00 speed
01-07-2019 23:31:04 temperature
01-07-2019 23:43:00 speed
01-07-2019 23:43:05 temperature"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename2.csv'
df2 = pd.read_csv(StringIO(temp), sep="\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df2)
                   0_1            2
0  2019-01-07 13:25:03       RefSpd
1  2019-01-07 13:25:10  EffRealized
2  2019-01-07 13:25:30  ABHPosition
3  2019-01-07 13:25:35     LinkVolt
4  2019-01-07 13:25:36      BCPress
5  2019-01-07 23:18:00        speed
6  2019-01-07 23:18:05  temperature
7  2019-01-07 23:31:00        speed
8  2019-01-07 23:31:04  temperature
9  2019-01-07 23:43:00        speed
10 2019-01-07 23:43:05  temperature

df = pd.concat([df1, df2], ignore_index=True).sort_values('0_1')
df.columns = ['timestamp', 'variable']
print (df)
             timestamp     variable
5  2019-01-07 13:25:03       RefSpd
6  2019-01-07 13:25:10  EffRealized
7  2019-01-07 13:25:30  ABHPosition
8  2019-01-07 13:25:35     LinkVolt
9  2019-01-07 13:25:36      BCPress
0  2019-01-07 16:42:00      976162O
1  2019-01-07 16:42:30      976162O
10 2019-01-07 23:18:00        speed
11 2019-01-07 23:18:05  temperature
12 2019-01-07 23:31:00        speed
13 2019-01-07 23:31:04  temperature
14 2019-01-07 23:43:00        speed
15 2019-01-07 23:43:05  temperature
2  2019-04-07 15:03:20      976162O
3  2019-04-07 15:03:25      976162O
4  2019-05-07 10:20:00      976162O

推荐阅读