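python - Display times from two files in ascending order
Problem description
I have two files, each containing variables and their timestamps. I want the output in a single DataFrame, with the variables and times sorted in ascending order. The first file is: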
evt_gts evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O
The next file is:
timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 13:25:10 EffRealized
01-07-2019 13:25:30 ABHPosition
01-07-2019 13:25:35 LinkVolt
01-07-2019 13:25:36 BCPress
01-07-2019 23:18:00 speed
01-07-2019 23:18:05 temperature
01-07-2019 23:31:00 speed
01-07-2019 23:31:04 temperature
01-07-2019 23:43:00 speed
01-07-2019 23:43:05 temperature
The expected output is:
timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 13:25:10 EffRealized
01-07-2019 13:25:30 ABHPosition
01-07-2019 13:25:35 LinkVolt
01-07-2019 13:25:36 BCPress
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
01-07-2019 23:18:00 speed
01-07-2019 23:18:05 temperature
01-07-2019 23:31:00 speed
01-07-2019 23:31:04 temperature
01-07-2019 23:43:00 speed
01-07-2019 23:43:05 temperature
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O
Solution
First, give both DataFrames the same column names with rename so that they align correctly, then join them with concat, and finally sort by the timestamp column with DataFrame.sort_values:
df11 = df1.rename(columns={'evt_gts':'timestamp','evt_id':'variable'})
df = pd.concat([df11, df2], ignore_index=True).sort_values('timestamp')
If both DataFrames have the same number of columns in the same order:
df1.columns = df2.columns
df = pd.concat([df1, df2], ignore_index=True).sort_values('timestamp')
print (df)
timestamp variable
5 01-07-2019 13:25:03 RefSpd
6 01-07-2019 13:25:10 EffRealized
7 01-07-2019 13:25:30 ABHPosition
8 01-07-2019 13:25:35 LinkVolt
9 01-07-2019 13:25:36 BCPress
0 01-07-2019 16:42:00 976162O
1 01-07-2019 16:42:30 976162O
10 01-07-2019 23:18:00 speed
11 01-07-2019 23:18:05 temperature
12 01-07-2019 23:31:00 speed
13 01-07-2019 23:31:04 temperature
14 01-07-2019 23:43:00 speed
15 01-07-2019 23:43:05 temperature
2 04-07-2019 15:03:20 976162O
3 04-07-2019 15:03:25 976162O
4 05-07-2019 10:20:00 976162O
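Note that sort_values('timestamp') compares the values as they are stored. If the timestamp column still holds strings of the form dd-mm-yyyy HH:MM:SS, the sort is lexicographic, and it matches chronological order only while all rows fall in the same month and year, as they do in the sample. The following is a minimal sketch of converting to datetimes before sorting. It assumes the day comes first in the dates (the question does not say), and it uses small made-up frames, including one invented June row, in place of the real files:

```python
import pandas as pd

# Hypothetical stand-ins for the two files; the timestamp is a single string column.
df1 = pd.DataFrame({
    "evt_gts": ["01-07-2019 16:42:00", "04-07-2019 15:03:20"],
    "evt_id": ["976162O", "976162O"],
})
df2 = pd.DataFrame({
    # The last row (15 June) is invented to show where a string sort would go wrong.
    "timestamp": ["01-07-2019 13:25:03", "01-07-2019 23:18:00", "15-06-2019 08:00:00"],
    "variable": ["RefSpd", "speed", "speed"],
})

# Same steps as above: align the column names, then concatenate.
df11 = df1.rename(columns={"evt_gts": "timestamp", "evt_id": "variable"})
df = pd.concat([df11, df2], ignore_index=True)

# Parse the strings with an explicit format (assumption: day first).
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d-%m-%Y %H:%M:%S")

# Sort chronologically and renumber the rows.
df = df.sort_values("timestamp").reset_index(drop=True)
print(df)
```

With the column converted, the June row sorts before the July rows. A plain string sort would place it after them, because "15" compares greater than "01" and "04".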
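Edit:
If the separator in both files is whitespace, the solution changes a bit. The idea is to convert the date and time columns to datetimes in read_csv, and to skip the header line with the parameters header=None, skiprows=1: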
import pandas as pd
from io import StringIO
temp="""evt_gts evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O"""
# after testing, replace StringIO(temp) with 'filename1.csv'
df1 = pd.read_csv(StringIO(temp), sep=r"\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df1)
0_1 2
0 2019-01-07 16:42:00 976162O
1 2019-01-07 16:42:30 976162O
2 2019-04-07 15:03:20 976162O
3 2019-04-07 15:03:25 976162O
4 2019-05-07 10:20:00 976162O
temp="""timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 13:25:10 EffRealized
01-07-2019 13:25:30 ABHPosition
01-07-2019 13:25:35 LinkVolt
01-07-2019 13:25:36 BCPress
01-07-2019 23:18:00 speed
01-07-2019 23:18:05 temperature
01-07-2019 23:31:00 speed
01-07-2019 23:31:04 temperature
01-07-2019 23:43:00 speed
01-07-2019 23:43:05 temperature"""
# after testing, replace StringIO(temp) with 'filename2.csv'
df2 = pd.read_csv(StringIO(temp), sep=r"\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df2)
0_1 2
0 2019-01-07 13:25:03 RefSpd
1 2019-01-07 13:25:10 EffRealized
2 2019-01-07 13:25:30 ABHPosition
3 2019-01-07 13:25:35 LinkVolt
4 2019-01-07 13:25:36 BCPress
5 2019-01-07 23:18:00 speed
6 2019-01-07 23:18:05 temperature
7 2019-01-07 23:31:00 speed
8 2019-01-07 23:31:04 temperature
9 2019-01-07 23:43:00 speed
10 2019-01-07 23:43:05 temperature
df = pd.concat([df1, df2], ignore_index=True).sort_values('0_1')
df.columns = ['timestamp', 'variable']
print (df)
timestamp variable
5 2019-01-07 13:25:03 RefSpd
6 2019-01-07 13:25:10 EffRealized
7 2019-01-07 13:25:30 ABHPosition
8 2019-01-07 13:25:35 LinkVolt
9 2019-01-07 13:25:36 BCPress
0 2019-01-07 16:42:00 976162O
1 2019-01-07 16:42:30 976162O
10 2019-01-07 23:18:00 speed
11 2019-01-07 23:18:05 temperature
12 2019-01-07 23:31:00 speed
13 2019-01-07 23:31:04 temperature
14 2019-01-07 23:43:00 speed
15 2019-01-07 23:43:05 temperature
2 2019-04-07 15:03:20 976162O
3 2019-04-07 15:03:25 976162O
4 2019-05-07 10:20:00 976162O
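In the output above, 01-07-2019 was parsed as 2019-01-07, so the month was read first. For the sample the resulting order is the same either way, but if the files actually use day-first dates, an explicit format is safer. As far as I know, recent pandas releases have also deprecated the nested-list form parse_dates=[[0,1]]. The following is a sketch of an alternative that reads the date and time as separate string columns and combines them afterwards. The helper name read_events, the column names date and time, and the day-first assumption are mine, not part of the original answer:

```python
import pandas as pd
from io import StringIO

def read_events(source):
    """Read a whitespace-separated file with date, time and value columns."""
    # The header line has two names for three columns, so skip it and name them here.
    df = pd.read_csv(source, sep=r"\s+", header=None, skiprows=1,
                     names=["date", "time", "variable"], dtype=str)
    # Assumption: the day comes first (01-07-2019 is 1 July 2019).
    df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"],
                                     format="%d-%m-%Y %H:%M:%S")
    return df[["timestamp", "variable"]]

# Shortened copies of the sample data; replace StringIO(...) with file names later.
temp1 = """evt_gts evt_id
01-07-2019 16:42:00 976162O
04-07-2019 15:03:20 976162O"""

temp2 = """timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 23:18:00 speed"""

df = (pd.concat([read_events(StringIO(temp1)), read_events(StringIO(temp2))],
                ignore_index=True)
        .sort_values("timestamp")
        .reset_index(drop=True))
print(df)
```

Because both frames come back with the same two column names, no rename step is needed before concat, and reset_index(drop=True) gives the sorted result a clean 0..n-1 index.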