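regex - Getting timestamps with regex in Python 3.x
Problem description
I want to separate every timestamp from the rest of the content in a text file. For example: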
a.txt
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
"mgremove datestring" asfasnfs: remove datepart check the value
"mgremove datestring" asfasnfs: remove datepart check the value
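My solution does this for the first 4 lines of the text, but it is not generic. I want to make it generic so that it automatically detects the timestamp at the start of each line.
with open("a.txt") as f:
    for line in f:
        # Takes the first four whitespace-separated tokens as the timestamp.
        date_string = " ".join(line.strip().split()[:4])
        print(date_string, line)
Expected output: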
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = asfasnfs: remove datepart
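The text file may also contain other timestamp patterns. Is there a way to detect a timestamp at the start of a line and extract it? If a line has no date at the start, it should take the date from the previous line.
Solution
With a.txt containing the following: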
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
asfasnfs: remove datepart
asfasnfs: remove datepart
This script:
def get_date_string(line):
    # Collect leading words until the result is longer than 18 characters.
    rv = ''
    words = line.split()
    while words:
        rv += words.pop(0) + ' '
        if len(rv) > 18:
            break
    return rv.strip()

with open('a.txt', 'r') as f_in:
    last_date_string = ''
    for line in f_in:
        line = line.strip()
        if not line:
            continue
        date_part = get_date_string(line)
        if date_part == line:
            # The whole line was consumed, so it has no date: reuse the previous one.
            print('date string={: <30} line={}'.format(last_date_string, line))
        else:
            print('date string={: <30} line={}'.format(date_part, line))
            last_date_string = date_part
Prints:
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=asfasnfs: remove datepart
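Since the question asks for a regex, here is a minimal sketch of a pattern-based alternative. It is not part of the script above: the names TIMESTAMP_RE and split_timestamps are hypothetical, and the three alternatives cover only the layouts that appear in a.txt, so any other timestamp format would need its own alternative. Unlike the length-based check, it should not mistake a dateless line that starts with a long quoted string for a date.

```python
import re

# Hypothetical pattern: one alternative per timestamp layout seen in a.txt.
TIMESTAMP_RE = re.compile(
    r'''^\s*(
        "?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"?       # ISO 8601, optionally quoted
      | \d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}(?:\.\d+)?              # 2019/01/31-11:56:23.288258
      | \d{1,2}[ ][A-Za-z]{3}[ ]\d{4}[ ]\d{2}:\d{2}:\d{2}          # 17 Jul 2019 07:01:10
    )''',
    re.VERBOSE,
)


def split_timestamps(lines):
    """Yield (timestamp, line) pairs; a line without a leading
    timestamp reuses the most recent one seen ('' if none yet)."""
    last = ''
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines, as the script above does
        match = TIMESTAMP_RE.match(line)
        if match:
            last = match.group(1)
        yield last, line


sample = [
    '2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart',
    '"2019-07-17T07:11:14.894Z" "mgremove datestring" asfasnfs: remove datepart',
    '17 Jul 2019 07:01:10 "mgremove datestring" asfasnfs: remove datepart',
    '"mgremove datestring" asfasnfs: remove datepart check the value',
]

for ts, line in split_timestamps(sample):
    print('date string={: <30} line={}'.format(ts, line))
```

To run it against the file instead of the sample list, pass the open file object: `with open('a.txt') as f_in: ...` and iterate over `split_timestamps(f_in)`. The last sample line has no leading timestamp, so it is paired with the timestamp of the line before it.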