Analyzing a delimited text log file with Python

Problem description

I have this text log file:

2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Connection id - 0
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Start Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Statement create or replace table amit (x date);
2018-11-06 16:52:01.817| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0)| admin|  Connection id - 0 - Executing - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Connection id - 0
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  End Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  SQL - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Success
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Connection id - 0
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Start Time - 2018-11-06 16:52:14
2018-11-06 16:52:14.918| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Statement create or replace table amit (x int, y int);
2018-11-06 16:52:14.925| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1)| admin|  Connection id - 0 - Executing - create or replace table amit (x int, y int);
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Connection id - 0
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  End Time - 2018-11-06 16:52:15
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  SQL - create or replace table amit (x int, y int);
3:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Connection id - 10
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Start Time - 2018-11-06 16:53:25
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Statement select 1;
2018-11-06 16:53:25.954| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15)| admin|  Connection id - 10 - Executing - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Connection id - 10
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  End Time - 2018-11-06 16:53:26
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  SQL - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  ResultSet Num Of Rows - 1
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Processed Num Of Rows - 2
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Success
2018-11-06 16:52:38.761| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| admin| (stmt : 10) | admin|  Failed
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Connection id - 0
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Start Time - 2018-11-06 16:52:54
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Statement delete from amit where y = 111111;
2018-11-06 16:52:54.178| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11)| admin|  Connection id - 0 - Executing - delete from amit where y = 111111;
2018-11-06 16:52:54.217| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Connection id - 0

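Each line of this text file is divided into several parts, separated by "|". The parts are as follows:

  1. Date and time of the statement. (Example: 2018-11-06 16:52:01.782)
  2. Thread number. (Example: 140447603222272)
  3. IP address and port of the user who ran the statement. (Example: 192.168.0.244:5000)
  4. Database on which the user ran the statement. (Example: master)
  5. IP address of the user who ran the statement. (Example: 192.168.0.244)
  6. User name of the user who ran the statement. (Example: omer)
  7. Statement ID of the statement the user executed. (Example: 15)
  8. Service name. (sqream)
  9. Message column, which includes: whether the statement succeeded, the statement itself, the connection ID, the start/end time of the statement, and the number of rows returned. (Varies from line to line)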
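
The nine-part layout described above maps naturally onto a named tuple. The following is a minimal sketch rather than a definitive parser: the field names (`timestamp`, `client_addr`, and so on) and the `parse_line` helper are illustrative assumptions, and it assumes that every well-formed line has exactly nine "|"-separated parts, with any further "|" belonging to the message.

```python
import re
from collections import namedtuple

# Hypothetical field names, chosen to mirror the nine parts listed above.
LogEntry = namedtuple(
    "LogEntry",
    "timestamp thread client_addr database client_ip "
    "username statement_id service message",
)


def parse_line(line):
    """Split one log line on '|' and return a LogEntry, or None if malformed."""
    # maxsplit=8 keeps any '|' that appears inside the message text intact.
    parts = [p.strip() for p in line.split("|", 8)]
    if len(parts) != 9:
        return None
    # Extract the number from a field such as "(stmt : 15)".
    match = re.search(r"\d+", parts[6])
    if match is None:
        return None
    parts[6] = int(match.group())
    return LogEntry(*parts)


sample = (
    "2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| "
    "IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Success"
)
entry = parse_line(sample)
print(entry.username, entry.statement_id, entry.message)  # Guy 15 Success
```

A named tuple keeps each field accessible by name while staying as light as a plain tuple; lines that do not split into nine parts are returned as None so the caller can decide whether to skip or report them.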

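I want to analyze the text file in a way that lets me answer questions such as:

How many successful statements did users send? (Every statement that ended with "Success".)

How many failed and successful statements did each user send to the server? ("Success" / "Failed")

According to the log, how many statements did users send in total?

I have implemented the following code: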

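import os

def parse_log_file(log_file):
    # Resolve log_file relative to the directory that contains this script.
    my_path = os.path.abspath(os.path.dirname(__file__))
    path = os.path.join(my_path, log_file)
    with open(path, 'r') as f:
        # Skip the first line of the file (assumed here to be a header).
        lines = f.readlines()[1:]
    for line in lines:
        # Split each line on the '|' delimiter.
        elements = line.strip().split('|')
        print(elements, len(elements))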

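I am trying to continue analyzing the file, but in an efficient way. I am new to Python and am trying to work out how to answer the questions above. I am considering using regular expressions, or using tuples to store this data in a key/value layout.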
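
For the regular-expression idea mentioned above, one possible sketch uses named groups, so that each matched line becomes a key/value dictionary. The pattern is written only against the sample lines shown in this question; the group names and the `line_to_dict` helper are assumptions made for illustration, and a real log may contain line shapes this pattern does not cover.

```python
import re

# Pattern derived from the sample lines only; group names are illustrative.
LINE_RE = re.compile(
    r"^(?P<timestamp>[^|]*)\|"
    r"\s*on thread\[(?P<thread>\d+)\s+(?P<context>\w+)\]\|"
    r"\s*IP\[(?P<client_addr>[^\]]*)\]\|"
    r"\s*(?P<database>[^|]*?)\|"
    r"\s*(?P<client_ip>[^|]*?)\|"
    r"\s*(?P<username>[^|]*?)\|"
    r"\s*\(stmt\s*:\s*(?P<statement_id>\d+)\)\s*\|"
    r"\s*(?P<service>[^|]*?)\|"
    r"\s*(?P<message>.*)$"
)


def line_to_dict(line):
    """Return a dict of named fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "2018-11-06 16:52:54.178| on thread[140447603222272 c0s11]| "
    "IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11)| admin|  "
    "Connection id - 0 - Executing - delete from amit where y = 111111;"
)
record = line_to_dict(sample)
print(record["username"], record["statement_id"])  # Gilc 11
```

All values come back as strings, so numeric fields such as `statement_id` need an explicit `int(...)` if they are to be compared as numbers. For a format this regular, a plain `split('|')` is usually simpler; the regular expression mainly helps when lines need to be validated as well as split.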

Tags: python, tuples, text-files

Solution
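
One possible approach to the counting questions in the problem description is sketched below. It is a sketch under stated assumptions, not a definitive implementation: it assumes that a line whose message is exactly "Success" or "Failed" records the outcome of one statement, and that a statement is identified by the pair of user name and statement ID. The `summarize` function name is hypothetical.

```python
from collections import Counter, defaultdict


def summarize(lines):
    """Return (per-user outcome counts, per-user number of distinct statements)."""
    outcomes = defaultdict(Counter)  # username -> Counter of "Success"/"Failed"
    statements = defaultdict(set)    # username -> set of statement-id fields seen
    for line in lines:
        parts = [p.strip() for p in line.split("|", 8)]
        if len(parts) != 9:
            continue  # skip blank or malformed lines
        username, stmt, message = parts[5], parts[6], parts[8]
        statements[username].add(stmt)
        # Assumption: only these two exact messages mark a statement's outcome.
        if message in ("Success", "Failed"):
            outcomes[username][message] += 1
    totals = {user: len(ids) for user, ids in statements.items()}
    return outcomes, totals


# Four lines copied from the log excerpt in the question.
sample_lines = [
    "2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Success",
    "2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Success",
    "2018-11-06 16:52:38.761| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| admin| (stmt : 10) | admin|  Failed",
    "2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Connection id - 0",
]

outcomes, totals = summarize(sample_lines)
for user in sorted(totals):
    counts = outcomes.get(user, Counter())
    print(user, "statements:", totals[user],
          "success:", counts["Success"], "failed:", counts["Failed"])
print("total successful:", sum(c["Success"] for c in outcomes.values()))
```

Because `summarize` only iterates over its argument, an open file object can be passed directly, for example `with open(path) as f: outcomes, totals = summarize(f)`. Statements that have no outcome line in the log (such as the last one in the sample above) still count toward a user's total, so the total can exceed the sum of successes and failures.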

