Parsing a log into JSON with Python

Problem description

Hi all,

I am trying to parse a log file into JSON format.

I have many logs; one of them is shown below.
How can I parse it?

03:02:03.113 [info]  ext_ref = BANK24AOS_cl_reqmarketcreditorderstate_6M8I1NT8JKYD_1591844522410384_4SGA08M8KIXQ reqid = 1253166 type = INREQ channel = BANK24AOS sid = msid_1591844511335516_KRRNBSLH2FS duration = 703.991 req_uri = marketcredit/order/state login = 77012221122 req_type = cl_req req_headers = {"accept-encoding":"gzip","connection":"close","host":"test-mobileapp-api.bank.kz","user-agent":"okhttp/4.4.1","x-forwarded-for":"212.154.169.134","x-real-ip":"212.154.169.134"} req_body = {"$sid":"msid_1591844511335516_KRRNBSLH2FS","$sid":"msid_1591844511335516_KRRNBSLH2FS","app":"bank","app_version":"2.3.2","channel":"aos","colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv","colvir_commercial_id":"-1","colvir_id":"000120.335980","openway_commercial_id":"6247520","openway_id":"6196360","$lang":"ru","ekb_id":"923243","inn":"990830221722","login":"77012221122","bank24_id":"262"} resp_body = {"task_id":"","status":"success","data":{"state":"init","applications":[{"status":"init","id":"123db561-34a3-4a8d-9fa7-03ed6377b44f","name":"Sulpak","amount":101000,"items":[{"name":"Switch CISCO x24","price":100000,"count":1,"amount":100000}]}],"segment":{"range":{"min":6,"max":36,"step":1},"payment_day":{"max":28,"min":1}}}}

into JSON of this form, or into any other format (though I guess JSON is the best choice):

{
   "time":"03:02:03.113",
   "class_req":"info",
   "ext_ref":"BANK24AOS_cl_reqmarketcreditorderstate_6M8I1NT8JKYD_1591844522410384_4SGA08M8KIXQ",
   "reqid":"1253166",
   "type":"INREQ",
   "channel":"BANK24AOS",
   "sid":"msid_1591844511335516_KRRNBSLH2FS",
   "duration":"703.991",
   "req_uri":"marketcredit/order/state",
   "login":"77012221122",
   "req_type":"cl_req",
   "req_headers":{
      "accept-encoding":"gzip",
      "connection":"close",
      "host":"test-mobileapp-api.bank.kz",
      "user-agent":"okhttp/4.4.1",
      "x-forwarded-for":"212.154.169.134",
      "x-real-ip":"212.154.169.134"
   },
   "req_body":{
      "$sid":"msid_1591844511335516_KRRNBSLH2FS",
      "$sid":"msid_1591844511335516_KRRNBSLH2FS",
      "app":"bank",
      "app_version":"2.3.2",
      "channel":"aos",
      "colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv",
      "colvir_commercial_id":"-1",
      "colvir_id":"000120.335980",
      "openway_commercial_id":"6247520",
      "openway_id":"6196360",
      "$lang":"ru",
      "ekb_id":"923243",
      "inn":"990830221722",
      "login":"77012221122",
      "bank24_id":"262"
   },
   "resp_body":{
      "task_id":"",
      "status":"success",
      "data":{
         "state":"init",
         "applications":[
            {
               "status":"init",
               "id":"123db561-34a3-4a8d-9fa7-03ed6377b44f",
               "name":"Sulpak",
               "amount":101000,
               "items":[
                  {
                     "name":"Switch CISCO x24",
                     "price":100000,
                     "count":1,
                     "amount":100000
                  }
               ]
            }
         ],
         "segment":{
            "range":{
               "min":6,
               "max":36,
               "step":1
            },
            "payment_day":{
               "max":28,
               "min":1
            }
         }
      }
   }
}

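I tried splitting the whole text first, but then I ran into another problem: matching keys to values based on the "=" sign. There may also be keys with empty values, for example type = INREQ channel = sid = duration = 1.333 (to detect an empty value you have to pay attention to the number of spaces; normally there is one space between the previous value and the next key). So this example should come out as: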

 {
   "type":"INREQ",
   "channel":"",
   "sid":"",
   "duration":"1.333"
 }

Thanks in advance!

Tags: python, json, text-parsing, logparser

Solution


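One thing to note here is the duplicate key "$sid":"msid_1591844511335516_KRRNBSLH2FS" in req_body: json.loads keeps only the last occurrence of a repeated key, so the parsed dictionary contains "$sid" once.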
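To make the duplicate-key behaviour concrete, here is a small sketch. The JSON string and the helper name reject_duplicates are made up for illustration (the two "$sid" values are deliberately different so the effect is visible); only json.loads and its object_pairs_hook parameter come from the standard library.

```python
import json


def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook: raise if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Illustrative input, not taken from the log: the same key appears twice.
raw = '{"$sid": "msid_1", "$sid": "msid_2", "app": "bank"}'

# Default behaviour: the last occurrence silently wins.
default_result = json.loads(raw)
print(default_result)

# With the hook, the duplicate is reported instead of being dropped.
try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

In the log from the question both "$sid" values are identical, so the default behaviour loses nothing; the hook is only worth adding if repeated keys with different values would be a problem.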

import json
import re
text = """03:02:03.113 [info]  ext_ref =  reqid = 1253166 type = INREQ channel = BANK24AOS sid = msid_1591844511335516_KRRNBSLH2FS duration = 703.991 req_uri = marketcredit/order/state login = 77012221122 req_type = cl_req req_headers = {"accept-encoding":"gzip","connection":"close","host":"test-mobileapp-api.bank.kz","user-agent":"okhttp/4.4.1","x-forwarded-for":"212.154.169.134","x-real-ip":"212.154.169.134"} req_body = {"$sid":"msid_1591844511335516_KRRNBSLH2FS","$sid":"msid_1591844511335516_KRRNBSLH2FS","app":"bank","app_version":"2.3.2","channel":"aos","colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv","colvir_commercial_id":"-1","colvir_id":"000120.335980","openway_commercial_id":"6247520","openway_id":"6196360","$lang":"ru","ekb_id":"923243","inn":"990830221722","login":"77012221122","bank24_id":"262"} resp_body = {"task_id":"","status":"success","data":{"state":"init","applications":[{"status":"init","id":"123db561-34a3-4a8d-9fa7-03ed6377b44f","name":"Sulpak","amount":101000,"items":[{"name":"Switch CISCO x24","price":100000,"count":1,"amount":100000}]}],"segment":{"range":{"min":6,"max":36,"step":1},"payment_day":{"max":28,"min":1}}}}"""
# Locate the "[info]" marker so the leading time and level can become key = value pairs.
index1 = text.index('[')
index2 = text.index(']')

# Rewrite "03:02:03.113 [info]  ext_ref = ..." as "time = 03:02:03.113 class_req = info ext_ref = ...".
new_text = 'time = '+ text[:index1-1] + ' class_req = ' + text[index1+1:index2] + text[index2+2:]

# Alternatives, in order: empty value (two spaces after "="), JSON object followed by a space,
# JSON object at the end of the line, plain value followed by a space.
lst = re.findall(r'\S+? =  |\S+? = \{.*?\} |\S+? = \{.*?\}$|\S+? = \S+? ', new_text)

res = {}
for item in lst:
    key, equal, value = item.partition('=')
    key, value = key.strip(), value.strip()
    if value.startswith('{'):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            # Not valid JSON: report it and keep the raw string.
            print(f'could not decode {key}: {value}')
    res[key] = value

print(json.dumps(res, indent=3))
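The regular expression above relies on a JSON value never containing "} " before its real end, and it skips a plain value that sits at the very end of the line without a trailing space. As a hedged alternative (a sketch, not the original answer's method), the line can be walked key by key, letting json.JSONDecoder.raw_decode consume each JSON value so that nested braces are handled by the JSON parser itself. The names parse_log_line, HEAD_RE, KEY_RE and the shortened sample line are assumptions made for illustration; the sketch also assumes exactly one space on each side of "=" and no spaces inside non-JSON values.

```python
import json
import re

# Assumed line layout: "<time> [<level>]  key = value key = value ..."
HEAD_RE = re.compile(r'(\S+) \[(\w+)\]\s+')
KEY_RE = re.compile(r'(\S+) = ')
DECODER = json.JSONDecoder()


def parse_log_line(line):
    """Parse one log line into a dict (hypothetical helper)."""
    head = HEAD_RE.match(line)
    if head is None:
        raise ValueError("line does not start with 'time [level]'")
    result = {"time": head.group(1), "class_req": head.group(2)}
    pos = head.end()
    while pos < len(line):
        key = KEY_RE.match(line, pos)
        if key is None:
            break  # nothing that looks like "key = " is left
        pos = key.end()
        if line.startswith("{", pos):
            # raw_decode returns the parsed object and the index just past it;
            # it raises json.JSONDecodeError if the value is not valid JSON.
            value, pos = DECODER.raw_decode(line, pos)
        else:
            # Plain value: everything up to the next space. An empty value
            # means the next character is already a space, giving "".
            end = line.find(" ", pos)
            if end == -1:
                end = len(line)
            value = line[pos:end]
            pos = end
        result[key.group(1)] = value
        # Skip the single space that separates this value from the next key.
        if line.startswith(" ", pos):
            pos += 1
    return result


# Shortened, made-up sample in the same layout, with two empty values.
sample = ('03:02:03.113 [info]  type = INREQ channel =  sid =  duration = 1.333 '
          'resp_body = {"data":{"range":{"min":6,"max":36}},"status":"success"}')
parsed = parse_log_line(sample)
print(json.dumps(parsed, indent=3))
```

For a whole log file, the same function could be applied to each line in turn and the resulting dictionaries collected in a list before dumping them.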
