How to compare two numbers of different lengths in a line and print them

Problem description

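I have a file from which I need to extract the lines that contain "TCP 0.0.0.0" and the text "ongoing", then compare the two numbers that follow "ongoing" and print the line only if those two numbers differ in length.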

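The code below only extracts the lines that contain "TCP 0.0.0.0" and the text "ongoing". I still need to filter those lines by comparing the two numbers after "ongoing" and print a line only if their lengths are not equal: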

import re

f = open("log.txt", "r")
counter = 0
print("="*20)
for line in f:
  # Raw string so that \. reaches the regex engine as an escaped dot
  match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
  if match:
    counter += 1
    print("-"*10)

    # If you want to print the whole line
    print("Count {}:[F] {}".format(counter, line.rstrip()))

    # if you want to print just the matched section
    # print("Count {}:[M] {}".format(counter, match.groups()[1].rstrip()))

print("="*20)
print("Total Found: {}".format(counter))
f.close()

log.txt:

Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

 07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

 D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

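The following three lines need to be printed from the file, because each contains "TCP 0.0.0.0" and the text "ongoing", and the two numbers right after "ongoing" (for example, 53408 and 533837) differ in length: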

  07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

 07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

 D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

Tags: python, python-3.x, string-matching

Solution


You can use split(' ongoing ')[1] to get all of the text after "ongoing", and then split(' ')[0:2] on that text to get the two numbers that follow "ongoing".

import re

data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''

f = data.split('\n')

for line in f:
    # Raw string so that \. reaches the regex engine as an escaped dot
    match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
    if match:
        second_part = line.split(' ongoing ')[1]
        numbers = second_part.split(' ')[:2]

        number1 = numbers[0]
        number2 = numbers[1]

        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))

        if len(number1) != len(number2):
            print('different lengths')

        print('---')

Result:

77010 len: 5
76760 len: 5
---
53408 len: 5
533837 len: 6
different lengths
---
770124 len: 6
76762 len: 5
different lengths
---
535 len: 3
533822 len: 6
different lengths
---
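The code above prints the numbers and their lengths, while the question asks for the whole line. A minimal sketch of that step, using the same split approach, might look like the following. The function name lines_with_unequal_lengths and the sample list are illustrative assumptions, and plain substring checks stand in for the regex, so the order of "TCP 0.0.0.0" and "ongoing" in the line is not verified.

```python
def lines_with_unequal_lengths(lines):
    """Yield lines containing 'TCP 0.0.0.0' and ' ongoing ' whose two
    numbers after 'ongoing' have different lengths."""
    for line in lines:
        # Simplification: substring checks instead of the regex used above.
        if "TCP 0.0.0.0" not in line or " ongoing " not in line:
            continue
        # Everything after the first " ongoing ", split on whitespace.
        fields = line.split(" ongoing ", 1)[1].split()
        if len(fields) < 2:
            continue  # fewer than two fields follow "ongoing"
        number1, number2 = fields[:2]
        if len(number1) != len(number2):
            yield line.strip()


# Three lines taken from the sample log in the question.
sample = [
    'Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78',
    'Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78',
    '07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78',
]

# Only the third sample line has numbers of different lengths (5 and 6 digits).
for line in lines_with_unequal_lengths(sample):
    print(line)
```

Because the function accepts any iterable of strings, an open file object can be passed in place of the sample list.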

Edit: alternatively, you can write a more elaborate regular expression that captures the two numbers directly:

re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)

Code:

import re

data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''

f = data.split('\n')

for line in f:
    # Raw string so that \. and \d reach the regex engine unchanged
    match = re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)
    if match:
        number1 = match.group(2)
        number2 = match.group(3)

        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))

        if len(number1) != len(number2):
            print('different lengths')

        print('---')
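To tie this back to the script in the question, the regex variant can be combined with the original counter and output format so that only the qualifying lines are printed. This is a sketch under assumptions: the names PATTERN, find_unequal, and log_lines are illustrative, and the named groups "first" and "second" replace the numbered groups used above.

```python
import re

# Named groups capture the two numbers that follow "ongoing".
PATTERN = re.compile(
    r"TCP 0\.0\.0\.0 .* ongoing (?P<first>\d+) (?P<second>\d+)"
)


def find_unequal(lines):
    """Return the stripped lines whose two numbers after 'ongoing'
    differ in length."""
    found = []
    for line in lines:
        match = PATTERN.search(line)
        if match and len(match.group("first")) != len(match.group("second")):
            found.append(line.strip())
    return found


# Four lines taken from the sample log in the question.
log_lines = [
    'Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78',
    'Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78',
    '07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78',
    'D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78',
]

matches = find_unequal(log_lines)

# Same output format as the script in the question.
print("=" * 20)
for counter, line in enumerate(matches, start=1):
    print("-" * 10)
    print("Count {}:[F] {}".format(counter, line))
print("=" * 20)
print("Total Found: {}".format(len(matches)))
```

To run it against the real file instead of the inline sample, the list can be replaced by a file object, for example with open("log.txt") as f: matches = find_unequal(f).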
