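python - How to compare two numbers of different lengths on a line and print the line
Problem description
I have a file from which I must extract the lines containing "TCP 0.0.0.0" and the text "ongoing", then compare the two numbers that follow "ongoing" and print the line only when their lengths are not equal.
The code below only extracts the lines containing "TCP 0.0.0.0" and the text "ongoing". I still need to filter those lines again by comparing the two numbers that follow, and print a line only if the lengths differ: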
import re

f = open("log.txt", "r")
counter = 0
print("=" * 20)
for line in f:
    # Raw string so the regex escapes are not treated as string escapes
    match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
    if match:
        counter += 1
        print("-" * 10)
        # If you want to print the whole line
        print("Count {}:[F] {}".format(counter, line.rstrip()))
        # If you want to print just the matched section
        # print("Count {}:[M] {}".format(counter, match.groups()[1].rstrip()))
print("=" * 20)
print("Total Found: {}".format(counter))
f.close()
log.txt:
Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
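The following three lines need to be printed from the file, because each contains "TCP 0.0.0.0" and the text "ongoing", and the two numbers right after "ongoing" (for example "53408 533837") have different lengths: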
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Solution
You can use split(' ongoing ')[1] to get all the text after "ongoing", and then split(' ')[0:2] on that text to get the two numbers that follow "ongoing".
import re
data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''
f = data.split('\n')
for line in f:
    # Keep only lines that contain both "TCP 0.0.0.0" and "ongoing"
    match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
    if match:
        # Everything after " ongoing ", then its first two space-separated fields
        second_part = line.split(' ongoing ')[1]
        numbers = second_part.split(' ')[:2]
        number1 = numbers[0]
        number2 = numbers[1]
        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))
        if len(number1) != len(number2):
            print('different lengths')
        print('---')
Result:
77010 len: 5
76760 len: 5
---
53408 len: 5
533837 len: 6
different lengths
---
770124 len: 6
76762 len: 5
different lengths
---
535 len: 3
533822 len: 6
different lengths
---
Edit: alternatively, you can write a more complex regular expression that captures the two numbers directly
re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)
Code:
import re
data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''
f = data.split('\n')
for line in f:
    # Groups 2 and 3 capture the two numbers right after "ongoing"
    match = re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)
    if match:
        number1 = match.group(2)
        number2 = match.group(3)
        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))
        if len(number1) != len(number2):
            print('different lengths')
        print('---')
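Both versions above print the numbers and their lengths, whereas the question asks for the matching lines themselves. Here is a minimal sketch of that last step, reusing the regex idea from the edit. The helper name lines_with_unequal_lengths and the shortened sample lines are assumptions made for illustration only, not part of the original answer.

```python
import re

# Same idea as the regex in the edit above; only the two numbers are captured.
PATTERN = re.compile(r"TCP 0\.0\.0\.0 .* ongoing (\d+) (\d+)")


def lines_with_unequal_lengths(lines):
    """Yield each line whose two numbers after 'ongoing' differ in digit count.

    Hypothetical helper: accepts any iterable of strings (a list or an open file).
    """
    for line in lines:
        match = PATTERN.search(line)
        if match and len(match.group(1)) != len(match.group(2)):
            yield line.rstrip("\n")


# Shortened versions of the log lines from the question (illustrative only).
sample = [
    'Dash#07-06-2019 TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0\n',
    'Do#07-06-2019 TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0\n',
    '07-06-2019 TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0\n',
    '07-06-2019 TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0\n',
    'D#07-06-2019 TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0\n',
]

for counter, line in enumerate(lines_with_unequal_lengths(sample), start=1):
    print("Count {}:[F] {}".format(counter, line))
```

Because a file object iterates over its lines, passing an open file (for example the log.txt from the question) instead of the sample list should work the same way.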