Parsing a file organized in a certain pattern

Problem description

f is a file that looks like this:

+++++192.168.1.1+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.1

+++++192.168.1.2+++++
Port Number: 80
......
product: Apache http
IP Address: 192.168.1.2

+++++192.168.1.3+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.3

+++++192.168.1.4+++++
Port Number: 3306
......
product: MySQL
IP Address: 192.168.1.4

+++++192.168.1.5+++++
Port Number: 22
......
product: Open SSH
IP Address: 192.168.1.5

+++++192.168.1.6+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.6

The expected output is:

These hosts have Apache services:

192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.6

The code I tried:

for service in f:
    if "product: Apache httpd" in service:
        for host in f:
            if "IP Address: " in host:
                print(host[5:], service)

It just gives me all of the IP addresses, not the specific hosts that have Apache installed.

How can I produce the expected output?

Tags: python, formatted-input

Solution

Something like this, perhaps. For illustration purposes I have inlined the data, but it could just as well come from a file.

In addition, we first collect all of the data for each host, in case you need some other information as well, and only then print out what is required. This means info_by_ip looks roughly like this:

{'192.168.1.1': {'Port Number': '80', 'product': 'Apache httpd'},
 '192.168.1.2': {'Port Number': '80', 'product': 'Apache http'},
 '192.168.1.3': {'Port Number': '80', 'product': 'Apache httpd'},
 '192.168.1.4': {'Port Number': '3306', 'product': 'MySQL'},
 '192.168.1.5': {'Port Number': '22', 'product': 'Open SSH'},
 '192.168.1.6': {'Port Number': '80', 'product': 'Apache httpd'}}
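Because every field is kept per host, other questions can be answered from the same structure without parsing again. As a rough sketch, with the dictionary above copied in literally as a stand-in for the parsed result, and with hosts_by_port being a name made up for this illustration, grouping hosts by port could look like this:

```python
import collections

# Literal copy of the structure shown above; in practice this would be
# the result of parsing rather than a hand-written dict.
info_by_ip = {
    '192.168.1.1': {'Port Number': '80', 'product': 'Apache httpd'},
    '192.168.1.2': {'Port Number': '80', 'product': 'Apache http'},
    '192.168.1.3': {'Port Number': '80', 'product': 'Apache httpd'},
    '192.168.1.4': {'Port Number': '3306', 'product': 'MySQL'},
    '192.168.1.5': {'Port Number': '22', 'product': 'Open SSH'},
    '192.168.1.6': {'Port Number': '80', 'product': 'Apache httpd'},
}

# hosts_by_port is a hypothetical helper name: port string -> list of IPs.
hosts_by_port = collections.defaultdict(list)
for ip, info in info_by_ip.items():
    hosts_by_port[info['Port Number']].append(ip)

for port, hosts in hosts_by_port.items():
    print(port, hosts)
```

Note that the port values are strings here, since nothing in the parsing step converts them to integers.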

Code:

import collections

data = """
+++++192.168.1.1+++++
Port Number: 80
......
product: Apache httpd

+++++192.168.1.2+++++
Port Number: 80
......
product: Apache http

+++++192.168.1.3+++++
Port Number: 80
......
product: Apache httpd

+++++192.168.1.4+++++
Port Number: 3306
......
product: MySQL

+++++192.168.1.5+++++
Port Number: 22
......
product: Open SSH

+++++192.168.1.6+++++
Port Number: 80
......
product: Apache httpd
"""

ip = None  # Current IP address

# A defaultdict lets us conveniently add per-IP data without having to
# create the inner dicts explicitly:
info_by_ip = collections.defaultdict(dict)

for line in data.splitlines():  # replace with `for line in file:` for file purposes
    if line.startswith('+++++'):  # Seems like an IP address separator
        ip = line.strip('+')  # Remove + signs from both ends
        continue  # Skip to next line
    if ':' in line:  # If the line contains a colon,
        key, value = line.split(':', 1)  # ... split by it, 
        info_by_ip[ip][key.strip()] = value.strip()  # ... and add to this IP's dict.


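print('These hosts have Apache services:\n')
for ip, info in info_by_ip.items():
    # Match on the prefix so that both 'Apache httpd' and 'Apache http'
    # (host 192.168.1.2 in the expected output) are included.
    if info.get('product', '').startswith('Apache'):
        print(ip)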
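As for why the attempt in the question printed every address: both loops iterate over the same file object, so once the outer loop reaches the first "product: Apache httpd" line, the inner loop consumes the rest of the file and prints every remaining "IP Address:" line. Parsing in a single pass avoids that. The following is a minimal sketch of the same idea wrapped in functions; parse_hosts and hosts_with_product are names invented here, and io.StringIO stands in for a real file opened with open().

```python
import collections
import io


def parse_hosts(lines):
    """Group 'key: value' lines under the most recent +++++IP+++++ header."""
    info_by_ip = collections.defaultdict(dict)
    ip = None  # No host seen yet
    for line in lines:
        line = line.strip()  # Drop the trailing newline that file lines carry
        if line.startswith('+++++'):
            ip = line.strip('+')
        elif ip is not None and ':' in line:
            key, value = line.split(':', 1)
            info_by_ip[ip][key.strip()] = value.strip()
    return dict(info_by_ip)


def hosts_with_product(info_by_ip, prefix):
    """Return the IPs whose 'product' value starts with the given prefix."""
    return [ip for ip, info in info_by_ip.items()
            if info.get('product', '').startswith(prefix)]


# A small in-memory stand-in for the file; a real file object works the same way.
sample = io.StringIO(
    "+++++192.168.1.1+++++\n"
    "Port Number: 80\n"
    "product: Apache httpd\n"
    "\n"
    "+++++192.168.1.4+++++\n"
    "Port Number: 3306\n"
    "product: MySQL\n"
)

print('These hosts have Apache services:\n')
for ip in hosts_with_product(parse_hosts(sample), 'Apache'):
    print(ip)
```

With a real file, the call would be something like parse_hosts(f) inside a with open(...) as f: block, where the file name is whatever your scan output is saved as.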
