How to scrape IP addresses with BeautifulSoup and output them to CSV?

Problem description

import requests    
from bs4 import BeautifulSoup

url ='https://myip.ms/browse/blacklist/Blacklist_IP_Blacklist_IP_Addresses_Live_Database_Real-time'

response = requests.get(url)

data = response.text

soup = BeautifulSoup(data, 'html.parser')

ipList = soup.find("td",{"class": "row_name"})

rows = ipList.findAll('td')
for tr in rows:
  cols = td.findAll('td')
  if len(cols) > 0:
     print (ip.cols.text.strip())

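I am using BeautifulSoup for web scraping, but I have run into some problems. Why can't I scrape the IP addresses from the database table? And how can I write the results to a CSV file?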

Tags: python, web-scraping, beautifulsoup, export-to-csv

Solution


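The problem is that you build ipList with find(), which returns only the first matching element, so you get at most one IP. Calling findAll('td') on that single cell then searches inside it for nested td elements, and the loop body refers to td and ip, which are never defined. Use find_all() or select() instead; both return a list of every matching element.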
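
To make the difference concrete, here is a minimal sketch on an inline HTML sample. The markup and the 192.0.2.x addresses are hypothetical stand-ins; the structure of the real page is assumed, not verified.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (assumption): each IP sits in an <a>
# inside a <td class="row_name">, as the selectors in this thread imply.
SAMPLE_HTML = """
<table>
  <tr><td class="row_name"><a href="/info/1">192.0.2.1</a></td></tr>
  <tr><td class="row_name"><a href="/info/2">192.0.2.2</a></td></tr>
  <tr><td class="row_name"><a href="/info/3">192.0.2.3</a></td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching element (or None if nothing matches).
first_cell = soup.find("td", {"class": "row_name"})
first_ip = first_cell.a.text

# find_all() and select() return every matching element as a list.
ips_find_all = [td.a.text for td in soup.find_all("td", class_="row_name")]
ips_select = [td.a.text for td in soup.select("td.row_name")]

print(first_ip)      # 192.0.2.1
print(ips_find_all)  # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
print(ips_select)    # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
```

Applied to the page from the question, the same idea gives the following script: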

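import requests
from bs4 import BeautifulSoup

url = 'https://myip.ms/browse/blacklist/Blacklist_IP_Blacklist_IP_Addresses_Live_Database_Real-time'
# Download the page and parse the raw response body
response = requests.get(url).content
soup = BeautifulSoup(response, 'html.parser')
# select() returns every element with class "row_name", not just the first
ipList = soup.select(".row_name")
# Write one IP address per line; the IP is the text of the link in each cell
with open('ip_output.csv', 'w') as f:
    for ips in ipList:
        f.write(ips.find('a').text + '\n')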

Output written to the CSV file:

195.154.251.86
37.140.192.194
80.94.174.55
175.103.39.28
90.173.129.250
51.15.146.121
...
...
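
If the file needs a header row or proper CSV quoting, the standard-library csv module is an alternative to writing lines by hand. The following is only a sketch: the helper name write_ips_to_csv, the "ip" column header, and the output file name are assumptions made for illustration, and the sample list reuses the first three addresses from the output above instead of scraping the page.

```python
import csv


def write_ips_to_csv(ips, path):
    """Write one IP per row under a single 'ip' header column (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["ip"])
        for ip in ips:
            writer.writerow([ip.strip()])


# Sample data copied from the output listed above.
sample_ips = ["195.154.251.86", "37.140.192.194", "80.94.174.55"]
write_ips_to_csv(sample_ips, "ip_output_with_header.csv")
```

In the scraping script, the list passed in could be built from the selected cells, for example by collecting the text of the link in each one.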
