How to extract the title of each product using Python web scraping

Problem description

Here is the link: https://www.118100.se/sok/foretag/?q=brf&loc=&ob=rel&p=0

def get_index_data(soup):
    try:
        links = soup.find_all('div','a',id=False).get('href')
    except:
        links = []
    print(links)

Tags: web-scraping, python-requests

Solution


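Find all div elements whose class name is Name (class="Name"); this gives you every title. If you also want the href, iterate over titles and, for each one, find the a element whose title attribute equals title.text.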

import requests
import bs4 as bs

url = 'https://www.118100.se/sok/foretag/?q=brf&loc=&ob=rel&p=0'

response = requests.get(url)
# print('Response:', response.status_code)

soup = bs.BeautifulSoup(response.text, 'lxml')
titles = soup.find_all('div',  {'class': 'Name'})

# a = soup.find_all('a')
# print(a)

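for title in titles:
    # Match each title to the <a> whose title attribute equals the div's text.
    anchor = soup.find('a', {'title': title.text})
    if anchor is not None and anchor.get('href'):  # skip titles without a link
        print('https://www.118100.se' + anchor.get('href'))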
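
The same lookup can be tried offline before running it against the live site. The following is only a minimal sketch: SAMPLE_HTML is hypothetical markup that is assumed to resemble a search result on the page, extract_titles_and_links is a helper name introduced here for illustration, and the built-in 'html.parser' is used instead of 'lxml' so that no extra parser needs to be installed. The real page structure may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'https://www.118100.se'

# Hypothetical markup (an assumption, not copied from the real page).
SAMPLE_HTML = """
<div class="result">
  <a href="/foretag/brf-exempel-1" title="Brf Exempel 1">
    <div class="Name">Brf Exempel 1</div>
  </a>
</div>
<div class="result">
  <div class="Name">Brf Utan Lank</div>
</div>
"""


def extract_titles_and_links(html, base_url=BASE_URL):
    """Return (title, absolute_url_or_None) for every div with class "Name"."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for name_div in soup.find_all('div', {'class': 'Name'}):
        title = name_div.get_text(strip=True)
        # Look for the <a> whose title attribute equals the div's text.
        anchor = soup.find('a', {'title': title})
        href = anchor.get('href') if anchor is not None else None
        # urljoin resolves the relative href against the site root.
        results.append((title, urljoin(base_url, href) if href else None))
    return results


for title, link in extract_titles_and_links(SAMPLE_HTML):
    print(title, link)
```

Returning None for a title with no matching link, rather than raising, lets the caller decide whether to skip or report such entries.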
