How can I bring back all items instead of just the first one?

Question

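I am very new to Python and still learning programming in general.

I am trying to scrape the titles and artists from this page: https://www.billboard.com/charts/country-airplay/1990-01-20

and arrange them in a table format.

I have been able to extract the items with bs4/requests using the following: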

for title in soup.find_all('div', attrs={'class':'chart-list-item__title'}):
    print(title.text)

for artist in soup.find_all('div', attrs={'class':'chart-list-item__artist'}):
    print(artist.text)

But when I try to assign the result to a variable, it only brings back the first item.

title1 = title.text
print(title1)

How can I bring back all of the titles?

import requests
r = requests.get('https://www.billboard.com/charts/country-airplay/1990-01-20')

from bs4 import BeautifulSoup
soup = BeautifulSoup(r.text, 'html.parser')

for title in soup.find_all('div', attrs={'class':'chart-list-item__title'}):
    print(title.text)

for artist in soup.find_all('div', attrs={'class':'chart-list-item__artist'}):
    print(artist.text)

title1 = title.text
print(title1)

Tags: python-3.x, web-scraping

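Solution

Loop over the elements with the class chart-list-item, then pick out the fields you want to scrape inside that loop. The following script should print the rank, artist, and title (stored in the variable album) of each entry.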

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.billboard.com/charts/country-airplay/1990-01-20')
soup = BeautifulSoup(r.text, 'html.parser')

for item in soup.find_all(class_="chart-list-item"):
    rank = item.find(class_="chart-list-item__rank").get_text(strip=True)
    artist = item.find(class_="chart-list-item__artist").get_text(strip=True)
    album = item.find(class_="chart-list-item__title-text").get_text(strip=True)
    print(rank,artist,album)

The output looks like this:

1 Clint Black Nobody's Home
2 Tanya Tucker My Arms Stay Open All Night
3 Ricky Van Shelton Statue Of A Fool
4 Alabama Southern Star
5 Keith Whitley It Ain't Nothin'
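
The script above prints each entry but does not keep it. Once a for loop has finished, its loop variable refers to just one element, so reading title.text after the loop can only ever give a single value. A minimal sketch of collecting every entry into a list instead is shown below. It parses a small inline HTML sample rather than the live page; that sample markup is an assumption that only mimics the class names used above, and the names SAMPLE_HTML, last_title, and rows are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup that mimics the class names used above.
# The real page is much larger and its structure may differ.
SAMPLE_HTML = """
<div class="chart-list-item">
  <div class="chart-list-item__rank">1</div>
  <div class="chart-list-item__title"><span class="chart-list-item__title-text">Nobody's Home</span></div>
  <div class="chart-list-item__artist">Clint Black</div>
</div>
<div class="chart-list-item">
  <div class="chart-list-item__rank">2</div>
  <div class="chart-list-item__title"><span class="chart-list-item__title-text">My Arms Stay Open All Night</span></div>
  <div class="chart-list-item__artist">Tanya Tucker</div>
</div>
<div class="chart-list-item">
  <div class="chart-list-item__rank">3</div>
  <div class="chart-list-item__title"><span class="chart-list-item__title-text">Statue Of A Fool</span></div>
  <div class="chart-list-item__artist">Ricky Van Shelton</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# After the loop ends, `title` still refers to the last element only,
# so this assignment captures a single value.
for title in soup.find_all("div", attrs={"class": "chart-list-item__title"}):
    pass
last_title = title.get_text(strip=True)

# Appending inside the loop keeps every entry as a (rank, artist, title) tuple.
rows = []
for item in soup.find_all(class_="chart-list-item"):
    rank = item.find(class_="chart-list-item__rank").get_text(strip=True)
    artist = item.find(class_="chart-list-item__artist").get_text(strip=True)
    title_text = item.find(class_="chart-list-item__title-text").get_text(strip=True)
    rows.append((rank, artist, title_text))

print(last_title)
for row in rows:
    print(row)
```

With the live page, the same loop body would run against the soup built from r.text, provided the page still uses these class names.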

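Since the question asks for a table layout, rows like the ones printed above could be aligned into columns with plain string padding. This is only a sketch using the standard library: the helper format_table and the header labels are hypothetical names, and the rows are copied from the first three lines of the output above.

```python
# Rows copied from the output above: (rank, artist, title).
rows = [
    ("1", "Clint Black", "Nobody's Home"),
    ("2", "Tanya Tucker", "My Arms Stay Open All Night"),
    ("3", "Ricky Van Shelton", "Statue Of A Fool"),
]
header = ("Rank", "Artist", "Title")


def format_table(header, rows):
    """Return the header and rows as text lines with left-aligned columns."""
    table = [header, *rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = []
    for row in table:
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        # Two spaces separate columns; trailing padding is trimmed.
        lines.append("  ".join(padded).rstrip())
    return lines


table_lines = format_table(header, rows)
print("\n".join(table_lines))
```

For anything beyond a quick printout, a library such as pandas would be a common alternative for holding and displaying the rows.
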