python - Beautiful Soup: scraping movie titles and images
Problem description
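I am trying to follow along with a course, but I am stuck on one example because the website's content and tags have changed. In the course, the tag looked different from what the page uses now, and even when I change the class in my code, nothing is returned. I want to scrape the movie titles and images.
import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.empireonline.com/movies/features/best-movies-2')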
best_movies = response.text
soup = BeautifulSoup(best_movies, 'html.parser')
titles = soup.find_all(name='h3', class_='jsx-2692754980')
print(titles)
imgs = soup.find_all(name='img', class_='jsx-4015086601')
print(imgs)
imgs_li = []
for e in imgs:
    link = e.get('src')
    imgs_li.append(link)
print(imgs_li)
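Solution
You may want to use selenium together with BeautifulSoup. The page likely builds its content with JavaScript, which requests does not execute, so the HTML it downloads does not contain those tags; selenium loads the page in a real browser first.
Here's how: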
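from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Run Chrome without a visible window. add_argument is used here because
# the options.headless attribute is deprecated in Selenium 4.
options = Options()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)
driver.get("https://www.empireonline.com/movies/features/best-movies-2")

# Parse the browser-rendered HTML and collect every <img> tag.
images = BeautifulSoup(driver.page_source, "html.parser").find_all("img")
driver.quit()

movies = []
for image in images:
    try:
        # Keep only images that have a non-empty alt text and a data-src attribute.
        if image["alt"]:
            movies.append([image["alt"], f"https:{image['data-src']}"])
    except KeyError:
        continue

# Print every entry except the first one.
for movie in movies[1:]:
    title, link = movie
    print(f"{title}\n{link}\n{'-' * 80}")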
Output:
Stand By Me
https://cdn.onebauer.media/one/media/5e62/24d4/08ba/aa5a/8143/279c/stand-by-me.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Raging Bull
https://cdn.onebauer.media/one/media/5d2d/d990/853e/7cd6/60cc/fa2e/raging-bull.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Amelie
https://cdn.onebauer.media/one/empire-images/features/59395a49f68e659c7aa3a1a8/Amelie.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Leonardo DiCaprio and Kate Winslet in Titanic
https://cdn.onebauer.media/one/lifestyle-images/celebrity/59d4ac2c07c78ace382c4735/kate-winslet-leonardo-dicaprio-titanic.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Good Will Hunting
https://cdn.onebauer.media/one/media/5e62/2a32/2cd5/547b/bf0f/6416/good-will-hunting.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Arrival
https://cdn.onebauer.media/one/media/5e62/2ac7/2eea/4450/3534/4b45/Arrival.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
Lost In Translation
https://cdn.onebauer.media/one/media/5e62/2b5f/232f/f064/694b/c738/lost-in-translation.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
The Princess Bride
https://cdn.onebauer.media/one/media/5e62/2bf3/08ba/aa7b/8f43/27e0/the-princess-bride.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
The Terminator
https://cdn.onebauer.media/one/empire-images/features/59395a49f68e659c7aa3a1a8/The%2520Terminator.jpg?format=jpg&quality=80&width=500&ratio=1-1&resize=aspectfit
--------------------------------------------------------------------------------
and so on ...
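The answer builds each link by prefixing "https:" to the data-src value, which works as long as every value is protocol-relative (starts with "//"), as the output above suggests. As a hedged alternative, the standard library's urljoin resolves the value against the page URL and also copes with values that are already absolute or site-relative. The helper name absolute_image_url and the sample paths below are hypothetical and only illustrate the idea.

```python
from urllib.parse import urljoin

# The page URL is taken from the question; the helper name is hypothetical.
PAGE_URL = "https://www.empireonline.com/movies/features/best-movies-2"


def absolute_image_url(src, page_url=PAGE_URL):
    """Resolve an image reference against the page it was found on."""
    return urljoin(page_url, src)


if __name__ == "__main__":
    # Made-up sample values covering the three common forms of an image reference.
    samples = [
        "//cdn.onebauer.media/one/media/example.jpg",        # protocol-relative
        "https://cdn.onebauer.media/one/media/example.jpg",  # already absolute
        "/images/example.jpg",                               # site-relative
    ]
    for src in samples:
        print(absolute_image_url(src))
```

In the loop from the answer, this would replace the f-string with absolute_image_url(image["data-src"]); the rest of the code stays the same.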