Python BeautifulSoup: accessing a div container

Problem description

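I am trying to use BeautifulSoup to get the container that holds the brand, product name, price, etc. from the product detail page linked below.

According to Chrome's inspector, it is a "div" container with the class "product-detail__info" (see screenshot).

Unfortunately, my code does not work...

I would appreciate it if someone could give me a hint :)

Thanks in advance

Link: https://www.nemlig.com/opvasketabs-all-in-one-5039333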

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "https://www.nemlig.com/opvasketabs-all-in-one-5039333"

# Open the connection and grab the page
uClient = uReq(my_url)
page_html = uClient.read()

# Close the connection
uClient.close()

# Parse the HTML
page_soup = soup(page_html, "html.parser")

# Grab the product detail container
container = page_soup.find_all("div", {"class": "product-detail__info"})

print(container)


Tags: python, web-scraping, beautifulsoup

Solution


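The data you are looking for is part of the page source, embedded as JSON in a script element. The inspector shows the DOM after JavaScript has run, so the div you see there may not exist in the HTML that the request returns.
This code returns the data: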

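import json

import requests
from bs4 import BeautifulSoup as soup

r = requests.get('https://www.nemlig.com/opvasketabs-all-in-one-5039333')
if r.status_code == 200:
    # Use a separate name so the imported "soup" alias is not overwritten
    page_soup = soup(r.text, "html.parser")
    scripts = page_soup.find_all("script")
    # The seventh script held the JSON when this was written; the index may change.
    # [:-1] drops the last character of the script text before parsing.
    data = json.loads(scripts[6].next_element.strip()[:-1])
    print(data)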

Output

[{'@context': 'http://schema.org/', '@type': 'Organization', 'url': 'https://www.nemlig.com/', 'logo': 'https://www.nemlig.com/https://live.nemligstatic.com/s/b1.0.7272.30289/scom/dist/images/logos/nemlig-web-logo_tagline_rgb.svg', 'contactPoint': [{'@type': 'ContactPoint', 'telephone': '+45 70 33 72 33', 'contactType': 'customer service'}], 'sameAs': ['https://www.facebook.com/nemligcom/', 'https://www.instagram.com/nemligcom/', 'https://www.linkedin.com/company/nemlig-com']}, {'@context': 'http://schema.org/', '@type': 'Product', 'name': 'Opvasketabs all in one', 'brand': 'Ecover', 'image': 'https://live.nemligstatic.com/scommerce/images/opvasketabs-all-in-one.jpg?i=ZowWdq-y/5039333', 'description': '25 stk. / zero / Ecover', 'category': 'Maskinopvask', 'url': 'https://www.nemlig.com/opvasketabs-all-in-one-5039333', 'offers': {'@type': 'Offer', 'priceCurrency': 'DKK', 'price': '44.95'}}]
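
Since the position of the script (index 6) can shift whenever the site changes its markup, here is a minimal sketch of a less position-dependent variant. It assumes only what the output above shows: some script element contains a JSON list or object with an entry whose '@type' is 'Product'. The function name find_product and the sample HTML are made up for illustration; they are not part of the original answer or of the real page.

```python
import json

from bs4 import BeautifulSoup


def find_product(html):
    """Return the first JSON entry with '@type' == 'Product' found in any
    <script> element, or None if there is no such entry.

    Hypothetical helper: it scans every script instead of relying on a
    fixed index.
    """
    page_soup = BeautifulSoup(html, "html.parser")
    decoder = json.JSONDecoder()
    for script in page_soup.find_all("script"):
        text = (script.string or "").strip()
        # Skip empty scripts and anything that cannot start a JSON list/object
        if not text or text[0] not in "[{":
            continue
        try:
            # raw_decode parses the leading JSON document and ignores any
            # trailing characters after it
            data, _ = decoder.raw_decode(text)
        except json.JSONDecodeError:
            continue
        entries = data if isinstance(data, list) else [data]
        for entry in entries:
            if isinstance(entry, dict) and entry.get("@type") == "Product":
                return entry
    return None


# Made-up sample that mimics the shape of the output shown above
sample_html = """
<html><body>
<script>var x = 1;</script>
<script>[{"@type": "Organization"},
 {"@type": "Product", "name": "Opvasketabs all in one", "brand": "Ecover",
  "offers": {"priceCurrency": "DKK", "price": "44.95"}}];</script>
</body></html>
"""

product = find_product(sample_html)
print(product["brand"], product["name"], product["offers"]["price"])
```

For the real page you would pass r.text from the request above to find_product. Whether it finds anything depends on the site still embedding the data in this form.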
