ValueError: Unsupported or invalid CSS selector: "div[class='card-content"

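Problem description

I have a script that scrapes data from a website. When I run it in Jupyter Notebook it works and returns the expected results, but when I run it in the Visual Studio Code editor it crashes with the following error: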

 File "E:\anaconda\lib\site-packages\bs4\element.py", line 1426, in select
    'Unsupported or invalid CSS selector: "%s"' % token)
ValueError: Unsupported or invalid CSS selector: "div[class='card-content"

Where is the error, and how do I fix it?

Code:

import time

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    requests.get("https://www.bayt.com/en/international/jobs/executive-chef-jobs/").content,
    "lxml"
)


links = []
for a in soup.select("h2.m0.t-regular a"):
    # Build the absolute URL first so the duplicate check compares like with like.
    url = "https://www.bayt.com" + a['href']
    if url not in links:
        links.append(url)
joineddd = []

for link in links:
    s = BeautifulSoup(requests.get(link).content, "lxml")
    jobdesc=s.select_one("div[class='card-content is-spaced'] p")

    alldt = [dt.text for dt in s.select("div[class='card-content is-spaced'] dt")]
    dt_Job_location =              alldt[0]
    dt_Job_Company_Industry =      alldt[1]
    dt_Job_Company_Type =          alldt[2]
    if len(alldt[3])>0:
        dt_Job_Job_Role =              alldt[3]
    # Check the list entry itself; dt_Job_Employment_Type is not assigned yet at this point.
    elif len(alldt)>4 and len(alldt[4])>0:
        dt_Job_Employment_Type =       alldt[4]
            
    alldt.append("link")
    alldt.append("description")
    
    
    alldd = [dd.text for dd in s.select("div[class='card-content is-spaced'] dd")]
    dd_job_location =             alldd[0]
    dd_job_Company_Industry =     alldd[1]
    dd_job_Company_Type =         alldd[2]
    if len(alldd[3])>0:
        dd_job_Job_Role =             alldd[3]
    # Check the list entry itself; dd_job_Employment_Type is not assigned yet at this point.
    elif len(alldd)>4 and len(alldd[4])>0:
        dd_job_Employment_Type =      alldd[4]
    
    alldd.insert(0,link)
    alldd.insert(1,jobdesc)
    joineddd.append(alldd)
    
    print("-" * 80) 

In Jupyter Notebook, the script runs to completion, printing a line of dashes for each job.

In the editor, the script crashes.

Tags: python, web-scraping, beautifulsoup, css-selectors

Solution
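
The token in the error message, "div[class='card-content", stops at the space inside the attribute value. That suggests the BeautifulSoup installation used by the editor splits the selector on whitespace before parsing it, which older releases did. Releases from 4.7.0 onward delegate select() to the soupsieve package and accept quoted attribute values that contain spaces. If that is the cause, upgrading with pip install --upgrade beautifulsoup4 in the environment the editor uses should help. Alternatively, a selector with no space inside it sidesteps the problem. The following is a minimal sketch of that second option. The HTML is a hypothetical stand-in for a job page, not the real markup, and the names card and details are only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a job page; the real markup on the site may differ.
html = """
<div class="card-content is-spaced">
  <p>Job description text</p>
  <dl>
    <dt>Job Location</dt><dd>Dubai</dd>
    <dt>Company Industry</dt><dd>Catering</dd>
  </dl>
</div>
"""

s = BeautifulSoup(html, "html.parser")

# Class selectors carry no whitespace inside the compound selector,
# so nothing is lost if the selector is split on spaces.
card = "div.card-content.is-spaced"

jobdesc = s.select_one(card + " p")
labels = [dt.get_text(strip=True) for dt in s.select(card + " dt")]
values = [dd.get_text(strip=True) for dd in s.select(card + " dd")]

# Pair each label with its value instead of relying on fixed positions.
details = dict(zip(labels, values))

print(jobdesc.get_text(strip=True))
print(details)
```

Note that div.card-content.is-spaced matches any div carrying both classes, in any order and alongside other classes, whereas div[class='card-content is-spaced'] matches only that exact attribute string. For this page the difference probably does not matter, but it is worth keeping in mind.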

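A script that works in Jupyter Notebook but fails in the editor usually means the two are running different Python interpreters with different package versions. One way to check, sketched here under that assumption, is to run the same few lines in both places and compare the output. The last lines only illustrate what a split on whitespace does to the selector from the question; they do not call into BeautifulSoup's parser.

```python
import sys

import bs4

# Which interpreter is running this code, and which BeautifulSoup it sees.
print(sys.executable)
print(bs4.__version__)

# Illustration only: splitting the selector on whitespace cuts the
# quoted attribute value in half, leaving the token from the traceback.
selector = "div[class='card-content is-spaced'] p"
first_token = selector.split()[0]
print(first_token)
```

If the two environments report different interpreter paths or versions, selecting the same interpreter in the editor, or upgrading the package in the environment the editor uses, is the likely fix.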
