python-3.x - How do I get the text value when web scraping with Selenium?

Problem description

I am scraping a page that contains <span class="product_content_brand"> NikeLab </span>. I am locating the elements with Selenium on Python 3.

from selenium import webdriver
import time  # needed for time.sleep() below


browser= webdriver.Chrome("/home/desarrollo10/Downloads/chromedriver_linux64/chromedriver")

browser.get("https://theurge.com.au/")
C=browser.find_element_by_tag_name("a").click()
time.sleep(0.5)
D=browser.find_element_by_class_name("tag-filters_clearall").click()

S=browser.find_elements_by_class_name("product_content")

for s in S:
    print(s.text)

I want to get the text from the elements with the class "product_content", but I get:

WebDriverException: Message: chrome not reachable (Session info: chrome=71.0.3578.98) (Driver info: chromedriver=2.44.609551 (5d576e9a44fe4c5b6a07e568f1ebc753f1214634), platform=Linux 4.15.0-43-generic x86_64)

Tags: python-3.x, selenium, web-scraping

Try this possible solution, which adds a couple of arguments (no-sandbox, disable-setuid-sandbox) when launching Chrome:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
# Arguments to switch off the setuid sandbox and the sandbox in Chrome
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-setuid-sandbox")

browser= webdriver.Chrome("/home/desarrollo10/Downloads/chromedriver_linux64/chromedriver", chrome_options=chrome_options)
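
The snippet above uses the Selenium 3 call style from the question. As far as I know, recent Selenium 4 releases no longer accept a positional driver path or the chrome_options= keyword, so here is a rough sketch of the same setup for newer versions. The names SANDBOX_ARGS and make_browser are hypothetical, made up for illustration, and I have not run this against the asker's machine.

```python
# Hypothetical helper names; the two flags are the ones suggested above.
SANDBOX_ARGS = ["--no-sandbox", "--disable-setuid-sandbox"]


def make_browser(driver_path, extra_args=SANDBOX_ARGS):
    """Start Chrome with the given command-line arguments (Selenium 4 style)."""
    # Imported inside the function so this file can be loaded
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in extra_args:
        options.add_argument(arg)

    # Selenium 4 takes the chromedriver path through a Service object
    # and the options through the options= keyword.
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

You would call it as make_browser("/home/desarrollo10/Downloads/chromedriver_linux64/chromedriver") and then use the returned browser as before.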

Also:

I don't see class = "tag-filters_clearall" on the page; what I see is class = "tag-filters_clear-all"

So I think you mean:

D=browser.find_element_by_class_name("tag-filters_clear-all").click()

not:

D=browser.find_element_by_class_name("tag-filters_clearall").click()
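
Putting the corrected class name together with the text extraction the question asks for, a minimal sketch might look like the following. It assumes Selenium 4 (where the find_element_by_* helpers are replaced by find_element(By...., ...)), assumes the page still uses the class names mentioned in this thread, and swaps the fixed time.sleep(0.5) for explicit waits, which is my own substitution and not something from the question. collect_texts and scrape_product_texts are hypothetical names.

```python
def collect_texts(elements):
    """Return the stripped, non-empty .text of each element."""
    texts = []
    for element in elements:
        text = (element.text or "").strip()
        if text:
            texts.append(text)
    return texts


def scrape_product_texts(browser, timeout=10):
    """Click the clear-all filter, then read every product_content element."""
    # Imported inside the function so this file can be loaded
    # even on a machine where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(browser, timeout)

    # Wait until the filter link is clickable instead of sleeping a fixed time.
    clear_all = wait.until(
        EC.element_to_be_clickable((By.CLASS_NAME, "tag-filters_clear-all"))
    )
    clear_all.click()

    # Wait until at least one product element is present in the DOM.
    products = wait.until(
        EC.presence_of_all_elements_located((By.CLASS_NAME, "product_content"))
    )
    return collect_texts(products)
```

After browser.get("https://theurge.com.au/") and whatever navigation the page needs, print(scrape_product_texts(browser)) should list the text of each product block, assuming the browser session itself starts correctly.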
