python-3.x - How do I find the elements to scrape on this page?
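Problem description
The site is basically a color swatch, as shown here: https://prnt.sc/sux913
Once on this page, I need to go through every color and have the program report the stock quantity for the size I specify. (Sometimes people want to know the available quantity for every color.) At this point I'm lost, because I can't find the elements I need to reference in my code. If the specified size is 'L', I need to check each color and report the quantity for L. For example: Black L - 9, Navy L - 23, Red L - 334.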
<div class="prod__options">
<span class="attr_text">1.Pick a Color</span>
<p id="prod_color_swatch_area_id" class ="prod_color_swatch_area">
<a id="prod_color_box_Black" href ="javaScript: categoryDisplayJS.displayColorSelected('BK ', 'Black'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Black'); if(submitRequest()){ cursor_wait();
wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Black'});}" title ="Black">
<img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_Black_sw?$productSwatch$" border="1" onclick="setOptions('Black', true)" />
</a>
<a id="prod_color_box_Smoke Gray" href ="javaScript: categoryDisplayJS.displayColorSelected('8Q', 'Smoke Gray'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Smoke Gray'); if(submitRequest()){ cursor_wait();
wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Smoke Gray'});}" title ="Smoke Gray">
<img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_SmokeGray_sw?$productSwatch$" border="1" onclick="setOptions('SmokeGray', true)" />
</a>
<a id="prod_color_box_Charcoal Heather" href ="javaScript: categoryDisplayJS.displayColorSelected('HL', 'Charcoal Heather'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Charcoal Heather'); if(submitRequest()){ cursor_wait();
wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Charcoal Heather'});}" title ="Charcoal Heather">
<img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_CharcoalHeather_sw?$productSwatch$" border="1" onclick="setOptions('CharcoalHeather', true)" />
</a>
<a id="prod_color_box_Navy" href ="javaScript: categoryDisplayJS.displayColorSelected('NY', 'Navy'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Navy'); if(submitRequest()){ cursor_wait();
wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Navy'});}" title ="Navy">
<img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_Navy_sw?$productSwatch$" border="1" onclick="setOptions('Navy', true)" />
The last code I tried is the following:
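import bs4
import requests

# `browser` is created earlier (presumably a Selenium WebDriver already on the product page)
html_content = requests.get(browser.current_url)
print(browser.current_url)  # just to check what the URL is
print(html_content.raise_for_status())  # returns None on success, so this prints None
soup = bs4.BeautifulSoup(html_content.text, 'html.parser')
ele_color_swatch = soup.select('#prod_color_swatch_area_id')  # list of matching elements
print(ele_color_swatch)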
However, this only gave:
https://www... (some long url)
None
[]
Solution
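No answer was recorded for this question, so what follows is only a hedged sketch, not a confirmed fix. The `None` in the output is expected: `raise_for_status()` returns `None` when the request succeeded. The empty list means the HTML that `requests` received contained no element with the id `prod_color_swatch_area_id`. One plausible reason is that `requests.get` issues a fresh HTTP request without the browser's session and without running JavaScript, so it may not see the same markup the browser shows. If `browser` is a Selenium WebDriver (an assumption), `browser.page_source` holds the HTML as the browser rendered it and could be parsed instead. The sketch below uses only the standard library and an abbreviated copy of the markup from the question (the `href` values are shortened) to show how the color names could be collected from the `title` attributes. The class name `SwatchParser` and the function `list_colors` are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# Abbreviated copy of the swatch markup from the question (href values shortened).
SAMPLE_HTML = """
<div class="prod__options">
<span class="attr_text">1.Pick a Color</span>
<p id="prod_color_swatch_area_id" class="prod_color_swatch_area">
<a id="prod_color_box_Black" href="#" title="Black"><img border="1" /></a>
<a id="prod_color_box_Smoke Gray" href="#" title="Smoke Gray"><img border="1" /></a>
<a id="prod_color_box_Navy" href="#" title="Navy"><img border="1" /></a>
</p>
</div>
"""


class SwatchParser(HTMLParser):
    """Collect the title of every <a> whose id starts with prod_color_box_."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        element_id = attr_map.get("id") or ""
        if tag == "a" and element_id.startswith("prod_color_box_"):
            self.colors.append(attr_map.get("title"))


def list_colors(html):
    """Return the color names found in the swatch links, in page order."""
    parser = SwatchParser()
    parser.feed(html)
    parser.close()
    return parser.colors


print(list_colors(SAMPLE_HTML))                   # ['Black', 'Smoke Gray', 'Navy']
print(list_colors("<html><body></body></html>"))  # []
```

With BeautifulSoup, the same links could be selected with `soup.select('#prod_color_swatch_area_id a[id^="prod_color_box_"]')`. The markup in the question does not show where the per-size stock quantity lives, so reading the quantity for size L would still require inspecting the page after each color is selected.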