How do I find the elements to scrape on this page?

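Problem description

The site is basically a color swatch page, as shown here: https://prnt.sc/sux913

Once on this page, I need to go through every color and have the program report the stock quantity for the size I specify. (Sometimes someone wants to know the available quantity for all colors.) At this point I'm stuck, because I can't find the elements I need to reference in my code. If the specified size is 'L', I need to check each color and report the quantity for L. For example: Black L - 9, Navy L - 23, Red L - 334.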

              <div class="prod__options">
                <span class="attr_text">1.Pick a Color</span>
                <p id="prod_color_swatch_area_id" class ="prod_color_swatch_area">

                    <a id="prod_color_box_Black" href ="javaScript: categoryDisplayJS.displayColorSelected('BK                                                                                                                                                                                                                                                            ', 'Black'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Black'); if(submitRequest()){ cursor_wait();
                        wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Black'});}" title ="Black">

                                  <img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_Black_sw?$productSwatch$" border="1" onclick="setOptions('Black', true)" />

                    </a>   



                                                <a id="prod_color_box_Smoke Gray" href ="javaScript: categoryDisplayJS.displayColorSelected('8Q', 'Smoke Gray'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Smoke Gray'); if(submitRequest()){ cursor_wait();
                                                wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Smoke Gray'});}" title ="Smoke Gray">

                                                            <img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_SmokeGray_sw?$productSwatch$" border="1" onclick="setOptions('SmokeGray', true)" />

                                                </a>



                                                <a id="prod_color_box_Charcoal Heather" href ="javaScript: categoryDisplayJS.displayColorSelected('HL', 'Charcoal Heather'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Charcoal Heather'); if(submitRequest()){ cursor_wait();
                                                wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Charcoal Heather'});}" title ="Charcoal Heather">

                                                            <img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_CharcoalHeather_sw?$productSwatch$" border="1" onclick="setOptions('CharcoalHeather', true)" />

                                                </a>



                                                <a id="prod_color_box_Navy" href ="javaScript: categoryDisplayJS.displayColorSelected('NY', 'Navy'); categoryDisplayJS.displayPDPErrorSection(null,null,null); setCurrentId('prod_color_box_Navy'); if(submitRequest()){ cursor_wait();
                                                wc.render.updateContext('ProductPageMatrixDisplay_Context',{'productId':'497940','colorSelected':'Navy'});}" title ="Navy">

                                                            <img src="https://a248.e.akamai.net/f/248/9086/10h/origin-d5.scene7.com/is/image/Hanesbrands/HBI_498P_Navy_sw?$productSwatch$" border="1" onclick="setOptions('Navy', true)" />

The last code I tried is as follows:

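import bs4
import requests

# `browser` is assumed to be a Selenium WebDriver created earlier (not shown).
html_content = requests.get(browser.current_url)  # fresh request, separate from the browser session
print(browser.current_url)  # just to check what the URL is
print(html_content.raise_for_status())  # prints None when the status is OK
soup = bs4.BeautifulSoup(html_content.text, 'html.parser')
ele_color_swatch = soup.select('#prod_color_swatch_area_id')  # list of matching elements
print(ele_color_swatch)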

However, this only gave:

https://www... (some long url)
None
[]
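
One possible reason for the empty list: `requests.get` makes a new HTTP request that carries none of the browser's cookies or session state and runs no JavaScript, so the HTML it receives may not contain `#prod_color_swatch_area_id` at all. Assuming `browser` is a Selenium WebDriver (the snippet does not show how it is created), `browser.page_source` holds the page as the browser currently has it. The sketch below uses only the standard library to check whether a given HTML string contains an element id; `has_element_id` is a hypothetical helper and the two HTML strings are stand-ins, not real responses.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Records whether any start tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def has_element_id(html, wanted_id):
    """Hypothetical helper: True if `html` contains an element with this id."""
    finder = IdFinder(wanted_id)
    finder.feed(html)
    return finder.found


# Stand-in strings; in the real script these would be
# requests.get(url).text and browser.page_source respectively.
fetched_html = "<html><body><p>Please log in</p></body></html>"
rendered_html = '<p id="prod_color_swatch_area_id" class="prod_color_swatch_area"></p>'

print(has_element_id(fetched_html, "prod_color_swatch_area_id"))
print(has_element_id(rendered_html, "prod_color_swatch_area_id"))
```

If the check succeeds on `browser.page_source` but fails on `html_content.text`, passing `browser.page_source` to `bs4.BeautifulSoup(...)` instead would be the thing to try.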

Tags: python-3.x, web-scraping, element

Solution
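
A minimal sketch of one way to list the colors, assuming the page HTML has the shape shown in the question: every swatch is an `<a>` whose `id` starts with `prod_color_box_` and whose `title` holds the color name. It uses the standard-library `html.parser` so that it runs without extra packages; with BeautifulSoup the equivalent selection would be `soup.select('a[id^="prod_color_box_"]')`. The names `SwatchParser` and `list_colors` are hypothetical, and `sample_html` is a shortened stand-in for the real markup. The markup in the question does not show where the per-size quantities appear, so the sketch stops at collecting the colors; each one would then have to be selected in turn to read its stock.

```python
from html.parser import HTMLParser

SWATCH_ID_PREFIX = "prod_color_box_"


class SwatchParser(HTMLParser):
    """Collects the color name of every <a> whose id starts with the swatch prefix."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        element_id = attributes.get("id") or ""
        if element_id.startswith(SWATCH_ID_PREFIX):
            # Prefer the title attribute; fall back to the id suffix.
            name = attributes.get("title") or element_id[len(SWATCH_ID_PREFIX):]
            self.colors.append(name)


def list_colors(html):
    """Return the color names found in the swatch area, in document order."""
    parser = SwatchParser()
    parser.feed(html)
    return parser.colors


# Shortened stand-in for the swatch markup quoted in the question.
sample_html = """
<p id="prod_color_swatch_area_id" class="prod_color_swatch_area">
  <a id="prod_color_box_Black" href="#" title="Black"></a>
  <a id="prod_color_box_Smoke Gray" href="#" title="Smoke Gray"></a>
  <a id="prod_color_box_Navy" href="#" title="Navy"></a>
</p>
"""

print(list_colors(sample_html))
```

Collecting the names first keeps the later loop simple: for each name, the matching swatch can be looked up by its `prod_color_box_` id.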

