Python + selenium download image without extension

Problem description

I'm using Python 3 with Selenium, and I need to download an image.

HTML:

<img id="labelImage" name="labelImage" border="0" width="672" height="456" alt="labelImage" src="/shipping/labelAction.handle?method=doGetLabelFromCache&amp;isDecompressRequired=false&amp;utype=null&amp;cacheKey=774242409034SHIPPING_L">

Python code:

import urllib.request

found = browser.find_element_by_css_selector('img[alt="labelImage"]')
src = found.get_attribute('src')
urllib.request.urlretrieve(src, 'image.png')

The resulting image file is empty. If I change the extension to .html, it shows the message below: "We're sorry, we can't process your request right now. It appears you don't have permission to view this webpage"

Tags: python, image, selenium

Solution


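The error you get when trying to download comes from the fact that the urllib call is a brand-new session to their server: it has none of the cookies and authentication that your browser has. It is the same as opening an incognito window in your browser and pasting the src attribute into the address bar. To the server you are a new client that has not filled in the form, logged in, and so on.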
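To illustrate the point about missing cookies, here is a minimal sketch that forwards the Selenium session's cookies to urllib. It is not part of the original answer and rests on assumptions: that cookies (plus, possibly, the User-Agent) are all the server checks, and the helper names build_cookie_header and download_with_browser_cookies are hypothetical. If the server ties the session to something else, this may still fail.

```python
import urllib.request


def build_cookie_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return '; '.join('{}={}'.format(c['name'], c['value']) for c in cookies)


def download_with_browser_cookies(browser, url, path):
    """Fetch url with the cookies of the live Selenium session and save it to path."""
    request = urllib.request.Request(url, headers={
        # get_cookies() returns a list of dicts with 'name' and 'value' keys
        'Cookie': build_cookie_header(browser.get_cookies()),
        # Assumption: reusing the browser's User-Agent helps if the server checks it
        'User-Agent': browser.execute_script('return navigator.userAgent'),
    })
    with urllib.request.urlopen(request) as response, open(path, 'wb') as out:
        out.write(response.read())
```

With the browser and src from the question, the call would be download_with_browser_cookies(browser, src, 'image.png').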

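You might want to try a different approach: within the selenium/browser session, take a screenshot of just the <img> element. This works with varying degrees of success. Chrome, for example, added support for it only recently, and it fails in some cases: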

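from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# Selenium 4 spelling; older releases used find_element_by_css_selector(...)
found = browser.find_element(By.CSS_SELECTOR, 'img[alt="labelImage"]')
try:
    found.screenshot('element.png')
except WebDriverException as ex:  # assumption: the failure surfaces as a WebDriverException subclass; narrow it once you see the real one
    print('Element screenshot failed: {}'.format(ex))
    # Fall back to a screenshot of the whole page
    browser.get_screenshot_as_file('page.png')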

If taking the element screenshot fails, you get one of the whole page instead, which you can then crop to the element.
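The cropping step could be sketched as follows. This is an illustration under assumptions, not the answer's own code: it uses the third-party Pillow library, assumes the page is not scrolled so the element's coordinates match the screenshot, and takes a scale factor for high-DPI displays where screenshot pixels differ from CSS pixels. The names crop_box and crop_to_element are hypothetical.

```python
def crop_box(rect, scale=1.0):
    """Convert an element rect (x, y, width, height in CSS pixels) into a
    (left, top, right, bottom) box in screenshot pixels."""
    left = rect['x'] * scale
    top = rect['y'] * scale
    right = (rect['x'] + rect['width']) * scale
    bottom = (rect['y'] + rect['height']) * scale
    return tuple(int(round(v)) for v in (left, top, right, bottom))


def crop_to_element(page_png, rect, out_png, scale=1.0):
    """Cut the element's area out of a full-page screenshot file."""
    # Pillow is imported here so crop_box stays usable without it installed
    from PIL import Image
    with Image.open(page_png) as page:
        page.crop(crop_box(rect, scale)).save(out_png)
```

After the fallback screenshot, the call would be crop_to_element('page.png', found.rect, 'element.png'), where found.rect is the element's position and size as reported by Selenium.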

