Image downloads but its byte size is 0, even though the response is response.ok and a 200 code, in Python?

Problem description

I am downloading multiple images at once, and even though each request returns response.ok and a 200 status code, the downloaded files are 0 bytes. My code:

pic_list=['https://i8.amplience.net/i/nlyscandinavia/146368-0014_01/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_02/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_04/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_05/i-straight-crepe-pant/']
for pic_url in pic_list:
    url = str(pic_url).replace(' ', '')
    print('pic_url : ' + str(url))
    folder = full_move_dir + '/' + str(folder_count)
    print('after creating folder' + folder)
    os.makedirs(folder, exist_ok=True)
    try:
        pic_ext=str(pic_url.split('.')[-1])
        final_pic_ext = pic_ext.split('/')[0]
        print(type(final_pic_ext))
        print(final_pic_ext)
        if final_pic_ext != 'jpeg' and final_pic_ext != 'jpg' and final_pic_ext != 'png':
            final_pic_ext = 'jpeg'
            print(final_pic_ext)
        pic_name = str(pic_count) + '-' + str(file_name_for_folder[-1]) + '.' + str(final_pic_ext)
    except Exception as e:
        print(e)
    with open(os.path.join(folder, pic_name), 'wb') as handle:
        if url.find('http') == -1:
            url_h = 'http://' + url
            try:
                headers = {
                    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0'}
                r = requests.get(url_h, headers=headers)
                print(r)
                if not r.ok:
                    print("NO OK res"+str(r))
                else:
                    pic_count = pic_count + 1
                    handle.write(r)
                    sleep(1)
            except Exception as e:
                print('Invalid URL with http')
            finally:
                pass
        else:
            try:
                headers = {
                    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0'}
                r = requests.get(url, headers=headers)
                print(r)
                if not r.ok:
                    print("NO OK res"+str(r))
                else:
                    pic_count = pic_count + 1
                    handle.write(r)
                    sleep(1)
            except Exception as e:
                print('Invalid URL with http')
            finally:
                pass

I added the headers because at first the server was rejecting my requests, which is how I arrived at this version... This code successfully downloads a single photo, but it fails when downloading multiple images..

Tags: python, machine-learning, web-scraping, python-requests

Solution


The variable r is a requests Response object, not the raw bytes. Calling handle.write(r) raises a TypeError, which your except block swallows (printing 'Invalid URL with http'), so the file is created but never written to — that is why it ends up at 0 bytes. The image content is stored in r.content.
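To see the failure mode in isolation, here is a minimal sketch. It uses a hypothetical `FakeResponse` class as a stand-in for `requests.Response` (so no network is needed) and an in-memory `BytesIO` buffer in place of the file opened with `open(..., 'wb')`: binary writes accept only bytes-like objects, and writing any other object raises `TypeError`.

```python
import io

class FakeResponse:
    """Hypothetical stand-in for requests.Response."""
    content = b"\x89PNG fake image bytes"
    ok = True
    status_code = 200

buf = io.BytesIO()  # plays the role of the file opened with open(..., 'wb')

try:
    buf.write(FakeResponse())  # what the question's code does: write the object itself
except TypeError:
    print("TypeError: a bytes-like object is required")

buf.write(FakeResponse().content)  # write the body bytes instead
print(len(buf.getvalue()) > 0)     # the buffer now holds real content
```

Because the question's loop wraps the write in a bare `except Exception`, this `TypeError` is silently reported as "Invalid URL with http" and the empty file is left behind.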

Here is a simple fix for your problem:

import requests
import os

pic_list=['https://i8.amplience.net/i/nlyscandinavia/146368-0014_01/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_02/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_04/i-straight-crepe-pant/', 'https://i8.amplience.net/i/nlyscandinavia/146368-0014_05/i-straight-crepe-pant/']
DIR_TO_SAVE = '.'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0'
}
i=0
for pic_url in pic_list:
    url = pic_url.strip()
    print('pic_url: '+url)
    if url[-1] == '/':
        filename = url.rstrip('/').split('/')[-1]+str(i)+'.jpeg'
        i+=1
    else:
        filename = url.split('/')[-1]
    output = requests.get(url, headers=headers)
    if output.status_code == 200:
        with open(os.path.join(DIR_TO_SAVE, filename), 'wb') as f:
            f.write(output.content)
    else:
        print("Couldn't get file: "+url)
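The filename logic above (append an index and a '.jpeg' extension when the URL ends in a slash, otherwise reuse the last path segment) can be pulled out into a small helper, which makes it easy to test on its own. The function name `filename_for_url` is my own choice, not from the original answer, and the '.jpeg' fallback assumes, as the answer does, that these amplience URLs serve JPEGs:

```python
def filename_for_url(url, index):
    """Derive a local filename from an image URL.

    URLs ending in '/' have no file segment, so fall back to the last
    path component plus an index and a '.jpeg' extension (assuming the
    server returns a JPEG, as in the answer above).
    """
    url = url.strip()
    if url.endswith('/'):
        return url.rstrip('/').split('/')[-1] + str(index) + '.jpeg'
    return url.split('/')[-1]

print(filename_for_url(
    'https://i8.amplience.net/i/nlyscandinavia/146368-0014_01/i-straight-crepe-pant/', 0))
# → i-straight-crepe-pant0.jpeg
```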
