How do I download OpenImages directly to an S3 bucket from a SageMaker notebook?

Problem description

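I am using the openimages API to download the classes listed in labels_array. The following code easily downloads the required classes into a folder on the SageMaker instance.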

from openimages.download import download_dataset

for label in labels_array:
    download_dataset(
        data_location,
        [label],
        annotation_format="pascal",
    )

However, I want to download the data to an S3 bucket instead. I found this example solution:

boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join('billing', 'billing_sm.csv')).upload_file('billing_sm.csv')

I cannot figure out how to use this solution together with the openimages API on SageMaker. Can someone help me understand how to do this?

Tags: amazon-s3, object-detection, amazon-sagemaker

Solution


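Create a Python notebook in a SageMaker notebook instance and use the following code. It downloads the labels you list, chosen from the 600 Open Images labels. As soon as one class has been downloaded, it is copied to the S3 bucket and deleted from the notebook instance, so local disk usage never exceeds a single class.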

import os
import shutil
import subprocess

from openimages.download import download_dataset

# Create the directory on the SageMaker instance where images are downloaded first
os.makedirs("openImages/", exist_ok=True)

# Labels may be given with the first letter capitalized or all lowercase
labels_list = ["football", "ambulance", "ladder", "toothbrush"]

# Download one label at a time, copy it to S3, then free the local disk space
for label in labels_list:
    download_dataset(
        "openImages/",
        # Remove `.capitalize()` if the label already starts with an uppercase letter
        [label.capitalize()],
        annotation_format="pascal",
    )
    # Copy the downloaded files from the SageMaker instance to S3;
    # check=True raises if the copy fails, so nothing is deleted before it is uploaded
    subprocess.run(
        ["aws", "s3", "cp", "--recursive", "openImages/", "s3://open-images/"],
        check=True,
    )
    # Remove the local folder once its files have been copied to S3
    shutil.rmtree("openImages/{}/".format(label))
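If you would rather stay with boto3, as in the snippet from the question, the copy step can also be done without shelling out to the AWS CLI. The following is only a minimal sketch, not part of the openimages package: `iter_upload_pairs` and `upload_directory` are hypothetical helper names, and the sketch assumes the notebook's IAM role is allowed to write to the bucket. It walks the local folder and calls the boto3 S3 client's `upload_file` once per file, keeping the relative folder structure as the object key.

```python
import os


def iter_upload_pairs(local_dir, prefix=""):
    """Yield (local_path, s3_key) for every file under local_dir."""
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            rel_path = os.path.relpath(local_path, local_dir)
            # S3 keys always use forward slashes, whatever the local OS uses
            parts = [prefix.strip("/")] + rel_path.split(os.sep)
            yield local_path, "/".join(p for p in parts if p)


def upload_directory(local_dir, bucket, prefix="", s3_client=None):
    """Upload every file under local_dir to s3://bucket/prefix/ and return the keys."""
    if s3_client is None:
        # Imported lazily so the helper can be exercised without AWS access
        import boto3
        s3_client = boto3.client("s3")
    uploaded_keys = []
    for local_path, key in iter_upload_pairs(local_dir, prefix):
        s3_client.upload_file(local_path, bucket, key)
        uploaded_keys.append(key)
    return uploaded_keys
```

Inside the loop above, the `aws s3 cp` step could then be replaced by `upload_directory("openImages/", "open-images")`, where `open-images` is the bucket name used in this answer. Because `upload_file` raises an exception on failure, the local folder is only removed after every file has been uploaded.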
