AWS Batch - How to access AWS Batch environment variables within Python script running inside Docker container

Problem description

I have a Docker container which executes a Python script as its ENTRYPOINT. This is the Dockerfile:

FROM python:3
ADD script.py / 
EXPOSE 80
RUN pip install boto3
RUN pip install uuid
ENTRYPOINT ["python","./script.py"]

This is the Python script:

import boto3
import time
import uuid
import os

guid = uuid.uuid4()
timestr = time.strftime("%Y%m%d-%H%M%S")
job_index = os.environ['AWS_BATCH_JOB_ARRAY_INDEX']

filename = 'latest_test_' + str(guid) + '_.txt'
with open(filename, 'a+') as f:
    data = job_index
    f.write(data)

client = boto3.client(
    's3',
    # Hard coded strings as credentials, not recommended.
    aws_access_key_id='',
    aws_secret_access_key=''
)
response = client.upload_file(filename, 'api-dev-dpstorage-s3', 'docker_data' + filename + '.txt')
with open('response2.txt', 'a+') as f:
    f.write('all done')
    exit

It is simply designed to create a file, write the job array index into the file and push it to an S3 Bucket. The job array index from AWS Batch is being sourced from one of the pre-defined environment variables. I have uploaded the image to AWS ECR, and have set up an AWS Batch to run a job with an array of 10. This should execute the job 10 times, with my expectation that 10 files are dumped into S3, each containing the array index of the job itself.

If I don't include the environment variable and instead just hard code a value into the text file, the AWS Batch job works. If I include the call to os.environ to get the variable, the job fails with this AWS Batch error:

Status reason: Essential container in task exited

I'm assuming there is an issue with how I'm trying to obtain the environment variable. Does anyone know how I could correctly reference either one of the built in environment variables and/or a custom environment variable defined in the job?

Tags: python, docker, aws-batch

Solution


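AWS Batch exposes Docker's env configuration through the job definition parameters, where you can specify custom variables as name/value pairs:

"environment" : [
    { "name" : "MY_CUSTOM_VAR", "value" : "string" }
]

Here MY_CUSTOM_VAR is only a placeholder name. Names that start with AWS_BATCH are reserved for the variables AWS Batch sets itself, so they should not be listed here. Each entry is turned into a docker env parameter:

$ docker run --env MY_CUSTOM_VAR=string $container $cmd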
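The name/value shape of the "environment" list can be generated from an ordinary dict. The following is only a sketch: the helper name to_batch_environment and the variable names OUTPUT_PREFIX and RETRIES are made up for illustration and are not part of any AWS library. It also rejects names with the AWS_BATCH prefix, which AWS Batch reserves for its own variables.

```python
RESERVED_PREFIX = "AWS_BATCH"  # prefix reserved for variables AWS Batch sets itself


def to_batch_environment(variables):
    """Convert {"NAME": "value"} into a list of {"name": ..., "value": ...} dicts.

    Hypothetical helper for illustration; not part of boto3 or AWS Batch.
    """
    environment = []
    for name, value in variables.items():
        if name.startswith(RESERVED_PREFIX):
            raise ValueError(f"{name!r} uses the reserved {RESERVED_PREFIX} prefix")
        # Environment values are always strings inside the container.
        environment.append({"name": name, "value": str(value)})
    return environment


print(to_batch_environment({"OUTPUT_PREFIX": "docker_data", "RETRIES": 3}))
```

The resulting list could be placed under "environment" in the job definition's container properties, or under the container overrides when a job is submitted.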

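Inside the container, environment variables, including the built-in AWS_BATCH_JOB_ARRAY_INDEX that AWS Batch sets for each child of an array job, can therefore be read like this: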

import os

job_id = os.environ['AWS_BATCH_JOB_ARRAY_INDEX']
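Indexing os.environ directly raises KeyError when the variable is missing, and an uncaught exception makes the container exit with a non-zero status. AWS_BATCH_JOB_ARRAY_INDEX is only set in the child jobs of an array job, so a more defensive read may help with debugging. This is a minimal sketch, not a confirmed diagnosis of the failure in the question. The function name get_array_index and the default of 0 are assumptions.

```python
import os


def get_array_index(env=os.environ, default=0):
    """Return the array index as an int, or `default` if the variable is not set.

    Hypothetical helper; `env` is injectable so it can be tried without AWS Batch.
    """
    raw = env.get("AWS_BATCH_JOB_ARRAY_INDEX")
    if raw is None:
        # Not running as a child of an array job (or not in AWS Batch at all).
        return default
    return int(raw)


print(f"array index: {get_array_index()}")
```

Logging the value, or the fact that it is missing, before doing any other work makes the cause visible in the job's log output instead of only in the status reason.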

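Note, however, that environment variables are passed in plain text, so it is unwise to pass credentials or other sensitive data this way. In that case, you may want to create a compute environment instead.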
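One common way to avoid hard-coded keys, offered here as an assumption rather than as what the answer above intended, is to create the boto3 client without explicit credentials. boto3 then uses its default credential chain, which can pick up an IAM role attached to the job or to the instance. The sketch below shows the idea. The function names build_object_key and upload_result are hypothetical, and the docker_data prefix is borrowed from the question's script.

```python
def build_object_key(filename, prefix="docker_data"):
    """Join the prefix and file name into an S3 object key."""
    return f"{prefix}/{filename}"


def upload_result(filename, bucket, client=None):
    """Upload `filename` to `bucket` and return the object key that was used.

    Hypothetical helper. When no client is given, a boto3 S3 client is created
    without explicit keys, so boto3 resolves credentials on its own.
    """
    if client is None:
        import boto3  # imported lazily so the helper can be tried without boto3

        client = boto3.client("s3")  # no aws_access_key_id / aws_secret_access_key
    key = build_object_key(filename)
    client.upload_file(filename, bucket, key)
    return key
```

For this to work inside AWS Batch, the role that the job runs under would need permission to write to the target bucket.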

