python - AWS Batch - How to access AWS Batch environment variables within python script running inside Docker container
Problem description
I have a Docker container which executes a Python script inside it as the ENTRYPOINT. This is the Dockerfile:
FROM python:3
ADD script.py /
EXPOSE 80
RUN pip install boto3
RUN pip install uuid
ENTRYPOINT ["python","./script.py"]
This is the Python script:
import boto3
import time
import uuid
import os
guid = uuid.uuid4()
timestr = time.strftime("%Y%m%d-%H%M%S")
job_index = os.environ['AWS_BATCH_JOB_ARRAY_INDEX']
filename = 'latest_test_' + str(guid) + '_.txt'
with open(filename, 'a+') as f:
    data = job_index
    f.write(data)
client = boto3.client(
    's3',
    # Hard coded strings as credentials, not recommended.
    aws_access_key_id='',
    aws_secret_access_key=''
)
response = client.upload_file(filename, 'api-dev-dpstorage-s3', 'docker_data' + filename + '.txt')
with open('response2.txt', 'a+') as f:
    f.write('all done')
exit
It is simply designed to create a file, write the job array index into it, and push it to an S3 bucket. The job array index is read from one of the environment variables that AWS Batch pre-defines. I have uploaded the image to AWS ECR and set up AWS Batch to run an array job of size 10. This should execute the job 10 times, and I expect 10 files to land in S3, each containing the array index of its own job.
If I don't include the environment variable and instead just hard code a value into the text file, the AWS Batch job works. If I include the call to os.environ to get the variable, the job fails with this AWS Batch error:
Status reason: Essential container in task exited
I'm assuming there is an issue with how I'm trying to obtain the environment variable. Does anyone know how I could correctly reference one of the built-in environment variables and/or a custom environment variable defined in the job?
Solution
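AWS provides the docker env configuration through the job definition parameters, where you can specify:
"environment" : [
    { "name" : "MY_CUSTOM_VAR", "value" : "string" }
]
This becomes a docker env parameter:
$ docker run --env MY_CUSTOM_VAR=string $container $cmd
It can therefore be accessed with:
import os
my_value = os.environ['MY_CUSTOM_VAR']
Here MY_CUSTOM_VAR is a placeholder name. Variable names beginning with AWS_BATCH are reserved for variables that AWS Batch sets itself, so AWS_BATCH_JOB_ARRAY_INDEX should not be defined in the job definition. Batch sets it only in the child jobs of an array job, and os.environ['AWS_BATCH_JOB_ARRAY_INDEX'] raises a KeyError wherever it is absent.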
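A plain os.environ[...] lookup raises KeyError when the variable is missing, which ends the script and makes the container exit with a non-zero status. The following is a minimal sketch of a more defensive read, assuming that a missing AWS_BATCH_JOB_ARRAY_INDEX simply means the script is not running as a child of an array job. The helper name read_array_index and its default parameter are hypothetical, not part of any AWS library.

```python
import os

# Name of the variable AWS Batch sets in the child jobs of an array job.
ARRAY_INDEX_VAR = "AWS_BATCH_JOB_ARRAY_INDEX"


def read_array_index(environ=os.environ, default=None):
    """Return the array index as an int, or `default` if the variable is unset.

    `environ` is any mapping of variable names to strings; it defaults to the
    real process environment but can be a plain dict, which makes the helper
    easy to exercise outside of AWS Batch.
    """
    raw = environ.get(ARRAY_INDEX_VAR)
    if raw is None:
        return default
    return int(raw)


if __name__ == "__main__":
    index = read_array_index()
    if index is None:
        # Hypothetical handling: report the problem instead of crashing.
        print(f"{ARRAY_INDEX_VAR} is not set; not running as an array child job")
    else:
        print(f"array index: {index}")
```

Running the script locally without the variable prints the diagnostic message instead of failing, which makes it easier to tell a missing variable apart from other errors in the job logs.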
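Be careful, though: passing credentials in plain text through environment variables is unwise. For sensitive data, you may want to create a compute environment instead.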
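To tie the pieces above together, here is a rough sketch of how an array job with a custom environment variable might be submitted with boto3. It is an illustration under assumptions, not the asker's actual setup: the helper names build_submit_job_request and submit are hypothetical, and so are the job name, queue, job definition, and variable name in the usage example. The client is created without hard-coded keys, so boto3 falls back to its default credential chain (for example an IAM role).

```python
def build_submit_job_request(job_name, job_queue, job_definition,
                             array_size, extra_env=None):
    """Build the keyword arguments for the Batch `submit_job` call.

    `extra_env` is an optional dict of custom variables. Names starting with
    AWS_BATCH are reserved by AWS Batch and should not be used here.
    """
    request = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Makes this an array job; each child job receives its own index in
        # the AWS_BATCH_JOB_ARRAY_INDEX environment variable.
        "arrayProperties": {"size": array_size},
    }
    if extra_env:
        # Environment entries are name/value pairs, and values are strings.
        request["containerOverrides"] = {
            "environment": [
                {"name": name, "value": str(value)}
                for name, value in extra_env.items()
            ]
        }
    return request


def submit(request):
    """Send the request to AWS Batch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("batch")  # no keys in code: default credential chain
    return client.submit_job(**request)


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    req = build_submit_job_request(
        "index-writer", "my-queue", "my-job-definition",
        array_size=10, extra_env={"MY_CUSTOM_VAR": "some-value"},
    )
    print(req)
```

Keeping the request builder separate from the network call lets the payload be inspected or tested without touching AWS.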