python - Uploading to S3 with Lambda
Problem description
I created a Lambda function that downloads data from S3, merges it, and then uploads the result back to S3, but I keep getting this error:
{"errorMessage": "2020-05-18T23:23:27.556Z 37233f48-18ea-43eb-9030-3e8a2bf62048 Task timed out after 3.00 seconds"}
When I remove lines 45 through 58, it works fine.
import pandas as pd
import numpy as np
import time
from io import StringIO  # python3; python2: BytesIO
import boto3
import s3fs
from botocore.exceptions import NoCredentialsError

def lambda_handler(event, context):
    # Dataset 1
    # Loading the data
    df1 = pd.read_csv("https://i...content-available-to-author-only...s.com/Minimum+Wage+Data.csv", encoding='unicode_escape')
    # Renaming the columns.
    df1.rename(columns={'High.Value': 'min_wage_by_law', 'Low.Value': 'min_wage_real'}, inplace=True)
    # Removing all unneeded values.
    df1 = df1.drop(['Table_Data', 'Footnote', 'High.2018', 'Low.2018'], axis=1)
    df1 = df1.loc[df1['Year'] > 1969].copy()
    # ---------------------------------
    # Dataset 2
    # Loading from the debt S3 bucket
    df2 = pd.read_csv("https://i...content-available-to-author-only...s.com/USGS_Final_File.csv")
    # Filtering: keeping the range between 1969 and 2018.
    df2 = df2.loc[df2['Year'] > 1969].copy()
    df2 = df2.loc[df2['Year'] < 2018].copy()
    df2.rename(columns={'Real State Growth %': 'Real State Growth', 'Population (million)': 'Population Mil'}, inplace=True)
    # Cleaning the data
    df2['State Debt'] = df2['State Debt'].str.replace(',', '')
    df2['Local Debt'] = df2['Local Debt'].str.replace(',', '')
    df2["State and Local Debt"] = df2["State and Local Debt"].str.replace(',', '')
    df2["Gross State Product"] = df2["Gross State Product"].str.replace(',', '')
    # Cast to floating point
    df2[["State Debt", "Local Debt", "State and Local Debt", "Gross State Product"]] = df2[["State Debt", "Local Debt", "State and Local Debt", "Gross State Product"]].apply(pd.to_numeric)
    # --------------------------------------------
    # Merge the data through an inner join.
    full = pd.merge(df1, df2, on=['State', 'Year'])
    # --------------------------------------------
    filename = '/tmp/'  # specify location of s3:/{my-bucket}/
    file = 'debt_and_wage'  # name of file
    datetime = time.strftime("%Y%m%d%H%M%S")  # timestamp
    filenames3 = "%s%s%s.csv" % (filename, file, datetime)  # name of the filepath and csv file
    full.to_csv(filenames3, header=True)
    # Saving it on AWS
    s3 = boto3.resource('s3', aws_access_key_id='accesskeycantshare', aws_secret_access_key='key')
    s3.meta.client.upload_file(filenames3, 'information-arch', file + datetime + '.csv')
Solution
Your Lambda function's default execution timeout is 3 seconds. Increase it to a value that suits your task:
Timeout – The amount of time that Lambda allows a function to run before stopping it. The default is 3 seconds. The maximum allowed value is 900 seconds.
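You can change the timeout in the Lambda console (Configuration → General configuration → Edit), or with the AWS CLI as sketched below; the function name is a placeholder for your own:

```shell
# Raise the function timeout to 5 minutes (300 seconds).
# "my-merge-function" is a placeholder -- use your actual function name.
aws lambda update-function-configuration \
    --function-name my-merge-function \
    --timeout 300
```

Since the function downloads two CSV files and merges them with pandas, you may also want to raise the memory setting (`--memory-size`); Lambda allocates CPU proportionally to memory, so this can shorten the run as well.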