Adding JSON to DynamoDB using Lambda

Problem description

I am trying to load a large JSON file (more than 8k transactions) with the structure below into DynamoDB using a Lambda function.

{
    "transactions": [
        {
            "customerId": "abc",
            "transactionId": "123",
            "transactionDate": "2020-09-01",
            "merchantId": "1234",
            "categoryId": "3",
            "amount": "5",
            "description": "McDonalds"
        },
        {
            "customerId": "def",
            "transactionId": "456",
            "transactionDate": "2020-09-01",
            "merchantId": "45678",
            "categoryId": "2",
            "amount": "-11.70",
            "description": "Tescos"
        },
        {
            "customerId": "jkl",
            "transactionId": "gah",
            "transactionDate": "2020-09-01",
            "merchantId": "9081",
            "categoryId": "3",
            "amount": "-139.00",
            "description": "Amazon"
        },
    ...

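The Lambda function is triggered when the JSON file is uploaded to an S3 bucket, and should then load the data into DynamoDB automatically. The function currently contains the following code: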

import json
import boto3

s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    json_file_name = event['Records'][0]['s3']['object']['key']
    print(bucket)
    print(json_file_name)
    print(str(event))
    json_object = s3_client.get_object(Bucket=bucket, Key=json_file_name)
    jsonFileReader = json_object['Body'].read()
    jsonDict = json.loads(jsonFileReader)
    table = dynamodb.Table('CustomerEvents')
    table.put_item(Item=jsonDict)
    return 'Hello from Lambda'

This works fine when I upload a single transaction to DynamoDB, that is, when the file is structured like this:

{
    "customerId": "abc",
    "transactionId": "123",
    "transactionDate": "2020-09-01",
    "merchantId": "1234",
    "categoryId": "3",
    "amount": "5",
    "description": "McDonalds"
}

How can I adapt the Lambda function to load all of the transactions (> 8k) into DynamoDB?

Tags: amazon-web-services, amazon-s3, aws-lambda, amazon-redshift

Solution


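You can use batch_writer to write multiple transactions from the file. Your current code passes the whole document to put_item as a single item, so the fix is to iterate over the "transactions" list and put each entry separately.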

An example:

import json
import boto3

s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

table = dynamodb.Table('CustomerEvents')

def lambda_handler(event, context):

    bucket = event['Records'][0]['s3']['bucket']['name']
    json_file_name = event['Records'][0]['s3']['object']['key']

    print(bucket)
    print(json_file_name)
    print(str(event))

    json_object = s3_client.get_object(Bucket=bucket,Key=json_file_name)
    jsonFileReader = json_object['Body'].read()
    jsonDict = json.loads(jsonFileReader)
    
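    # batch_writer buffers the items, sends them to DynamoDB in batches,
    # and automatically resends any unprocessed items.
    with table.batch_writer() as batch:
        for transaction in jsonDict['transactions']:
            print(transaction)
            batch.put_item(Item=transaction)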

    return 'Hello from Lambda'
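
If the same function has to handle both file shapes from the question (one bare transaction, or a list wrapped in "transactions"), the write step could be pulled out into a small helper. The following is only a sketch: iter_s3_objects and write_transactions are hypothetical names that do not appear in the original code, and table is assumed to be the dynamodb.Table('CustomerEvents') resource from above. It also decodes the object key, since S3 event notifications deliver keys URL-encoded.

```python
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (bucket, key) for every record in an S3 event notification.

    Hypothetical helper, not part of the original answer.
    """
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in S3 events (a space becomes '+').
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def write_transactions(table, payload):
    """Write every transaction in payload through table.batch_writer().

    Hypothetical helper. Accepts either the wrapped form
    {"transactions": [...]} or a single transaction dict, and returns
    the number of items handed to the batch writer.
    """
    if isinstance(payload, dict) and "transactions" in payload:
        transactions = payload["transactions"]
    else:
        transactions = [payload]

    with table.batch_writer() as batch:
        for transaction in transactions:
            batch.put_item(Item=transaction)
    return len(transactions)
```

Inside lambda_handler this would be used by looping over iter_s3_objects(event), reading each object with s3_client.get_object as before, and passing the parsed JSON to write_transactions(table, jsonDict). Each transaction still has to contain the table's primary key attributes, which this sketch does not check.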
