Reading data from S3 using Lambda

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/33782984/
Asked by LearningSlowly
I have a range of json files stored in an S3 bucket on AWS.
I wish to use AWS lambda python service to parse this json and send the parsed results to an AWS RDS MySQL database.
I have a stable python script for doing the parsing and writing to the database. I need the lambda script to iterate through the json files (when they are added).
Each json file contains a list, simply consisting of results = [content]
In pseudo-code, what I want is:
- Connect to the S3 bucket (jsondata)
- Read the contents of the JSON file (results)
- Execute my script for this data (results)
I can list the buckets I have by:
import boto3

s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
Giving:
jsondata
But I cannot access this bucket to read its results. There doesn't appear to be a read or load function.
I wish for something like
for bucket in s3.buckets.all():
    print(bucket.contents)
EDIT
I am misunderstanding something. Rather than reading the file in S3, lambda must download it itself.
From here it seems that you must give lambda a download path, from which it can access the files itself:
import uuid
import boto3

s3_client = boto3.client('s3')

# ...function to be executed goes here...

def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        # /tmp is the only writable path in Lambda; a UUID prefix avoids collisions
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
        s3_client.download_file(bucket, key, download_path)
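Once the file is in /tmp, the question's stable parsing script can be run on download_path. A minimal sketch of the read-back step (the helper name load_results is illustrative, not from the original post):

```python
import json

def load_results(path):
    # Each downloaded file holds a JSON list: results = [content]
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)
```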
Accepted answer by Dysosmus
You can use bucket.objects.all() to get a list of all the objects in the bucket (you also have alternative methods like filter, page_size and limit, depending on your need).
These methods return an iterator with S3.ObjectSummary objects in it; from there you can use the method object.get to retrieve the file.
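A minimal sketch putting these methods together. The boto3 calls require AWS credentials, so they are shown as commented usage and only the pure JSON-parsing helper is executable:

```python
import json

def parse_results(raw_bytes):
    # Decode an S3 object body; each file holds a JSON list (results = [content])
    return json.loads(raw_bytes.decode('utf-8'))

# Hypothetical usage with boto3 (bucket name jsondata assumed from the question):
# import boto3
# s3 = boto3.resource('s3')
# for obj_summary in s3.Bucket('jsondata').objects.all():
#     results = parse_results(obj_summary.get()['Body'].read())
```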
Answered by James Hogbin
s3 = boto3.client('s3')
response = s3.get_object(Bucket=bucket, Key=key)
emailcontent = response['Body'].read().decode('utf-8')
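To complete the question's pipeline (parsed JSON into RDS MySQL), here is a hedged sketch. The table schema and the pymysql usage are assumptions, not from the answers, so only the row-shaping helper is executable:

```python
def rows_from_results(results):
    # Shape each parsed entry into an (idx, payload) tuple for executemany;
    # this (idx, payload) schema is hypothetical, not from the question
    return [(i, str(item)) for i, item in enumerate(results)]

# Hypothetical write step (requires pymysql and a reachable RDS instance):
# import pymysql
# conn = pymysql.connect(host=DB_HOST, user=DB_USER, password=DB_PASS, db=DB_NAME)
# with conn.cursor() as cur:
#     cur.executemany("INSERT INTO results (idx, payload) VALUES (%s, %s)",
#                     rows_from_results(results))
# conn.commit()
```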