Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original source. Original StackOverflow question: http://stackoverflow.com/questions/40336918/

Date: 2020-08-19 23:24:32  Source: igfitidea

How to write a file or data to an S3 object using boto3

Tags: python, amazon-web-services, amazon-s3, boto, boto3

Asked by jkdev

In boto 2, you can write to an S3 object using these methods:

Key.set_contents_from_string()
Key.set_contents_from_file()
Key.set_contents_from_filename()
Key.set_contents_from_stream()

Is there a boto 3 equivalent? What is the boto3 method for saving data to an object stored on S3?

Answered by jkdev

In boto 3, the 'Key.set_contents_from_' methods were replaced by Object.put() and Client.put_object().

For example:

import boto3

some_binary_data = b'Here we have some data'
more_binary_data = b'Here we have some more data'

# Method 1: Object.put()
s3 = boto3.resource('s3')
obj = s3.Object('my_bucket_name', 'my/key/including/filename.txt')  # 'obj' avoids shadowing the built-in 'object'
obj.put(Body=some_binary_data)

# Method 2: Client.put_object()
client = boto3.client('s3')
client.put_object(Body=more_binary_data, Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')

Alternatively, the binary data can come from reading a file, as described in the official docs comparing boto 2 and boto 3:

Storing Data

Storing data from a file, stream, or string is easy:

# Boto 2.x
from boto.s3.key import Key
key = Key('hello.txt')
key.set_contents_from_file('/tmp/hello.txt')

# Boto 3
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
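
The one-line Boto 3 form above never closes the file handle explicitly. A minimal sketch of the same upload with the handle closed by a with-block, assuming a hypothetical helper name put_file (not part of boto3) and an S3 resource passed in by the caller, for example boto3.resource('s3'):

```python
def put_file(s3_resource, bucket_name, key, local_path):
    """Upload local_path to s3://bucket_name/key with Object.put().

    put_file is a hypothetical helper name, not part of boto3.
    s3_resource is expected to behave like boto3.resource('s3').
    """
    # Binary mode so the bytes are sent unchanged; the with-block closes
    # the handle once put() has returned.
    with open(local_path, "rb") as fh:
        return s3_resource.Object(bucket_name, key).put(Body=fh)


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   put_file(boto3.resource("s3"), "mybucket", "hello.txt", "/tmp/hello.txt")
```

Passing the resource in as an argument keeps the helper free of credential handling and makes it easy to exercise with a stand-in object.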

Answered by EM Bee

boto3 also has a method for uploading a file directly:

s3.Bucket('bucketname').upload_file('/local/file/here.txt','folder/sub/path/to/s3key')

http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.upload_file
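
upload_file also accepts an optional ExtraArgs dictionary for object settings such as ContentType. A rough sketch, assuming a hypothetical helper name upload_with_content_type and a bucket argument that behaves like boto3.resource('s3').Bucket('bucketname'); the content type is only a guess based on the file extension:

```python
import mimetypes


def upload_with_content_type(bucket, local_path, key):
    """Upload local_path to the bucket under key, setting ContentType when it can be guessed.

    upload_with_content_type is a hypothetical helper name, not part of boto3.
    bucket is expected to behave like boto3.resource('s3').Bucket(name).
    """
    # guess_type() returns (type, encoding); type is None for unknown extensions.
    content_type, _ = mimetypes.guess_type(local_path)
    extra_args = {"ContentType": content_type} if content_type else None
    bucket.upload_file(local_path, key, ExtraArgs=extra_args)
    return extra_args


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("bucketname")
#   upload_with_content_type(bucket, "/local/file/here.txt", "folder/sub/path/to/s3key")
```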

Answered by Franke

You no longer have to convert the contents to binary before writing to the file in S3. The following example creates a new text file (called newfile.txt) in an S3 bucket with string contents:

import boto3

# KEY_ID and ACCESS_KEY are placeholders for your own credentials
s3 = boto3.resource(
    's3',
    region_name='us-east-1',
    aws_access_key_id=KEY_ID,
    aws_secret_access_key=ACCESS_KEY
)
content = "String content to write to a new S3 file"
s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)
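
Passing a str leaves the text-to-bytes conversion to the library. If you would rather make the encoding explicit and record it on the object, something like the following sketch may help; put_text is a hypothetical helper name, and the s3_resource argument is assumed to behave like boto3.resource('s3'):

```python
def put_text(s3_resource, bucket_name, key, text, encoding="utf-8"):
    """Write text to s3://bucket_name/key as bytes in the given encoding.

    put_text is a hypothetical helper name, not part of boto3.
    """
    body = text.encode(encoding)  # explicit str -> bytes conversion
    return s3_resource.Object(bucket_name, key).put(
        Body=body,
        # Record the charset so clients reading the object can decode it.
        ContentType="text/plain; charset=" + encoding,
    )


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   put_text(boto3.resource("s3"), "my-bucket-name", "newfile.txt",
#            "String content to write to a new S3 file")
```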

Answered by Uri Goren

Here's a nice trick to read JSON from s3:

import json, boto3
s3 = boto3.resource("s3").Bucket("bucket")
json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))

Now you can use json.load_s3 and json.dump_s3 with the same API as load and dump:

data = {"test":0}
json.dump_s3(data, "key") # saves json to s3://bucket/key
data = json.load_s3("key") # read json from s3://bucket/key
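
The trick above attaches new attributes to the json module. If you prefer not to patch a standard-library module, the same idea can be sketched as two plain functions; the names dump_json_s3 and load_json_s3 are hypothetical, and the bucket argument is assumed to behave like boto3.resource("s3").Bucket("bucket"):

```python
import json


def dump_json_s3(bucket, key, obj):
    """Serialize obj as JSON and store it under key in the given bucket."""
    bucket.Object(key).put(
        Body=json.dumps(obj).encode("utf-8"),
        ContentType="application/json",
    )


def load_json_s3(bucket, key):
    """Fetch the object stored under key and parse it as JSON."""
    # get()["Body"] is a file-like stream, which json.load() can read directly.
    body = bucket.Object(key).get()["Body"]
    return json.load(body)


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("bucket")
#   dump_json_s3(bucket, "key", {"test": 0})
#   data = load_json_s3(bucket, "key")
```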

Answered by kev

A cleaner, more concise version that I use to upload files on the fly to a given S3 bucket and sub-folder:

import boto3

BUCKET_NAME = 'sample_bucket_name'
PREFIX = 'sub-folder/'

s3 = boto3.resource('s3')

# Creating an empty file called "_DONE" and putting it in the S3 bucket
s3.Object(BUCKET_NAME, PREFIX + '_DONE').put(Body="")

Note: You should ALWAYS put your AWS credentials (aws_access_key_id and aws_secret_access_key) in a separate file, for example ~/.aws/credentials.

Answered by Uri Goren

It is worth mentioning smart-open, which uses boto3 as a back-end.

smart-open is a drop-in replacement for Python's open that can open files from s3, as well as ftp, http and many other protocols.

For example:

from smart_open import open
import json
with open("s3://your_bucket/your_key.json", 'r') as f:
    data = json.load(f)

The AWS credentials are loaded via boto3 credentials, usually a file in the ~/.aws/ directory or an environment variable.

Answered by Prateek Bhuwania

You may use the code below to write, for example, an image to S3 (as of 2019). To be able to connect to S3, you will have to install the AWS CLI using the command pip install awscli, then enter your credentials using the command aws configure:

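import boto3  # missing from the original snippet, but used by the S3 class below
import urllib3
import uuid
from pathlib import Path
from io import BytesIO
from errors import custom_exceptions as cex  # presumably the author's own project module, not a published package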

BUCKET_NAME = "xxx.yyy.zzz"
POSTERS_BASE_PATH = "assets/wallcontent"
CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"


class S3(object):
    def __init__(self):
        self.client = boto3.client('s3')
        self.bucket_name = BUCKET_NAME
        self.posters_base_path = POSTERS_BASE_PATH

    def __download_image(self, url):
        manager = urllib3.PoolManager()
        try:
            res = manager.request('GET', url)
        except Exception:
            print("Could not download the image from URL: ", url)
            raise cex.ImageDownloadFailed
        return BytesIO(res.data)  # any file-like object that implements read()

    def upload_image(self, url):
        try:
            image_file = self.__download_image(url)
        except cex.ImageDownloadFailed:
            raise cex.ImageUploadFailed

        extension = Path(url).suffix
        id = uuid.uuid1().hex + extension
        final_path = self.posters_base_path + "/" + id
        try:
            self.client.upload_fileobj(image_file,
                                       self.bucket_name,
                                       final_path
                                       )
        except Exception:
            print("Image Upload Error for URL: ", url)
            raise cex.ImageUploadFailed

        return CLOUDFRONT_BASE_URL + id