Python 如何使用 boto3 将文件或数据写入 S3 对象
声明:本页面是 StackOverFlow 热门问题的中英对照翻译,遵循 CC BY-SA 4.0 协议。如果您需要使用它,必须同样遵循 CC BY-SA 许可,注明原文地址和作者信息,同时您必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/40336918/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): StackOverFlow
How to write a file or data to an S3 object using boto3
提问 by jkdev
In boto 2, you can write to an S3 object using these methods:
在 boto 2 中,您可以使用以下方法写入 S3 对象:
- Key.set_contents_from_string()
- Key.set_contents_from_file()
- Key.set_contents_from_filename()
- Key.set_contents_from_stream()
Is there a boto 3 equivalent? What is the boto3 method for saving data to an object stored on S3?
是否有 boto 3 等价物?将数据保存到存储在 S3 上的对象的 boto3 方法是什么?
回答 by jkdev
In boto 3, the 'Key.set_contents_from_' methods were replaced by Object.put() and Client.put_object().
在 boto 3 中,'Key.set_contents_from_' 系列方法被 Object.put() 和 Client.put_object() 取代。
For example:
例如:
import boto3
some_binary_data = b'Here we have some data'
more_binary_data = b'Here we have some more data'
# Method 1: Object.put()
s3 = boto3.resource('s3')
object = s3.Object('my_bucket_name', 'my/key/including/filename.txt')
object.put(Body=some_binary_data)
# Method 2: Client.put_object()
client = boto3.client('s3')
client.put_object(Body=more_binary_data, Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')
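If you want a quick way to confirm that the write worked, one option is to read the object straight back. A minimal sketch, continuing from the snippet above and assuming the same bucket, key and valid credentials:
# Hedged sketch: read the object just written back with the client API
response = client.get_object(Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')
print(response['Body'].read())  # should print b'Here we have some more data'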
Alternatively, the binary data can come from reading a file, as described in the official docs comparing boto 2 and boto 3:
或者,二进制数据可以来自读取文件,如比较 boto 2 和 boto 3 的官方文档中所述:
Storing Data
Storing data from a file, stream, or string is easy:
# Boto 2.x
from boto.s3.key import Key
key = Key('hello.txt')
key.set_contents_from_file('/tmp/hello.txt')

# Boto 3
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
存储数据
从文件、流或字符串中存储数据很容易:
# Boto 2.x
from boto.s3.key import Key
key = Key('hello.txt')
key.set_contents_from_file('/tmp/hello.txt')

# Boto 3
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
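Since put() accepts any file-like object opened in binary mode, a small variation on the docs excerpt above (a sketch, assuming /tmp/hello.txt and the bucket 'mybucket' exist) is to use a context manager so the local file handle is closed once the upload finishes:
import boto3

s3 = boto3.resource('s3')
# The with-block closes the file handle after the upload
with open('/tmp/hello.txt', 'rb') as data:
    s3.Object('mybucket', 'hello.txt').put(Body=data)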
回答 by EM Bee
boto3 also has a method for uploading a file directly:
boto3也有直接上传文件的方法:
s3.Bucket('bucketname').upload_file('/local/file/here.txt','folder/sub/path/to/s3key')
http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.upload_file
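upload_file also accepts an optional ExtraArgs dictionary of S3 parameters; a hedged sketch (the bucket name, paths and content type are placeholders carried over from the example above) that sets the content type while uploading:
import boto3

s3 = boto3.resource('s3')
# ExtraArgs passes extra S3 parameters such as ContentType or ACL
s3.Bucket('bucketname').upload_file(
    '/local/file/here.txt',
    'folder/sub/path/to/s3key',
    ExtraArgs={'ContentType': 'text/plain'}
)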
回答 by Franke
You no longer have to convert the contents to binary before writing to the file in S3. The following example creates a new text file (called newfile.txt) in an S3 bucket with string contents:
在写入 S3 中的文件之前,您不再需要先将内容转换为二进制。以下示例在 S3 存储桶中创建一个包含字符串内容的新文本文件(名为 newfile.txt):
import boto3
s3 = boto3.resource(
's3',
region_name='us-east-1',
aws_access_key_id=KEY_ID,
aws_secret_access_key=ACCESS_KEY
)
content="String content to write to a new S3 file"
s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)
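put() returns the service response as a dict, so a quick sanity check is possible; a minimal sketch, reusing the s3 resource and content from the example above:
# Hedged sketch: capture put()'s response and check the HTTP status code
result = s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)
status = result.get('ResponseMetadata', {}).get('HTTPStatusCode')
print("Upload succeeded" if status == 200 else "Unexpected status: {}".format(status))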
回答 by Uri Goren
Here's a nice trick to read JSON from s3:
这是从 s3 读取 JSON 的一个不错的技巧:
import json, boto3
s3 = boto3.resource("s3").Bucket("bucket")
json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))
Now you can use json.load_s3 and json.dump_s3 with the same API as load and dump.
现在你可以像使用 load 和 dump 一样使用 json.load_s3 和 json.dump_s3 了。
data = {"test":0}
json.dump_s3(data, "key") # saves json to s3://bucket/key
data = json.load_s3("key") # read json from s3://bucket/key
回答 by kev
A cleaner and more concise version, which I use to upload files on the fly to a given S3 bucket and sub-folder:
我用来将文件即时上传到给定 S3 存储桶和子文件夹的一个更简洁的版本:
import boto3
BUCKET_NAME = 'sample_bucket_name'
PREFIX = 'sub-folder/'
s3 = boto3.resource('s3')
# Creating an empty file called "_DONE" and putting it in the S3 bucket
s3.Object(BUCKET_NAME, PREFIX + '_DONE').put(Body="")
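If you later need to check whether that marker was actually written, a hedged sketch (reusing the s3 resource and the BUCKET_NAME/PREFIX constants above) could list the keys under the prefix:
# Hedged sketch: look for the "_DONE" marker among the keys under PREFIX
bucket = s3.Bucket(BUCKET_NAME)
done_exists = any(obj.key == PREFIX + '_DONE' for obj in bucket.objects.filter(Prefix=PREFIX))
print(done_exists)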
Note: You should ALWAYS put your AWS credentials (aws_access_key_id and aws_secret_access_key) in a separate file, for example ~/.aws/credentials
注意:您应该始终将您的 AWS 凭证(aws_access_key_id 和 aws_secret_access_key)放在一个单独的文件中,例如 ~/.aws/credentials
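For reference, a minimal sketch of what that credentials file typically contains (placeholder values only, not real keys) and how a named profile from it can be selected explicitly; boto3 also picks up the default profile automatically, so the explicit Session is optional:
# ~/.aws/credentials usually looks like (placeholders, not real keys):
#   [default]
#   aws_access_key_id = YOUR_ACCESS_KEY_ID
#   aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
import boto3

# Assuming a "default" profile exists in that file
session = boto3.Session(profile_name='default')
s3 = session.resource('s3')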
回答 by Uri Goren
It is worth mentioning smart-open, which uses boto3 as a back-end.
值得一提的是 smart-open,它使用 boto3 作为后端。
smart-open is a drop-in replacement for python's open that can open files from s3, as well as ftp, http and many other protocols.
smart-open 可以直接替代 Python 内置的 open,能够打开来自 s3、ftp、http 以及许多其他协议的文件。
For example:
例如:
from smart_open import open
import json
with open("s3://your_bucket/your_key.json", 'r') as f:
    data = json.load(f)
The aws credentials are loaded via boto3 credentials, usually a file in the ~/.aws/ dir or an environment variable.
AWS 凭据通过 boto3 的凭据机制加载,通常来自 ~/.aws/ 目录中的文件或环境变量。
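smart_open can also write: opening an s3:// URI in 'w' mode uploads the data when the block exits. A hedged sketch mirroring the read example above (the bucket and key are placeholders):
from smart_open import open
import json

data = {"test": 0}
# Writing through smart_open uploads to s3://your_bucket/your_key.json on close
with open("s3://your_bucket/your_key.json", 'w') as f:
    json.dump(data, f)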
回答 by Prateek Bhuwania
You may use the code below to write, for example, an image to S3 in 2019. To be able to connect to S3 you will have to install the AWS CLI using the command pip install awscli, then enter a few credentials using the command aws configure:
您可以使用以下代码进行写入,例如在 2019 年将一张图片写入 S3。为了能够连接到 S3,您必须使用命令 pip install awscli 安装 AWS CLI,然后使用命令 aws configure 输入一些凭证:
import boto3
import urllib3
import uuid
from pathlib import Path
from io import BytesIO

from errors import custom_exceptions as cex

BUCKET_NAME = "xxx.yyy.zzz"
POSTERS_BASE_PATH = "assets/wallcontent"
CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"


class S3(object):
    def __init__(self):
        self.client = boto3.client('s3')
        self.bucket_name = BUCKET_NAME
        self.posters_base_path = POSTERS_BASE_PATH

    def __download_image(self, url):
        # Download the image over HTTP into memory
        manager = urllib3.PoolManager()
        try:
            res = manager.request('GET', url)
        except Exception:
            print("Could not download the image from URL: ", url)
            raise cex.ImageDownloadFailed
        return BytesIO(res.data)  # any file-like object that implements read()

    def upload_image(self, url):
        try:
            image_file = self.__download_image(url)
        except cex.ImageDownloadFailed:
            raise cex.ImageUploadFailed

        # Build a unique key for the object, keeping the original file extension
        extension = Path(url).suffix
        id = uuid.uuid1().hex + extension
        final_path = self.posters_base_path + "/" + id
        try:
            # Upload the in-memory file object to the bucket
            self.client.upload_fileobj(image_file,
                                       self.bucket_name,
                                       final_path)
        except Exception:
            print("Image Upload Error for URL: ", url)
            raise cex.ImageUploadFailed

        return CLOUDFRONT_BASE_URL + id
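Usage would then be along these lines (the URL is just an illustrative placeholder, and errors.custom_exceptions is the author's own module):
# Hypothetical usage of the helper class above
s3_uploader = S3()
cdn_url = s3_uploader.upload_image("https://example.com/some-poster.jpg")
print(cdn_url)  # e.g. https://xxx.cloudfront.net/<uuid>.jpg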