
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow, original source: http://stackoverflow.com/questions/4993439/


How can I access s3 files in Python using urls?

python, amazon-s3, cloud

Asked by Nate Reed

I want to write a Python script that will read and write files from s3 using their URLs, e.g. 's3:/mybucket/file'. It would need to run locally and in the cloud without any code changes. Is there a way to do this?


Edit: There are some good suggestions here but what I really want is something that allows me to do this:


 myfile = open("s3://mybucket/file", "r")

and then use that file object like any other file object. That would be really cool. I might just write something like this for myself if it doesn't exist. I could build that abstraction layer on simples3 or boto.

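Such a layer is not hard to sketch on top of boto3 (a rough, read-only sketch; `parse_s3_url` and `s3_open` are made-up names, and boto3 is assumed to be installed and configured):

```python
import io

def parse_s3_url(url):
    """Split 's3://bucket/key' into (bucket, key)."""
    if not url.startswith("s3://"):
        raise ValueError("not an s3:// url: %r" % url)
    bucket, _, key = url[len("s3://"):].partition("/")
    return bucket, key

def s3_open(url, mode="r"):
    """Return a file-like object for an s3:// url (read-only sketch)."""
    import boto3  # imported lazily so parse_s3_url works without boto3
    bucket, key = parse_s3_url(url)
    data = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return io.BytesIO(data) if "b" in mode else io.StringIO(data.decode())
```

With that in place, `myfile = s3_open("s3://mybucket/file", "r")` behaves like any other readable file object, though writing would still need a separate code path through `put_object`.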

Answered by David Wolever

I haven't seen something that would work directly with S3 urls, but you could use an S3 access library (simples3 looks decent) and some simple string manipulation:


>>> url = "s3:/bucket/path/"
>>> _, path = url.split(":", 1)
>>> path = path.lstrip("/")
>>> bucket, path = path.split("/", 1)
>>> print(bucket)
bucket
>>> print(path)
path/
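The standard library can do the same split; `urlparse` understands the two-slash `s3://bucket/path` form directly (a small sketch, nothing here is simples3-specific):

```python
from urllib.parse import urlparse

# urlparse treats the part after '//' as the network location,
# which for an s3:// url is the bucket name.
parsed = urlparse("s3://bucket/path/")
bucket = parsed.netloc          # 'bucket'
path = parsed.path.lstrip("/")  # 'path/'
```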

Answered by Anto Binish Kaspar

You can use the Boto Python API for accessing S3 from Python. It's a good library. After you install Boto, the following sample program will work for you:


>>> from boto.s3.connection import S3Connection
>>> from boto.s3.key import Key
>>> b = S3Connection(AWS_KEY, AWS_SECRET).get_bucket('yourbucket')
>>> k = Key(b)
>>> k.key = 'yourfile'
>>> k.set_contents_from_filename('yourfile.txt')

You can find more information here http://boto.cloudhackers.com/s3_tut.html#storing-data


Answered by Joe Drumgoole

http://s3tools.org/s3cmd works pretty well and supports the s3:// form of the URL structure you want. It does the business on Linux and Windows. If you need a native API to call from within a Python program then http://code.google.com/p/boto/ is a better choice.


Answered by Skylar Saveland

For opening, it should be as simple as:


import urllib.request
myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
myfile = urllib.request.urlopen(myurl)

This will work with s3 if the file is public.


To write a file using boto, it goes a little something like this:


from boto.s3.connection import S3Connection
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET)
destination = bucket.new_key()
destination.name = filename
destination.set_contents_from_file(myfile)
destination.make_public()

lemme know if this works for you :)


Answered by gene_wood

Here's how they do it in awscli:


def find_bucket_key(s3_path):
    """
    This is a helper function that given an s3 path such that the path is of
    the form: bucket/key
    It will return the bucket and the key represented by the s3 path
    """
    s3_components = s3_path.split('/')
    bucket = s3_components[0]
    s3_key = ""
    if len(s3_components) > 1:
        s3_key = '/'.join(s3_components[1:])
    return bucket, s3_key


def split_s3_bucket_key(s3_path):
    """Split s3 path into bucket and key prefix.
    This will also handle the s3:// prefix.
    :return: Tuple of ('bucketname', 'keyname')
    """
    if s3_path.startswith('s3://'):
        s3_path = s3_path[5:]
    return find_bucket_key(s3_path)

Which you could just use with code like this:


from awscli.customizations.s3.utils import split_s3_bucket_key
import boto3
client = boto3.client('s3')
bucket_name, key_name = split_s3_bucket_key(
    's3://example-bucket-name/path/to/example.txt')
response = client.get_object(Bucket=bucket_name, Key=key_name)

This doesn't address the goal of interacting with an s3 key as a file-like object, but it's a step in that direction.

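One small step further: once the object's bytes are in hand, the standard library can wrap them in a real file object (a sketch; `body_as_text_file` is a made-up helper name):

```python
import io

def body_as_text_file(raw_bytes, encoding="utf-8"):
    """Wrap raw object bytes in a file-like object with readline/iteration."""
    return io.TextIOWrapper(io.BytesIO(raw_bytes), encoding=encoding)

# With boto3 this would be fed from a real object (network call, not run here):
#   raw = client.get_object(Bucket=bucket_name, Key=key_name)['Body'].read()
f = body_as_text_file(b"first line\nsecond line\n")
print(f.readline())  # first line
```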

Answered by Guilherme Freitas

Try s3fs


First example on the docs:


>>> import s3fs
>>> fs = s3fs.S3FileSystem(anon=True)
>>> fs.ls('my-bucket')
['my-file.txt']
>>> with fs.open('my-bucket/my-file.txt', 'rb') as f:
...     print(f.read())
b'Hello, world'