Boto3 S3, sort bucket by last modified
Original question: http://stackoverflow.com/questions/44574548/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by nate
I need to fetch a list of items from S3 using Boto3, but instead of the default sort order (descending) I want the results returned in reverse order.
I know you can do it via awscli:
aws s3api list-objects --bucket mybucketfoo --query "reverse(sort_by(Contents,&LastModified))"
and it's doable via the UI console (not sure if this is done client side or server side).
I can't see how to do this in Boto3.
I am currently fetching all the files, and then sorting...but that seems overkill, especially if I only care about the 10 or so most recent files.
The filter system seems to only accept the Prefix for s3, nothing else.
Accepted answer by nate
I did a small variation of what @helloV posted below. It's not 100% optimal, but it gets the job done given the limitations boto3 has as of this time.
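import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')

# Collect every object summary; the listing itself is not ordered by date.
unsorted = []
for file in my_bucket.objects.filter():
    unsorted.append(file)

# Sort newest first, then keep the keys of the 10 most recent objects.
files = [obj.key for obj in sorted(unsorted, key=lambda obj: obj.last_modified,
                                   reverse=True)][0:10]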
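Since only the 10 or so newest files matter here, a full sort is not strictly required. The sketch below is one possible alternative, not part of the original answer: it uses heapq.nlargest from the standard library to keep just the newest n entries. The helper name newest_keys and the FakeObject stand-in are hypothetical and exist only so the example runs without AWS access; in real use you would pass something like my_bucket.objects.all(), whose items expose key and last_modified. Every object still has to be listed first.

```python
import heapq
from collections import namedtuple
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for boto3's ObjectSummary (which has .key and .last_modified).
FakeObject = namedtuple("FakeObject", ["key", "last_modified"])


def newest_keys(objects, n=10):
    """Return the keys of the n most recently modified objects, newest first."""
    newest = heapq.nlargest(n, objects, key=lambda obj: obj.last_modified)
    return [obj.key for obj in newest]


# Invented sample data: file-0 is the oldest, file-24 the newest.
start = datetime(2017, 6, 15, tzinfo=timezone.utc)
sample = [FakeObject(f"file-{i}", start + timedelta(hours=i)) for i in range(25)]

top_three = newest_keys(sample, n=3)
print(top_three)
```

This avoids sorting the whole list when n is small, though the time spent listing the bucket will usually dominate either way.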
Answer by helloV
If there are not many objects in the bucket, you can use Python to sort it to your needs.
Define a lambda to get the last modified time:
# strftime('%s') is a non-portable platform extension; timestamp() is the portable equivalent.
get_last_modified = lambda obj: int(obj['LastModified'].timestamp())
Get the objects and sort them by last modified time. Note that a single list_objects_v2 call returns at most 1,000 objects.
import boto3

s3 = boto3.client('s3')
objs = s3.list_objects_v2(Bucket='my_bucket')['Contents']
[obj['Key'] for obj in sorted(objs, key=get_last_modified)]
If you want to reverse the sort:
[obj['Key'] for obj in sorted(objs, key=get_last_modified, reverse=True)]
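The LastModified values that boto3 returns are datetime objects, which compare directly, so converting them inside the key function is optional. A minimal sketch on hand-made data shaped like the 'Contents' list follows; the sample keys and dates are invented for illustration, and keys_by_last_modified is a hypothetical helper name, not something from boto3.

```python
from datetime import datetime, timezone
from operator import itemgetter


def keys_by_last_modified(contents, newest_first=False):
    """Sort entries shaped like list_objects_v2()['Contents'] and return their keys."""
    ordered = sorted(contents, key=itemgetter("LastModified"), reverse=newest_first)
    return [entry["Key"] for entry in ordered]


# Invented sample entries; a real response carries more fields per entry.
contents = [
    {"Key": "b.txt", "LastModified": datetime(2017, 6, 16, tzinfo=timezone.utc)},
    {"Key": "a.txt", "LastModified": datetime(2017, 6, 17, tzinfo=timezone.utc)},
    {"Key": "c.txt", "LastModified": datetime(2017, 6, 15, tzinfo=timezone.utc)},
]

oldest_first = keys_by_last_modified(contents)
newest_first = keys_by_last_modified(contents, newest_first=True)
```

With a real response, reading the list as resp.get('Contents', []) avoids a KeyError when the bucket is empty, since the 'Contents' key is then absent.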
Answer by Juan Diego Garcia
It seems there is no way to do the sort using boto3 itself. According to the documentation, boto3 only supports these methods for collections:
all(), filter(**kwargs), page_size(**kwargs), limit(**kwargs)
Hope this helps in some way. https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.ServiceResource.buckets
Answer by weegolo
A simpler approach, using the Python 3 sorted() function:
import boto3
s3 = boto3.resource('s3')
myBucket = s3.Bucket('name')
def obj_last_modified(myobj):
    return myobj.last_modified
sortedObjects = sorted(myBucket.objects.all(), key=obj_last_modified, reverse=True)
You now have a reverse-sorted list, ordered by the 'last_modified' attribute of each Object.
Answer by zalmane
A slight improvement on the above:
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')
files = my_bucket.objects.filter()
files = [obj.key for obj in sorted(files, key=lambda x: x.last_modified,
                                   reverse=True)]
Answer by Israelsofer
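import boto3

s3 = boto3.client('s3')
keys = []
kwargs = {'Bucket': 'my_bucket'}
while True:
    resp = s3.list_objects_v2(**kwargs)
    # 'Contents' is absent when the bucket is empty.
    for obj in resp.get('Contents', []):
        keys.append(obj['Key'])
    try:
        # Keep paging until S3 stops returning a continuation token.
        kwargs['ContinuationToken'] = resp['NextContinuationToken']
    except KeyError:
        break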
This will get you all the keys across every page, in the order S3 lists them (sorted by key name, not by last modified time).
Answer by Nelson
import boto3

s3 = boto3.client('s3')
# Zero-padded YYYYmmddHHMMSS, so the integers order chronologically.
get_last_modified = lambda obj: int(obj['LastModified'].strftime('%Y%m%d%H%M%S'))

def sortFindLatest(bucket_name):
    resp = s3.list_objects(Bucket=bucket_name)
    if 'Contents' in resp:
        objs = resp['Contents']
        # Oldest first; the last element is the most recent object.
        files = sorted(objs, key=get_last_modified)
        for key in files:
            file = key['Key']
            cx = s3.get_object(Bucket=bucket_name, Key=file)
This works for me to sort by date and time. I am using Python 3 on AWS Lambda. Your mileage may vary. It can be optimized; I purposely made it discrete. As mentioned in an earlier post, 'reverse=True' can be added to change the sort order.