Find the oldest file (recursively) in a directory
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/837606/
Asked by Rowan Parker
I'm writing a Python backup script and I need to find the oldest file in a directory (and its sub-directories). I also need to filter it down to *.avi files only.
The script will always be running on a Linux machine. Is there some way to do it in Python or would running some shell commands be better?
At the moment I'm running df to get the free space on a particular partition, and if there is less than 5 gigabytes free, I want to start deleting the oldest *.avi files until that condition is met.
Answer by tzot
Hm. Nadia's answer is closer to what you meant to ask; however, for finding the (single) oldest file in a tree, try this:
import os

def oldest_file_in_tree(rootfolder, extension=".avi"):
    # Raises ValueError if no file under rootfolder matches.
    return min(
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)
With a little modification, you can get the n oldest files (similar to Nadia's answer):
import os, heapq

def oldest_files_in_tree(rootfolder, count=1, extension=".avi"):
    return heapq.nsmallest(count,
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)
Note that because the .endswith method accepts a tuple of suffixes as well as a single string, you can make calls such as:
    oldest_files_in_tree("/home/user", 20, (".avi", ".mov"))
to select more than one extension.
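As a small illustration of why the tuple form works, here is a minimal sketch of the extension test on its own. The helper name has_extension and the sample file names are made up for this example. Note also that str.endswith is case-sensitive, so the sketch lowercases the name first; that case-folding is an assumption added here, not something the answer's code does.

```python
def has_extension(filename, extensions=(".avi", ".mov")):
    """Return True if filename ends with any of the given extensions.

    `extensions` may be a single string or a tuple of strings, exactly
    as str.endswith allows. Lowercasing the name is an added assumption
    so that "CLIP.AVI" matches too.
    """
    return filename.lower().endswith(extensions)

# Hypothetical file names, only for illustration.
names = ["holiday.avi", "CLIP.AVI", "notes.txt", "trailer.mov"]
matching = [name for name in names if has_extension(name)]
```

If the case-insensitive behaviour is not wanted, dropping the .lower() call restores the exact test used in the answer.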
Finally, should you want the complete list of files, ordered by modification time, in order to delete as many as required to free space, here's some code:
import os

def files_to_delete(rootfolder, extension=".avi"):
    # Newest first, so the oldest file is at the end of the list.
    return sorted(
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime,
        reverse=True)
Note that reverse=True puts the oldest files at the end of the list, so to get the next file to delete, you just call file_list.pop().
By the way, for a complete solution to your issue: since you are running on Linux, where os.statvfs is available, you can do:
import os

def free_space_up_to(free_bytes_required, rootfolder, extension=".avi"):
    file_list = files_to_delete(rootfolder, extension)
    while file_list:
        statv = os.statvfs(rootfolder)
        if statv.f_bfree * statv.f_bsize >= free_bytes_required:
            break
        os.remove(file_list.pop())
statvfs.f_bfree is the number of free blocks on the device and statvfs.f_bsize is the block size. We take the statvfs of rootfolder, so mind any symbolic links pointing to other devices, where we could delete many files without actually freeing up space on this device.
UPDATE (copying a comment by Juan):
Depending on the OS and filesystem implementation, you may want to multiply f_bfree by f_frsize rather than f_bsize. In some implementations, the latter is the preferred I/O request size. For example, on a FreeBSD 9 system I just tested, f_frsize was 4096 and f_bsize was 16384. POSIX says the block count fields are "in units of f_frsize" (see http://pubs.opengroup.org/onlinepubs/9699919799//basedefs/sys_statvfs.h.html).
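Following the comment above, a minimal sketch of a free-space helper that uses f_frsize as the unit might look like this. The function name free_bytes is hypothetical, and the choice of f_bavail (blocks available to unprivileged users) instead of f_bfree is an assumption that the script does not run as root.

```python
import os

def free_bytes(path):
    """Return the free space, in bytes, on the filesystem containing path.

    Block counts are multiplied by f_frsize, the unit POSIX specifies.
    f_bavail is used instead of f_bfree on the assumption that the
    caller is an unprivileged user; swap it back if that is not true.
    """
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize
```

On Unix, shutil.disk_usage(path).free in the standard library reports a comparable figure and may be simpler if no other statvfs fields are needed.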
Answer by dF.
To do it in Python, you can use os.walk(path) to iterate recursively over the files, and the st_size and st_mtime attributes of os.stat(filename) to get the file sizes and modification times.
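The combination described above can be sketched as a small generator. The name iter_file_info is made up for this example, and skipping entries that fail os.stat (for instance broken symbolic links) is a design choice assumed here rather than something the answer specifies.

```python
import os

def iter_file_info(rootfolder):
    """Yield (path, size_in_bytes, mtime) for each non-directory entry under rootfolder."""
    for dirpath, dirnames, filenames in os.walk(rootfolder):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.stat(path)
            except OSError:
                # The file vanished or is a broken symlink; skip it.
                continue
            yield path, info.st_size, info.st_mtime
```

Sorting the yielded tuples by their third element gives the files from oldest to newest, and the sizes can be summed to estimate how much space a deletion would free.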
Answer by Nadia Alramli
You can use the stat and fnmatch modules together to find the files.
ST_MTIME refers to the last modification time. You can choose another value if you want.
import os, stat, fnmatch

file_list = []
for filename in os.listdir('.'):
    if fnmatch.fnmatch(filename, '*.avi'):
        file_list.append((os.stat(filename)[stat.ST_MTIME], filename))
Then you can order the list by time and delete according to it.
file_list.sort(key=lambda a: a[0])
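The snippet above only looks at the current directory, while the question asks for sub-directories as well. A minimal sketch of a recursive variant, assuming os.walk is combined with fnmatch.filter, could look like this; the function name matching_files_by_age is hypothetical.

```python
import fnmatch
import os

def matching_files_by_age(rootfolder, pattern="*.avi"):
    """Return a list of (mtime, path) tuples for matching files, oldest first."""
    file_list = []
    for dirpath, dirnames, filenames in os.walk(rootfolder):
        # fnmatch.filter keeps only the names that match the glob pattern.
        for filename in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, filename)
            file_list.append((os.stat(path).st_mtime, path))
    # Tuples sort by mtime first, then by path for equal timestamps.
    file_list.sort()
    return file_list
```

The first element of the returned list is then the oldest matching file anywhere under rootfolder.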
Answer by John T
I think the easiest way to do this would be to use find along with ls -t (sort files by time).
Something along these lines should do the trick (it deletes the oldest avi file under the specified directory):
find / -name "*.avi" | xargs ls -t | tail -n 1 | xargs rm
Step by step:
find / -name "*.avi" - find all avi files recursively, starting at the root directory
xargs ls -t - sort all the files found by modification time, from newest to oldest
tail -n 1 - grab the last file in the list (the oldest)
xargs rm - and remove it
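One caveat with the pipeline above is that xargs splits its input on whitespace by default, so file names containing spaces can break it, and with very many files xargs may run ls more than once, which defeats the sort. For comparison, here is a rough Python sketch of the same "find, sort by time, take the oldest" idea using pathlib; the function name oldest_match is made up for this example, and it only returns the path rather than deleting anything.

```python
from pathlib import Path

def oldest_match(rootfolder, pattern="*.avi"):
    """Return the Path of the oldest file matching pattern, or None if there is none."""
    # rglob walks rootfolder recursively; is_file() filters out directories.
    candidates = (p for p in Path(rootfolder).rglob(pattern) if p.is_file())
    return min(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

To mirror the final rm step, the caller can check the result for None and then call .unlink() on it.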
Answer by tom10
Here's another Python formulation, which is a bit old-school compared to some of the others, but is easy to modify, and handles the case of no matching files without raising an exception.
import os

def find_oldest_file(dirname="..", extension=".avi"):
    oldest_file, oldest_time = None, None
    for dirpath, dirs, files in os.walk(dirname):
        for filename in files:
            if not filename.endswith(extension):
                continue
            file_path = os.path.join(dirpath, filename)
            file_time = os.stat(file_path).st_mtime
            # Test for None first: comparing a float with None raises TypeError in Python 3.
            if oldest_time is None or file_time < oldest_time:
                oldest_file, oldest_time = file_path, file_time
    return oldest_file, oldest_time

print(find_oldest_file())
Answer by Michael Haren
Check out the Linux command find.
Alternatively, this post pipes together ls and tail to delete the oldest file in a directory. That could be done in a loop while there isn't enough free space.
For reference, here's the shell code that does it (follow the link for more alternatives and a discussion):
ls -t -r -1 /path/to/files | head --lines 1 | xargs rm
Answer by Parappa
The os module provides the functions you need to get directory listings and file info in Python. I've found os.walk to be especially useful for walking directories recursively, and os.stat will give you detailed info (including modification time) on each entry.
You may be able to do this more easily with a simple shell command. Whether that works better for you depends on what you want to do with the results.