Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/39327032/

Date: 2020-08-19 22:08:32  Source: igfitidea

How to get the latest file in a folder using python

Tags: python, python-3.x, python-2.7

Asked by garlapak

I need to get the latest file of a folder using python. While using the code:

max(files, key = os.path.getctime)

I am getting the below error:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'

Answered by Marlon Abeykoon

Whatever is assigned to the files variable is incorrect. Use the following code.

import glob
import os

# '*' matches every entry; use e.g. '*.csv' to restrict the search to one format
list_of_files = glob.glob('/path/to/folder/*')
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
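
Two caveats apply to the snippet above: max() raises ValueError when the pattern matches nothing, and '*' also matches sub-directories. The following is a minimal sketch that guards against both. The helper name latest_file and its pattern parameter are illustrative assumptions, not part of the original answer.

```python
import glob
import os


def latest_file(folder, pattern="*"):
    """Return the most recently created/changed regular file, or None."""
    candidates = [
        path for path in glob.glob(os.path.join(folder, pattern))
        if os.path.isfile(path)  # skip sub-directories matched by the pattern
    ]
    if not candidates:
        return None  # nothing matched, so avoid max() raising ValueError
    return max(candidates, key=os.path.getctime)
```

Calling latest_file('/path/to/folder', '*.csv') would then return either a full path or None, which the caller has to check for.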

Answered by glglgl

max(files, key = os.path.getctime)

is quite incomplete code. What is files? It probably is a list of file names, coming out of os.listdir().

But this list contains only the filename parts (a.k.a. "basenames"), because their directory is common to all of them. To use a name correctly, you have to combine it with the path leading to it (the same path that was used to obtain the list).

Such as (untested):

import os

def newest(path):
    # os.listdir() returns bare names, so join each one back onto path
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)
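
To see why the bare names fail, the sketch below builds a throwaway folder with tempfile (an assumption made only for the demonstration) and checks a name from os.listdir() against the current working directory. The file name newest_demo_file.txt is hypothetical.

```python
import os
import tempfile


def newest(path):
    """Return the full path of the most recently created/changed entry."""
    paths = [os.path.join(path, name) for name in os.listdir(path)]
    return max(paths, key=os.path.getctime)


with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "newest_demo_file.txt"), "w") as handle:
        handle.write("data")

    names = os.listdir(folder)
    print(names)  # bare names, without the folder part

    # A bare name is resolved against the current working directory, so it
    # usually does not exist there; this is what triggers FileNotFoundError.
    print(os.path.exists(names[0]))

    # Joining the folder back on gives a path that can be stat'ed.
    print(newest(folder))
```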

Answered by BreakBadSP

I would suggest using glob.iglob() instead of glob.glob(), as it is more memory-efficient.

glob.iglob() returns an iterator which yields the same values as glob() without actually storing them all simultaneously.

This means glob.iglob() needs less memory on large folders.

I mostly use the code below to find the latest file matching my pattern:

LatestFile = max(glob.iglob(fileNamePattern),key=os.path.getctime)

NOTE: max() has two call forms. To find the latest file we use this one: max(iterable, *[, key, default])

It takes an iterable as its first argument, so whatever you pass first must be iterable. To find the maximum of several numbers, use the other form: max (num1, num2, num3, *args[, key])
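
The two call forms can be sketched side by side. The folder name no_such_folder_for_demo is a hypothetical placeholder chosen so that the pattern matches nothing; the default keyword exists since Python 3.4.

```python
import glob
import os

# Form 1: a single iterable. `default` is returned when the iterable is
# empty, instead of max() raising ValueError.
pattern = os.path.join("no_such_folder_for_demo", "*.csv")  # hypothetical
latest = max(glob.iglob(pattern), key=os.path.getctime, default=None)
print(latest)  # None, assuming the folder really does not exist

# Form 2: two or more positional arguments, compared with each other.
print(max(3, 7, 5))                           # 7
print(max("pear", "fig", "banana", key=len))  # banana
```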

Answered by turkus

Try sorting the items by creation time. The example below sorts the files in a folder, newest first, and takes the first element, which is the latest file.

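import glob
import os

folder = '/path/to/folder'  # placeholder: the folder to search
files_path = os.path.join(folder, '*')
# Sort newest first, so files[0] is the latest file
files = sorted(glob.iglob(files_path), key=os.path.getctime, reverse=True)
print(files[0])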

Answered by crlf

I lack the reputation to comment, but ctime from Marlon Abeykoon's response did not give the correct result for me. Using mtime does the trick though (key=os.path.getmtime).

import glob
import os

# '*' matches every entry; use e.g. '*.csv' to restrict the search to one format
list_of_files = glob.glob('/path/to/folder/*')
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)

I found two answers for that problem:

python os.path.getctime max does not return latest
Difference between python - getmtime() and getctime() in unix system
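
The difference matters because on Unix os.path.getctime() reports the time of the last metadata change rather than the creation time, while on Windows it reports the creation time. The sketch below is a hedged illustration: it uses a temporary folder and os.utime() (both assumptions for the demo) to back-date one file, then picks the latest file by modification time.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "old.txt")
    new_path = os.path.join(folder, "new.txt")
    for path in (new_path, old_path):  # old.txt is created last
        with open(path, "w") as handle:
            handle.write("x")

    # Back-date the modification time of old.txt by one hour.
    an_hour_ago = time.time() - 3600
    os.utime(old_path, (an_hour_ago, an_hour_ago))

    paths = [old_path, new_path]
    by_mtime = max(paths, key=os.path.getmtime)
    print(os.path.basename(by_mtime))  # new.txt

    # Print both timestamps to compare them on your own platform.
    for path in paths:
        print(os.path.basename(path),
              os.path.getctime(path), os.path.getmtime(path))
```

On Unix the os.utime() call itself updates the ctime of old.txt, so selecting by getctime there can pick a file whose content is older.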

Answered by ic_fl2

A much faster method on Windows (0.05s) is to call a bat script that does this:

get_latest.bat

@echo off
rem Files only (/a-d), bare names (/b), oldest first by creation time (/od /t:c)
for /f "delims=" %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
echo %LAST%

where \\directory\in\question is the directory you want to investigate.

get_latest.py

from subprocess import Popen, PIPE

# Only stdout is piped, so stderr comes back as None
p = Popen("get_latest.bat", shell=True, stdout=PIPE)
stdout, stderr = p.communicate()
print(stdout, stderr)

If it finds a file, stdout holds the file name (as bytes) and stderr is None.

Use stdout.decode("utf-8").rstrip() to get a usable string representation of the file name.
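
Where spawning a batch script is not an option, a pure-Python alternative can be sketched with os.scandir(), which yields directory entries together with stat information. This is a different technique from the answer's batch approach, and no timing is claimed for it. The helper name latest_by_scandir is hypothetical, and it ranks by modification time rather than the creation time selected by /t:c.

```python
import os


def latest_by_scandir(folder):
    """Return the name of the most recently modified file, or None."""
    with os.scandir(folder) as entries:
        files = [
            (entry.stat().st_mtime, entry.name)
            for entry in entries
            if entry.is_file()  # mirrors /a-d: ignore directories
        ]
    # Tuples compare by timestamp first, so max() picks the newest entry.
    return max(files)[1] if files else None
```

Like the batch script, this returns a bare file name, so it still has to be joined with the folder before the file is opened.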

Answered by AlexFink

I tried the suggestions above and my program crashed. I then figured out that the file I was trying to identify was in use, and the call to os.path.getctime crashed on it. What finally worked for me was:

import glob
import os

files_before = glob.glob(os.path.join(my_path, '*'))
# ... code where the new file is created ...
new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path, '*'))))

This code gets the entries that differ between the two file listings. It is not the most elegant approach, and if multiple files are created at the same time it will probably not be reliable.
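
A runnable sketch of the same before/after idea follows. It swaps symmetric_difference for a plain set difference, so that files deleted in the meantime are not reported as new. The snapshot helper, the temporary folder, and the file name report.txt are assumptions made for the demonstration.

```python
import glob
import os
import tempfile


def snapshot(folder):
    """Return the set of paths currently present in folder."""
    return set(glob.glob(os.path.join(folder, "*")))


with tempfile.TemporaryDirectory() as my_path:
    files_before = snapshot(my_path)

    # Stand-in for the code that creates the new file.
    with open(os.path.join(my_path, "report.txt"), "w") as handle:
        handle.write("data")

    # Paths present now but not before; this may hold several entries.
    new_files = snapshot(my_path) - files_before
    print(sorted(os.path.basename(p) for p in new_files))  # ['report.txt']
```

Because the result is a set, the caller still has to decide what to do when it contains zero or more than one path.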

Answered by Naeem Ul Wahhab

(Edited to improve answer)

First define a function get_latest_file:

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)  # unpack the extra path parts
    ...
get_latest_file('example', 'files', 'randomtext011.*.txt')

You may also add a docstring:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

If you use Python 3, you can use iglob instead.

Complete code to return the name of the latest file:

import glob
import os

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # You may use iglob in Python 3
    if not files:                # an empty list is falsy, so this
        return None              # returns early when nothing matches
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename
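
For Python 3 code, the same lookup can also be sketched with pathlib. This is an alternative rather than the answer's own method: the name latest_path is hypothetical, and it ranks by modification time and returns a Path object instead of a bare file name.

```python
from pathlib import Path


def latest_path(folder, pattern="*"):
    """Return the most recently modified file matching pattern, or None."""
    # Path.glob() is lazy, so the generator never builds a full list.
    files = (p for p in Path(folder).glob(pattern) if p.is_file())
    # default=None covers the case where nothing matches.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

A call such as latest_path('example/files', 'randomtext011.*.txt') would mirror the usage shown earlier; use .name on the result to get just the file name.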