Original question: http://stackoverflow.com/questions/8990598/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python FTP get the most recent file by date
Asked by krisdigitx
I am using ftplib to connect to an FTP site. I want to get the most recently uploaded file and download it. I can connect to the FTP server and list the files, and I have put them in a list and converted the date field. Is there a function or module that can find the most recent date and output the whole line from the list?
#!/usr/bin/env python
import ftplib
import os
import socket
import sys
import time

HOST = 'test'

def main():
    try:
        f = ftplib.FTP(HOST)
    except (socket.error, socket.gaierror) as e:
        print('cannot reach %s' % HOST)
        return
    print("Connect to ftp server")

    try:
        f.login('anonymous', '[email protected]')
    except ftplib.error_perm:
        print('cannot login anonymously')
        f.quit()
        return
    print("logged on to the ftp server")

    data = []
    f.dir(data.append)
    for line in data:
        datestr = ' '.join(line.split()[0:2])
        # %I (12-hour clock) so that %p (AM/PM) is taken into account
        orig_date = time.strptime(datestr, '%d-%m-%y %I:%M%p')

    f.quit()
    return

if __name__ == '__main__':
    main()
RESOLVED:
    # (continues inside main(), after the login step above)
    data = []
    f.dir(data.append)
    datelist = []
    filelist = []
    for line in data:
        col = line.split()
        datestr = ' '.join(col[0:2])
        # %I (12-hour clock) so that %p (AM/PM) is taken into account
        date = time.strptime(datestr, '%m-%d-%y %I:%M%p')
        datelist.append(date)
        filelist.append(col[3])

    combo = zip(datelist, filelist)
    who = dict(combo)

    for key in sorted(who.keys(), reverse=True):
        print("%s: %s" % (key, who[key]))
        filename = who[key]
        print("file to download is %s" % filename)
        try:
            f.retrbinary('RETR %s' % filename, open(filename, 'wb').write)
        except ftplib.error_perm:
            print("Error: cannot read file %s" % filename)
            os.unlink(filename)
        else:
            print("***Downloaded*** %s " % filename)
        return  # leave after the first (newest) entry

    f.quit()
    return
One remaining question: is it possible to retrieve just the first element from the dictionary? What I did here is let the for loop run only once and then exit, which gives me the first sorted value. That works, but I don't think it is good practice.
Accepted answer by Paulo
With NLST, as in Martin Prikryl's answer, you can use the built-in sorted function:
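from ftplib import FTP

ftp = FTP(host="127.0.0.1", user="u", passwd="p")
ftp.cwd("/data")
# MDTM replies look like "213 YYYYMMDDHHMMSS", so they sort chronologically as strings
file_name = sorted(ftp.nlst(), key=lambda x: ftp.voidcmd(f"MDTM {x}"))[-1]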
Answer by Rumple Stiltskin
If you have all the dates as time.struct_time values (strptime will give you this) in a list, then all you have to do is sort the list.
Here's an example:
#!/usr/bin/python
import time

dates = [
    "Jan 16 18:35 2012",
    "Aug 16 21:14 2012",
    "Dec 05 22:27 2012",
    "Jan 22 19:42 2012",
    "Jan 24 00:49 2012",
    "Dec 15 22:41 2012",
    "Dec 13 01:41 2012",
    "Dec 24 01:23 2012",
    "Jan 21 00:35 2012",
    "Jan 16 18:35 2012",
]

def main():
    datelist = []
    for date in dates:
        date = time.strptime(date, '%b %d %H:%M %Y')
        datelist.append(date)
    print(datelist)
    datelist.sort()
    print(datelist)

if __name__ == '__main__':
    main()
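To tie the sorted dates back to file names, and to avoid looping over a sorted dictionary just to take its first key, one option is to keep (date, name) pairs and take the maximum. This is only a minimal sketch: the helper name `latest_entry` and the sample file names are made up for illustration.

```python
import time

def latest_entry(entries, fmt='%b %d %H:%M %Y'):
    """Return the (date_string, name) pair with the most recent date.

    entries: iterable of (date_string, name) pairs (hypothetical input shape).
    """
    # struct_time compares like a tuple (year, month, day, ...),
    # so max() with a parsing key finds the newest entry without a full sort.
    return max(entries, key=lambda entry: time.strptime(entry[0], fmt))

# Made-up sample data
listing = [
    ("Jan 16 18:35 2012", "a.zip"),
    ("Dec 24 01:23 2012", "b.zip"),
    ("Aug 16 21:14 2012", "c.zip"),
]
print(latest_entry(listing)[1])  # b.zip
```

Unlike a dictionary keyed by date, a list of pairs does not silently drop a file when two files share the same timestamp.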
Answer by Arthur Accioly
I don't know how your FTP server is set up, but your example did not work for me. I changed some lines in the date-sorting part:
import sys
import ftplib
from ftplib import FTP
import os
import socket
import time

# Connects to the ftp (ftpHost, yourUserName and yourPassword are placeholders)
ftp = FTP(ftpHost)
ftp.login(yourUserName, yourPassword)

data = []
datelist = []
filelist = []
ftp.dir(data.append)

for line in data:
    col = line.split()
    datestr = ' '.join(col[5:8])
    # Note: entries that show a year instead of HH:MM will not match this format
    date = time.strptime(datestr, '%b %d %H:%M')
    datelist.append(date)
    filelist.append(col[8])

combo = zip(datelist, filelist)
who = dict(combo)

for key in sorted(who.keys(), reverse=True):
    print("%s: %s" % (key, who[key]))
    filename = who[key]
    print("file to download is %s" % filename)
    try:
        ftp.retrbinary('RETR %s' % filename, open(filename, 'wb').write)
    except ftplib.error_perm:
        print("Error: cannot read file %s" % filename)
        os.unlink(filename)
    else:
        print("***Downloaded*** %s " % filename)
    break  # stop after the newest file, as in the question's version

ftp.quit()
Answer by Santi Oliveras
Why don't you use the following dir option?
ftp.dir('-t',data.append)
With this option the file listing is time ordered from newest to oldest. Then just retrieve the first file in the list to download it.
Answer by Martin Prikryl
For those looking for a full solution for finding the latest file in a folder:
MLSD
If your FTP server supports the MLSD command, the solution is easy:
entries = list(ftp.mlsd())
entries.sort(key = lambda entry: entry[1]['modify'], reverse = True)
latest_name = entries[0][0]
print(latest_name)
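The sort above works because the `modify` fact is a timestamp string of the form YYYYMMDDHHMMSS, which orders chronologically when compared as text. A folder listing can also contain directories, so a slightly more defensive variant might filter on the `type` fact first. The following is a sketch under that assumption; the function name `latest_file` and the sample entries are hypothetical, and the entries only imitate the (name, facts) pairs that `ftp.mlsd()` yields.

```python
from datetime import datetime

def latest_file(entries):
    """Pick the newest regular file from MLSD-style (name, facts) pairs."""
    files = [
        (name, facts) for name, facts in entries
        # 'type' separates files from 'dir', 'cdir' and 'pdir' entries
        if facts.get('type', '').lower() == 'file' and 'modify' in facts
    ]
    if not files:
        return None
    # "YYYYMMDDHHMMSS[.sss]" strings already sort chronologically
    name, facts = max(files, key=lambda item: item[1]['modify'])
    # Drop optional fractional seconds before converting to a datetime
    modified = datetime.strptime(facts['modify'][:14], '%Y%m%d%H%M%S')
    return name, modified

# Made-up entries in the shape yielded by ftp.mlsd()
sample = [
    ("archive", {"type": "dir", "modify": "20190101000000"}),
    ("a.zip", {"type": "file", "modify": "20180327104500"}),
    ("b.zip", {"type": "file", "modify": "20180618153100"}),
]
print(latest_file(sample))
```

With a live connection the call would be `latest_file(ftp.mlsd())`.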
LIST
If you need to rely on the obsolete LIST command, you have to parse the proprietary listing it returns.
A common *nix listing looks like:
-rw-r--r-- 1 user group 4467 Mar 27 2018 file1.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip
With a listing like this, this code will do:
from dateutil import parser

# ...

lines = []
ftp.dir("", lines.append)

latest_time = None
latest_name = None

for line in lines:
    # maxsplit=8 keeps a file name that contains spaces in one piece
    tokens = line.split(maxsplit=8)
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    if (latest_time is None) or (time > latest_time):
        latest_name = tokens[8]
        latest_time = time

print(latest_name)
This is a rather fragile approach.
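Part of the fragility comes from the date column: older files show a year ("Mar 27 2018") while recent ones show a time ("Jun 18 15:31") with no year at all. The sketch below handles both forms with the standard library only. It assumes the nine-column *nix layout shown above and English month abbreviations; the function name `parse_unix_listing_line` and the `now` parameter are made up for illustration.

```python
from datetime import datetime

def parse_unix_listing_line(line, now=None):
    """Return (name, modified) for one ls-style line, or None if it does not fit."""
    now = now or datetime.now()
    tokens = line.split(maxsplit=8)
    if len(tokens) < 9:
        return None  # e.g. a "total 12" header line
    month, day, year_or_time, name = tokens[5], tokens[6], tokens[7], tokens[8]
    try:
        if ':' in year_or_time:
            # Recent files show HH:MM instead of the year: assume the current
            # year, and fall back to the previous one if that is in the future.
            modified = datetime.strptime(
                f"{month} {day} {now.year} {year_or_time}", '%b %d %Y %H:%M')
            if modified > now:
                modified = modified.replace(year=now.year - 1)
        else:
            modified = datetime.strptime(
                f"{month} {day} {year_or_time}", '%b %d %Y')
    except ValueError:
        return None  # unexpected date format
    return name, modified

lines = [
    "-rw-r--r-- 1 user group 4467 Mar 27 2018 file1.zip",
    "-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip",
]
now = datetime(2018, 7, 1)  # fixed reference date so the example is repeatable
parsed = [p for p in (parse_unix_listing_line(l, now) for l in lines) if p]
print(max(parsed, key=lambda item: item[1])[0])  # file2.zip
```

Listings in other layouts (for example DOS-style listings) would need a different parser, which is why MLSD is preferable where the server supports it.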
MDTM
A more reliable, but far less efficient, approach is to use the MDTM command to retrieve the timestamps of individual files/folders:
names = ftp.nlst()

latest_time = None
latest_name = None

for name in names:
    time = ftp.voidcmd("MDTM " + name)
    if (latest_time is None) or (time > latest_time):
        latest_name = name
        latest_time = time

print(latest_name)
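The loop above compares the raw reply strings, which works because a successful MDTM reply has the form "213 YYYYMMDDHHMMSS". If an actual timestamp is needed (for logging, or to compare with a local file), the reply can be converted. A small sketch follows; the helper name `parse_mdtm_reply` is made up for illustration.

```python
from datetime import datetime

def parse_mdtm_reply(reply):
    """Convert an MDTM reply such as '213 20180327104500' to a datetime."""
    code, _, stamp = reply.partition(' ')
    if code != '213':
        raise ValueError(f"unexpected MDTM reply: {reply!r}")
    # Ignore optional fractional seconds ("YYYYMMDDHHMMSS.sss")
    return datetime.strptime(stamp.strip()[:14], '%Y%m%d%H%M%S')

print(parse_mdtm_reply('213 20180327104500'))  # 2018-03-27 10:45:00
```

With a live connection this would be used as `parse_mdtm_reply(ftp.voidcmd("MDTM " + name))`. Note that `voidcmd` raises `ftplib.error_perm` when the server rejects the command, which some servers do for directories, so the loop may need a try/except around that call.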
For an alternative version of the code, see the answer by @Paulo.
Non-standard -t switch
Some FTP servers support a proprietary non-standard -t switch for the NLST (or LIST) command.
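lines = ftp.nlst("-t")
# ls-style -t sorts newest first, so the latest file is normally the first entry;
# check the ordering your server actually returns
latest_name = lines[0]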
See How to get files in FTP folder sorted by modification time.
Downloading found file
No matter what approach you use, once you have latest_name, you download it like any other file:
# The with block closes the local file even if the transfer fails
with open(latest_name, 'wb') as file:
    ftp.retrbinary('RETR ' + latest_name, file.write)
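If a failed transfer should not leave a truncated file behind, one possible pattern is to download to a temporary name and rename it only on success. This is a sketch, not part of the original answer: the function name `download`, the `dest_dir` parameter and the `.part` suffix are all assumptions. It only relies on the object passed in having a `retrbinary` method, as `ftplib.FTP` does.

```python
import os

def download(ftp, name, dest_dir='.'):
    """Download `name` via ftp.retrbinary into dest_dir and return the local path."""
    local_path = os.path.join(dest_dir, os.path.basename(name))
    partial_path = local_path + '.part'  # hypothetical temporary suffix
    try:
        with open(partial_path, 'wb') as fh:
            ftp.retrbinary('RETR ' + name, fh.write)
    except Exception:
        # Do not leave a truncated file behind if the transfer fails
        if os.path.exists(partial_path):
            os.remove(partial_path)
        raise
    # Replace any existing file of that name in a single step
    os.replace(partial_path, local_path)
    return local_path
```

With a connected `ftplib.FTP` object this would be called as `download(ftp, latest_name)`.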