Python: How to Copy Files Fast

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/22078621/

Time: 2020-08-19 00:11:44  Source: igfitidea

Tags: python, shutil, file-copying

Asked by alphanumeric

Copying files with shutil.copyfile() takes at least 3 times longer than a regular right-click-copy > right-click-paste in Windows File Explorer or Mac's Finder. Is there a faster alternative to shutil.copyfile() in Python? What could be done to speed up the file copying process? (The destination is on a network drive, if that makes any difference.)

EDITED LATER:

Here is what I have ended up with:

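import subprocess
import sys

def copyWithSubprocess(cmd):
    # Start the native copy tool and wait for it to finish.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.communicate()

# source and dest are the paths to copy from and to (defined elsewhere).
win=mac=False
if sys.platform.startswith("darwin"):mac=True
elif sys.platform.startswith("win"):win=True

cmd=None
if mac: cmd=['cp', source, dest]
elif win: cmd=['xcopy', source, dest, '/K/O/X']

if cmd: copyWithSubprocess(cmd)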

Accepted answer by Dmytro

The fastest version I got, without over-optimizing, is the following code:

import os

class CTError(Exception):
    def __init__(self, errors):
        self.errors = errors

# os.O_BINARY exists only on Windows; use 0 (no extra flag) elsewhere.
try:
    O_BINARY = os.O_BINARY
except AttributeError:
    O_BINARY = 0
READ_FLAGS = os.O_RDONLY | O_BINARY
WRITE_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY
BUFFER_SIZE = 128*1024

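def copyfile(src, dst):
    fin = fout = None
    try:
        fin = os.open(src, READ_FLAGS)
        stat = os.fstat(fin)
        fout = os.open(dst, WRITE_FLAGS, stat.st_mode)
        # os.read returns bytes, so the end-of-file sentinel must be b"".
        for x in iter(lambda: os.read(fin, BUFFER_SIZE), b""):
            os.write(fout, x)
    finally:
        # Close only the descriptors that were actually opened.
        if fin is not None: os.close(fin)
        if fout is not None: os.close(fout)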

def copytree(src, dst, symlinks=False, ignore=()):
    names = os.listdir(src)

    if not os.path.exists(dst):
        os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignore:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                copyfile(srcname, dstname)
            # XXX What about devices, sockets etc.?
        except OSError as why:
            errors.append((srcname, dstname, str(why)))
        except CTError as err:
            errors.extend(err.errors)
    if errors:
        raise CTError(errors)

This code runs slightly slower than the native Linux "cp -rf".

Compared to shutil, the gain is around 2x-3x for local storage to tmpfs, and around 6x for NFS to local storage.
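
To check figures like these on your own storage, a small timing harness helps. The following is only a minimal sketch: time_copy and make_test_file are hypothetical helper names, and the 16 MiB file size is an arbitrary assumption. Note that the OS page cache can make repeated runs look faster than the first one.

```python
import os
import shutil
import tempfile
import time

def time_copy(copy_func, src, dst):
    """Return the seconds that copy_func(src, dst) takes (hypothetical helper)."""
    start = time.perf_counter()
    copy_func(src, dst)
    return time.perf_counter() - start

def make_test_file(path, size_bytes):
    """Write size_bytes of random data so the copy has real work to do."""
    with open(path, "wb") as f:
        f.write(os.urandom(size_bytes))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        make_test_file(src, 16 * 1024 * 1024)  # 16 MiB, an arbitrary size
        elapsed = time_copy(shutil.copyfile, src, dst)
        print(f"shutil.copyfile: {elapsed:.3f} s")
```

Any other copy function that takes (src, dst) can be passed to time_copy in the same way, which makes it easy to compare candidates on the same file.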

After profiling, I noticed that shutil.copy does lots of fstat syscalls, which are pretty heavyweight. If one wants to optimize further, I would suggest doing a single fstat for src and reusing the values. Honestly, I didn't go further, as I got almost the same figures as the native Linux copy tool, and optimizing for several hundreds of milliseconds wasn't my goal.
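
The "single fstat, reuse the values" idea could look roughly like the sketch below. This is not the answerer's code: copy_with_single_fstat is a hypothetical name, and carrying over the permission bits and timestamps is an assumption about which values are worth reusing.

```python
import os
import stat

BUFFER_SIZE = 128 * 1024  # same chunk size as the code above
O_BINARY = getattr(os, "O_BINARY", 0)  # only defined on Windows

def copy_with_single_fstat(src, dst):
    """Copy src to dst, calling fstat on the source exactly once (sketch)."""
    fin = os.open(src, os.O_RDONLY | O_BINARY)
    try:
        st = os.fstat(fin)  # the only stat call made on the source
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY
        fout = os.open(dst, flags, st.st_mode)
        try:
            while True:
                chunk = os.read(fin, BUFFER_SIZE)
                if not chunk:
                    break
                # os.write may write fewer bytes than asked, so loop.
                view = memoryview(chunk)
                while len(view) > 0:
                    written = os.write(fout, view)
                    view = view[written:]
        finally:
            os.close(fout)
    finally:
        os.close(fin)
    # Reuse the cached stat result instead of stat-ing the source again.
    os.chmod(dst, stat.S_IMODE(st.st_mode))
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
    return st.st_size
```

The function returns the source size taken from the same fstat result, so a caller that wants to report progress does not need another stat call either.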

Answer by Joran Beasley

This is just a guess, but you may be timing it wrong. When you copy the file, it opens the file and reads it all into memory, so that when you paste you only create a file and dump the memory contents.

In Python,

copied_file = open("some_file", "rb").read()

is the equivalent of the Ctrl+C copy,

then

with open("new_file","wb") as f:
     f.write(copied_file)

is the equivalent of the Ctrl+V paste (so time that part for a fair comparison).

If you want it to scale better to larger data (although it will not be as fast as Ctrl+V / Ctrl+C):

# Files opened in binary mode return bytes, so the sentinel must be b''.
with open(infile,"rb") as fin,open(outfile,"wb") as fout:
     fout.writelines(iter(fin.readline,b''))
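
Reading line by line splits binary data at arbitrary newline bytes, so chunk sizes are unpredictable. As a hedged alternative, the standard library's shutil.copyfileobj performs the same streaming copy with a fixed buffer. In this sketch, copy_in_chunks is a hypothetical name and the 1 MiB default buffer is an assumption, not a tuned value.

```python
import shutil

def copy_in_chunks(infile, outfile, buffer_size=1024 * 1024):
    """Stream infile to outfile in fixed-size chunks (sketch)."""
    with open(infile, "rb") as fin, open(outfile, "wb") as fout:
        # copyfileobj reads and writes buffer_size bytes at a time,
        # so memory use stays bounded regardless of file size.
        shutil.copyfileobj(fin, fout, buffer_size)
```

Only one chunk is held in memory at a time, so this works for files much larger than available RAM.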

Answer by Michael Burns

You could simply use the copy tool of the OS you are running on. For Windows:

from subprocess import call
# Backslashes are doubled so that "\f" and the trailing "\" are not read as escapes.
call(["xcopy", "c:\\file.txt", "n:\\folder\\", "/K/O/X"])

/K - Copies attributes. Typically, Xcopy resets read-only attributes
/O - Copies file ownership and ACL information.
/X - Copies file audit settings (implies /O).

Answer by alphanumeric

import sys
import subprocess

def copyWithSubprocess(cmd):
    # Start the native copy tool and wait for it to finish.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.communicate()

# source and dest are the paths to copy from and to (defined elsewhere).
cmd=None
if sys.platform.startswith("darwin"): cmd=['cp', source, dest]
elif sys.platform.startswith("win"): cmd=['xcopy', source, dest, '/K/O/X']

if cmd: copyWithSubprocess(cmd)
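
The snippet above does nothing on platforms other than macOS and Windows, and it does not report a failed copy. A possible variation is sketched below. The names build_copy_command and copy_file are hypothetical, and falling back to shutil.copy2 on other platforms is an assumption rather than part of the original answer.

```python
import shutil
import subprocess
import sys

def build_copy_command(source, dest, platform=sys.platform):
    """Return the native copy command for the platform, or None (sketch)."""
    if platform.startswith("darwin"):
        return ["cp", source, dest]
    if platform.startswith("win"):
        return ["xcopy", source, dest, "/K/O/X"]
    return None

def copy_file(source, dest):
    """Copy with the native tool when one is known, else with shutil."""
    cmd = build_copy_command(source, dest)
    if cmd is None:
        # Assumption: fall back to the standard library elsewhere.
        shutil.copy2(source, dest)
        return
    # check=True raises CalledProcessError if the tool exits non-zero.
    subprocess.run(cmd, check=True, capture_output=True)
```

Passing the platform as a parameter keeps the command selection testable without actually running on each OS.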