Python: get files from a directory argument, sorted by size

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/20252669/

Time: 2020-08-18 20:01:22  Source: igfitidea

Get files from Directory Argument, Sorting by Size

python

Asked by wadda_wadda

I'm trying to write a program that takes a command-line argument, scans the directory tree named by that argument, creates a list of every file in it, and then sorts the list by file size.

I'm not much of a script guy, but this is what I've got, and it's not working:

import sys
import os
from os.path import getsize

file_list = []

#Get dirpath
dirpath = os.path.abspath(sys.argv[0])
if os.path.isdir(dirpath):
    #Get all entries in the directory
    for root, dirs, files in os.walk(dirpath):
        for name in files:
            file_list.append(name)
        file_list = sorted(file_list, key=getsize)
        for item in file_list:
            sys.stdout.write(str(file) + '\n')

else:
    print "not found"

Can anyone point me in the right direction?


Accepted answer by J. Owens

Hopefully this function will help you out (I'm using Python 2.7):


import os    

def get_files_by_file_size(dirname, reverse=False):
    """ Return list of file paths in directory sorted by file size """

    # Get list of files
    filepaths = []
    for basename in os.listdir(dirname):
        filename = os.path.join(dirname, basename)
        if os.path.isfile(filename):
            filepaths.append(filename)

    # Re-populate list with filename, size tuples
    for i in xrange(len(filepaths)):
        filepaths[i] = (filepaths[i], os.path.getsize(filepaths[i]))

    # Sort list by file size
    # If reverse=True sort from largest to smallest
    # If reverse=False sort from smallest to largest
    filepaths.sort(key=lambda filename: filename[1], reverse=reverse)

    # Re-populate list with just filenames
    for i in xrange(len(filepaths)):
        filepaths[i] = filepaths[i][0]

    return filepaths
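
The function above targets Python 2.7 (it uses xrange) and only looks at the top level of dirname. A minimal Python 3 sketch that walks the whole tree, as the question asks, might look like the following. The name get_tree_files_by_size is made up for illustration and is not part of the original answer.

```python
import os


def get_tree_files_by_size(dirname, reverse=False):
    """Return file paths under dirname (recursively), sorted by size."""
    sized = []
    for root, _dirs, files in os.walk(dirname):
        for basename in files:
            path = os.path.join(root, basename)
            # Skip entries that are not regular files (e.g. broken symlinks)
            if os.path.isfile(path):
                sized.append((os.path.getsize(path), path))
    # Sort on the size only, so files of equal size keep os.walk's order
    sized.sort(key=lambda pair: pair[0], reverse=reverse)
    return [path for _size, path in sized]
```

As with the original function, reverse=True returns the largest files first.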

Answer by Robert Jørgensgaard Engdahl

With argv[0] you are extracting the command itself, not the first argument; use argv[1] for that:

dirpath = sys.argv[1]  # argv[0] contains the command itself.
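
Reading argv[1] directly raises IndexError when the script is run without an argument. A small sketch of a guard around it follows; dir_from_argv is a hypothetical helper name, and the usage and error messages are assumptions rather than anything from the original answer.

```python
import os


def dir_from_argv(argv):
    """Return the absolute path of the directory given as the first argument."""
    # argv[0] is the script itself, so a real argument needs len(argv) >= 2
    if len(argv) < 2:
        raise SystemExit("usage: %s DIRECTORY" % argv[0])
    dirpath = os.path.abspath(argv[1])
    if not os.path.isdir(dirpath):
        raise SystemExit("not a directory: %s" % dirpath)
    return dirpath
```

In the script it would be called as dirpath = dir_from_argv(sys.argv), after import sys.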

For performance reasons I suggest you prefetch the file sizes, so that each file's size is requested from the OS only once and stays available alongside its path after sorting (as suggested by Koffein, os.walk is the way to go):

files_list = []
for path, dirs, files in os.walk(dirpath):
    for name in files:
        full_path = os.path.join(path, name)
        # Store (path, size) so each file's size is fetched only once
        files_list.append((full_path, getsize(full_path)))

Assuming you don't need the unsorted list, we will use the in-place sort() method:


import operator

# Sort the (path, size) tuples by size, i.e. by the element at index 1
files_list.sort(key=operator.itemgetter(1))
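
To illustrate the difference between the in-place sort() method and sorted(), here is a short sketch. The (path, size) pairs are made-up sample data standing in for the prefetched list.

```python
import operator

# Made-up (path, size) pairs standing in for the prefetched list
files_list = [("b.log", 300), ("a.txt", 10), ("c.bin", 42)]

# sorted() leaves the original list alone and returns a new one
by_size = sorted(files_list, key=operator.itemgetter(1))

# list.sort() reorders the list itself and returns None
result = files_list.sort(key=operator.itemgetter(1))
```

After this runs, files_list and by_size hold the same ordering, and result is None, which is why the return value of sort() should not be assigned back to the list.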

Answer by koffein

This is an approach using generators. It should be faster for a large number of files…

This is the beginning of both examples:


import os, operator, sys
dirpath = os.path.abspath(sys.argv[1])  # argv[0] would be the script itself
# make a generator for all file paths within dirpath
all_files = (os.path.join(basedir, filename) for basedir, dirs, files in os.walk(dirpath) for filename in files)

If you just want a list of the files without the size, you can use this:


sorted_files = sorted(all_files, key = os.path.getsize)
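
Note that sorted() computes the key for each element once and reuses it for the whole sort, so os.path.getsize is called once per file here. The sketch below illustrates that with a counting key function; counting_key and the fake sizes are made up for the demonstration and stand in for os.path.getsize and real files.

```python
calls = []

# Made-up sizes standing in for what os.path.getsize would return
fake_sizes = {"a.txt": 30, "b.txt": 10, "c.txt": 20, "d.txt": 5}


def counting_key(path):
    """Record each call, then return the (fake) size for path."""
    calls.append(path)
    return fake_sizes[path]


# Iterating a dict yields its keys, i.e. the file names
sorted_files = sorted(fake_sizes, key=counting_key)
```

len(calls) ends up equal to the number of files, one key call per element.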

But if you want the file paths together with their sizes in a list, you can use this:

# make a generator for tuples of file path and size: ('/Path/to/the.file', 1024)
files_and_sizes = ( (path, os.path.getsize(path)) for path in all_files )
sorted_files_with_size = sorted( files_and_sizes, key = operator.itemgetter(1) )
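
One caveat if both variants are run in the same session: all_files is a generator, so it can be consumed only once. The sketch below shows the second pass coming back empty. The temporary directory, file names, and sizes are made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as dirpath:
    # Create two throwaway files of known sizes
    for name, size in [("a.txt", 30), ("b.txt", 10)]:
        with open(os.path.join(dirpath, name), "wb") as handle:
            handle.write(b"x" * size)

    all_files = (os.path.join(basedir, filename)
                 for basedir, dirs, files in os.walk(dirpath)
                 for filename in files)

    first_pass = sorted(all_files, key=os.path.getsize)
    # The generator is now exhausted, so a second pass sees no files
    second_pass = sorted(all_files, key=os.path.getsize)
```

Rebuilding the generator before each use, or storing list(all_files) once, avoids the empty second result.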