Python: How do you sort files numerically?

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4623446/

Time: 2020-08-18 16:34:13  Source: igfitidea

How do you sort files numerically?

python sorting

Asked by Zach Young

First off, I'm posting this because when I was looking for a solution to the problem below, I could not find one on stackoverflow. So, I'm hoping to add a little bit to the knowledge base here.

I need to process some files in a directory and need the files to be sorted numerically. I found some examples on sorting (specifically using the lambda pattern) at wiki.python.org, and I put this together:

#!env/python
import re

tiffFiles = """ayurveda_1.tif
ayurveda_11.tif
ayurveda_13.tif
ayurveda_2.tif
ayurveda_20.tif
ayurveda_22.tif""".split('\n')

numPattern = re.compile('_(\d{1,2})\.', re.IGNORECASE)

tiffFiles.sort(cmp, key=lambda tFile:
                   int(numPattern.search(tFile).group(1)))

print tiffFiles

I'm still rather new to Python and would like to ask the community if there are any improvements that can be made to this: shortening the code up (removing lambda), performance, style/readability?

Thank you, Zachary

Accepted answer by Daniel DiPaolo

This is called "natural sorting" or "human sorting" (as opposed to lexicographical sorting, which is the default). Ned B wrote up a quick version of one.

import re

def tryint(s):
    # int() raises ValueError for chunks that are not all digits
    try:
        return int(s)
    except ValueError:
        return s

def alphanum_key(s):
    """ Turn a string into a list of string and number chunks.
        "z23a" -> ["z", 23, "a"]
    """
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

def sort_nicely(l):
    """ Sort the given list in the way that humans expect.
    """
    l.sort(key=alphanum_key)

It's similar to what you're doing, but perhaps a bit more generalized.
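
As a quick check, the same idea can be applied to the file names from the question. This is only a minimal sketch: the helpers are repeated so the snippet runs on its own, and the `tiff_files` name and the use of `sorted` are my choices for illustration, not part of the original recipe.

```python
import re

def tryint(s):
    # Digit chunks become ints; everything else stays a string.
    try:
        return int(s)
    except ValueError:
        return s

def alphanum_key(s):
    # "z23a" -> ["z", 23, "a"]; the capturing group keeps the digit runs.
    return [tryint(c) for c in re.split('([0-9]+)', s)]

# Sample names taken from the question.
tiff_files = ["ayurveda_1.tif", "ayurveda_11.tif", "ayurveda_13.tif",
              "ayurveda_2.tif", "ayurveda_20.tif", "ayurveda_22.tif"]

natural = sorted(tiff_files, key=alphanum_key)
print(natural)
```

Because `re.split` with a capturing group always alternates text and digit chunks, each position in the key holds the same type for every name, so the lists compare cleanly.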

Answer by Don O'Donnell

If you are using key= in your sort method, you shouldn't also use cmp, which was removed in Python 3. key should be set to a function that takes a record as input and returns any object that will compare in the order you want your list sorted. It doesn't need to be a lambda function and might be clearer as a standalone function. Also, regular expressions can be slow to evaluate.
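
For example, the sort from the question could be written without cmp and with a named key function. This is a rough sketch rather than a recommendation: `file_number` and `num_pattern` are hypothetical names of mine, and the pattern is carried over from the question, so it still assumes a one- or two-digit number between an underscore and a dot.

```python
import re

# Pattern from the question: 1-2 digits between "_" and ".".
num_pattern = re.compile(r'_(\d{1,2})\.')

def file_number(name):
    # Return the number embedded in the file name as an int.
    return int(num_pattern.search(name).group(1))

tiff_files = ["ayurveda_1.tif", "ayurveda_11.tif", "ayurveda_13.tif",
              "ayurveda_2.tif", "ayurveda_20.tif", "ayurveda_22.tif"]

# Only key= is passed; there is no cmp argument in Python 3.
tiff_files.sort(key=file_number)
print(tiff_files)
```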

You could try something like the following to isolate and return the integer part of the file name:

def getint(name):
    basename = name.partition('.')
    alpha, num = basename.split('_')
    return int(num)
tiffFiles.sort(key=getint)

Answer by Prabhath Kota

partition returns a tuple, so unpack it:

def getint(name):
    (basename, part, ext) = name.partition('.')
    (alpha, num) = basename.split('_')
    return int(num)
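
Here is a small sketch of how a function like this behaves on the names from the question. It assumes every name has exactly one underscore before the number and that the extension starts at the first dot; a name such as `my_scan_3.tif` would raise a ValueError at the `split('_')` unpacking. The `tiff_files` and `by_number` names are mine.

```python
def getint(name):
    # partition('.') returns a 3-tuple: (before, separator, after).
    basename, _, _ = name.partition('.')
    # Assumes exactly one underscore in the base name.
    _, num = basename.split('_')
    return int(num)

tiff_files = ["ayurveda_1.tif", "ayurveda_11.tif", "ayurveda_13.tif",
              "ayurveda_2.tif", "ayurveda_20.tif", "ayurveda_22.tif"]

by_number = sorted(tiff_files, key=getint)
print([getint(n) for n in by_number])
```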

Answer by dkmatt0

Just use:

tiffFiles.sort(key=lambda var:[int(x) if x.isdigit() else x for x in re.findall(r'[^0-9]|[0-9]+', var)])

This is faster than using try/except.
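
To see what that key actually produces, here is a small sketch with the lambda unrolled into a named function (`natural_key` is my name for it, not part of the answer). Note that `[^0-9]` matches one character at a time, so non-digit text ends up as single characters. In Python 3, this kind of key can raise a TypeError when one name has a digit where another has a letter at the same position; the sample names share a prefix, so that does not come up here.

```python
import re

def natural_key(var):
    # Digit runs become ints; every other character stays a 1-char string.
    return [int(x) if x.isdigit() else x
            for x in re.findall(r'[^0-9]|[0-9]+', var)]

print(natural_key("a_11.tif"))

tiff_files = ["ayurveda_1.tif", "ayurveda_11.tif", "ayurveda_13.tif",
              "ayurveda_2.tif", "ayurveda_20.tif", "ayurveda_22.tif"]
tiff_files.sort(key=natural_key)
print(tiff_files)
```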

Answer by StatsSorceress

This is a modified version of @Don O'Donnell's answer, because I couldn't get it working as-is, but I think it's the best answer here as it's well-explained.

def getint(name):
    _, num = name.split('_')
    num, _ = num.split('.')
    return int(num)

print(sorted(tiffFiles, key=getint))

Changes:

1) The alpha string doesn't get stored, as it's not needed (hence _, num)

2) Use num.split('.') to separate the number from the .tif extension

3) Use sorted instead of list.sort, per https://docs.python.org/2/howto/sorting.html
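
The practical difference behind change 3 can be sketched as follows; the short `tiff_files` list and the variable names are mine, chosen only for illustration. `sorted` builds a new list and leaves the input alone, while `list.sort` reorders the list in place and returns None.

```python
def getint(name):
    _, num = name.split('_')
    num, _ = num.split('.')
    return int(num)

tiff_files = ["ayurveda_11.tif", "ayurveda_2.tif", "ayurveda_1.tif"]

# sorted() returns a new list; tiff_files keeps its original order.
ordered = sorted(tiff_files, key=getint)
unchanged = list(tiff_files)

# list.sort() sorts in place and returns None.
result = tiff_files.sort(key=getint)
print(ordered, result)
```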
