Note: this page reproduces a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/28965253/

for fi in sys.argv[1:]: argument list too long

python bash arguments argv

Asked by adrCoder

I am trying to execute a python script on all text files in a folder:

for fi in sys.argv[1:]:

And I get the following error:

-bash: /usr/bin/python: Argument list too long

The way I call this Python function is the following:

python functionName.py *.txt

The folder has around 9000 files. Is there some way to run this function without having to split my data in more folders etc? Splitting the files would not be very practical because I will have to execute the function in even more files in the future... Thanks

EDIT: Based on the selected correct reply and the comments of the replier (Charles Duffy), what worked for me is the following:

printf '%s\0' *.txt | xargs -0 python ./functionName.py

because I don't have a valid shebang.

Answer by Charles Duffy

This is an OS-level problem (limit on command line length), and is conventionally solved with an OS-level (or, at least, outside-your-Python-process) solution:

printf '%s\0' *.txt | xargs -0 ./your-python-program

...or...

find . -maxdepth 1 -type f -name '*.txt' -exec ./your-python-program '{}' +

Note that this runs your-python-program once per batch of files found, where the batch size is dependent on the number of names that can fit in ARG_MAX; see the excellent answer by Marcus Müller if this is unsuitable.
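
As a rough illustration of the batching described above (a sketch of the idea only, not how xargs or find is actually implemented), file names can be grouped so that each group stays under a byte budget. The function name batches and the max_bytes parameter are made up for this example, and the size estimate ignores the environment and other per-process overhead.

```python
import os


def batches(names, max_bytes):
    """Yield lists of names whose encoded size stays within max_bytes.

    A single name larger than max_bytes is still yielded, alone.
    """
    batch, size = [], 0
    for name in names:
        # Each argument is stored with a terminating NUL byte.
        cost = len(os.fsencode(name)) + 1
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(name)
        size += cost
    if batch:
        yield batch
```

Each batch could then be handed to one invocation of the program, for example with subprocess.run(["python", "./functionName.py", *batch]).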

Answer by Marcus Müller

No. That is a kernel limitation for the length (in bytes) of a command line.

Typically, you can determine that limit by running:

getconf ARG_MAX

which, at least for me, yields 2097152 (bytes), i.e. 2 MiB.
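
The same limit can also be queried from inside Python on Unix-like systems. The sketch below is only an approximation: the helper names arg_max and estimated_size are invented here, and the estimate counts just the argument bytes, not the environment or other overhead that also counts against the limit.

```python
import os


def arg_max():
    """Return the system's ARG_MAX value in bytes (Unix only)."""
    return os.sysconf("SC_ARG_MAX")


def estimated_size(args):
    """Roughly estimate the bytes a list of arguments would occupy."""
    # Each argument is stored with a terminating NUL byte.
    return sum(len(os.fsencode(arg)) + 1 for arg in args)


if __name__ == "__main__":
    print("ARG_MAX:", arg_max())
```

Comparing estimated_size(list_of_names) with arg_max() gives a first hint whether a plain *.txt expansion is likely to fail.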

I recommend using python to work through a folder yourself, i.e. giving your python program the ability to work with directories instead of individual files, or to read file names from a file.

The former can easily be done using os.walk(...), whereas the second option is (in my opinion) the more flexible one. Use the argparse module to give your python program an easy-to-use command line syntax, then add an argument of a file type (see reference documentation), and python will automatically be able to understand special filenames like -, meaning you could instead of

for fi in sys.argv[1:]

do

for fi in opts.file_to_read_filenames_from.read().split(chr(0))

which would even allow you to do something like

find -iname '*.txt' -type f -print0|my_python_program.py -file-to-read-filenames-from -
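
A minimal sketch of how the argparse approach might look, under a few assumptions: the option is spelled with two dashes here (--file-to-read-filenames-from), the helper names read_filenames, build_parser and main are invented for illustration, and the per-file work is replaced by a print. Empty entries are dropped because output from find -print0 ends with a trailing NUL.

```python
import argparse


def read_filenames(stream):
    """Split NUL-separated names (as written by `find -print0`) into a list."""
    # The trailing NUL leaves an empty last element, so skip empty strings.
    return [name for name in stream.read().split("\0") if name]


def build_parser():
    parser = argparse.ArgumentParser(description="Process files named in a list.")
    # argparse.FileType maps the special name "-" to standard input.
    parser.add_argument(
        "--file-to-read-filenames-from",
        type=argparse.FileType("r"),
        required=True,
        help="file with NUL-separated file names, or - for stdin",
    )
    return parser


def main(argv=None):
    opts = build_parser().parse_args(argv)
    for fi in read_filenames(opts.file_to_read_filenames_from):
        print(fi)  # placeholder for the real per-file work
```

With a call to main() at the bottom of the script, it could be driven as: find . -iname '*.txt' -type f -print0 | python my_python_program.py --file-to-read-filenames-from -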

Answer by iced

Don't do it this way. Pass the mask to your python script (e.g. call it as python functionName.py "*.txt") and expand it using glob (https://docs.python.org/2/library/glob.html).

Answer by Michał Niklas

I would consider using the glob module. With this module you invoke your program like:

python functionName.py "*.txt"

then the shell will not expand *.txt into file names. Your Python program will receive *.txt in its arguments list and you can pass it into glob.glob():

for fi in glob.glob(sys.argv[1]):
    ...
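
A slightly fuller sketch of the same idea, assuming the script should accept one or more quoted patterns. The helper name expand_pattern is made up for this example and the print stands in for the real per-file work; glob.iglob is used so matches are produced lazily before being sorted.

```python
import glob
import sys


def expand_pattern(pattern):
    """Return the sorted paths matching a shell-style pattern."""
    # The shell never sees the expansion, so ARG_MAX does not apply here.
    return sorted(glob.iglob(pattern))


def main(argv):
    # Every argument after the script name is treated as a pattern.
    for pattern in argv[1:]:
        for fi in expand_pattern(pattern):
            print(fi)  # placeholder for the real per-file work


if __name__ == "__main__":
    main(sys.argv)
```

The pattern has to be quoted on the command line, as in python functionName.py "*.txt", so that the shell passes it through unexpanded.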