Python: how to parse a string that looks like sys.argv

Question

Asked by Gregg Lind

I would like to parse a string like this:

-o 1  --long "Some long string"

into this:

["-o", "1", "--long", 'Some long string']

or similar.

This is different from getopt or optparse, which start with input that has already been split the way sys.argv is (like the output above). Is there a standard way to do this? Basically, this is "splitting" while keeping quoted strings together.

My best function so far:

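import csv
import io

def split_quote(string, quotechar='"'):
    '''
    Split on single spaces while keeping quoted substrings together.

    >>> split_quote('--blah "Some argument" here')
    ['--blah', 'Some argument', 'here']

    >>> split_quote("--blah 'Some argument' here", quotechar="'")
    ['--blah', 'Some argument', 'here']
    '''
    # Use the documented io.StringIO rather than the undocumented csv.StringIO name.
    s = io.StringIO(string)
    reader = csv.reader(s, delimiter=" ", quotechar=quotechar)
    # The input is a single line, so the first row holds all the tokens.
    return list(reader)[0]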

Answer 1

Answered by Jacob Gabrielson

I believe you want the shlex module.

>>> import shlex
>>> shlex.split('-o 1 --long "Some long string"')
['-o', '1', '--long', 'Some long string']
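
Since the result has the same shape as sys.argv[1:], it can be handed straight to an argument parser. Here is a minimal sketch, assuming argparse and two hypothetical options (-o and --long) chosen only to match the example string in the question:

```python
import argparse
import shlex

# Hypothetical parser; the option names mirror the question's example string.
parser = argparse.ArgumentParser()
parser.add_argument("-o", type=int)
parser.add_argument("--long")

# Split the command-line-like string the way a POSIX shell would.
argv = shlex.split('-o 1  --long "Some long string"')

# parse_args accepts an explicit list instead of reading sys.argv.
args = parser.parse_args(argv)
print(argv)
print(args.o, args.long)
```

Passing the list explicitly to parse_args keeps sys.argv untouched, which is convenient when the string comes from a config file or an interactive prompt.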

Answer 2

Answered by Craig McQueen

Before I was aware of shlex.split, I made the following:

import sys

_WORD_DIVIDERS = set((' ', '\t', '\r', '\n'))

_QUOTE_CHARS_DICT = {
    '\\':   '\\',
    ' ':    ' ',
    '"':    '"',
    'r':    '\r',
    'n':    '\n',
    't':    '\t',
}

def _raise_type_error():
    raise TypeError("Bytes must be decoded to Unicode first")

def parse_to_argv_gen(instring):
    is_in_quotes = False
    instring_iter = iter(instring)
    join_string = instring[0:0]

    c_list = []
    c = ' '
    while True:
        # Skip whitespace
        try:
            while True:
                if not isinstance(c, str) and sys.version_info[0] >= 3:
                    _raise_type_error()
                if c not in _WORD_DIVIDERS:
                    break
                c = next(instring_iter)
        except StopIteration:
            break
        # Read word
        try:
            while True:
                if not isinstance(c, str) and sys.version_info[0] >= 3:
                    _raise_type_error()
                if not is_in_quotes and c in _WORD_DIVIDERS:
                    break
                if c == '"':
                    is_in_quotes = not is_in_quotes
                    c = None
                elif c == '\\':
                    c = next(instring_iter)
                    c = _QUOTE_CHARS_DICT.get(c)
                if c is not None:
                    c_list.append(c)
                c = next(instring_iter)
            yield join_string.join(c_list)
            c_list = []
        except StopIteration:
            yield join_string.join(c_list)
            break

def parse_to_argv(instring):
    return list(parse_to_argv_gen(instring))

This works with Python 2.x and 3.x. On Python 2.x, it works directly with byte strings and Unicode strings. On Python 3.x, it only accepts [Unicode] strings, not bytes objects.

This doesn't behave exactly the same as shell argv splitting: it also allows CR, LF and TAB characters to be written as \r, \n and \t, and converts them to real CR, LF and TAB characters (shlex.split doesn't do that). So writing my own function was useful for my needs. I guess shlex.split is better if you just want plain shell-style argv splitting. I'm sharing this code in case it's useful as a baseline for doing something slightly different.
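
To illustrate the difference, here is a small sketch of how shlex.split (in its default POSIX mode) treats a \t sequence. The variable names are just for illustration:

```python
import shlex

# Outside quotes, a backslash makes the next character literal,
# so the backslash is dropped and a plain "t" remains.
unquoted = shlex.split(r'a\tb')

# Inside double quotes, a backslash is kept as-is unless it precedes
# a double quote or another backslash.
quoted = shlex.split(r'"a\tb"')

print(unquoted)
print(quoted)

# Neither result contains a real TAB character.
print(any('\t' in token for token in unquoted + quoted))
```

In both cases no real TAB is produced, which is the gap the hand-written parser above is meant to fill.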
