Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/780334/

Date: 2020-11-03 20:51:21  Source: igfitidea

Unescape Python Strings From HTTP

Tags: python, http, header, urllib2, mod-wsgi

Asked by Ian

I've got a string from an HTTP header, but it's been escaped. What function can I use to unescape it?

myemail%40gmail.com -> myemail@gmail.com

Would urllib.unquote() be the way to go?

Answered by Paolo Bergantino

I am pretty sure that urllib's unquote is the common way of doing this.

>>> import urllib  # Python 2; in Python 3 the function lives in urllib.parse
>>> urllib.unquote("myemail%40gmail.com")
'myemail@gmail.com'

There's also unquote_plus:

Like unquote(), but also replaces plus signs by spaces, as required for unquoting HTML form values.
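
To make the difference between the two functions concrete, here is a minimal sketch. It assumes Python 3, where both functions are imported from urllib.parse rather than from urllib directly; the sample string form_value is made up for illustration.

```python
from urllib.parse import unquote, unquote_plus

# The address from the question: %40 is the percent-encoding of "@".
encoded = "myemail%40gmail.com"
decoded = unquote(encoded)
print(decoded)            # myemail@gmail.com

# A made-up form value: "+" stands for a space, %2B for a literal plus.
form_value = "first+last%2Btag"
plain = unquote(form_value)        # leaves "+" alone
form = unquote_plus(form_value)    # turns "+" into a space first
print(plain)              # first+last+tag
print(form)               # first last+tag
```

Note that unquote() keeps the plus sign as-is, so it is the safer default for values that did not come from a form or query string.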

Answered by las3rjock

Yes, it appears that urllib.unquote() accomplishes that task. (I tested it against your example on codepad.)

Answered by Antti Haapala

In Python 3, these functions are urllib.parse.unquote and urllib.parse.unquote_plus.

The latter is used, for example, for query strings in HTTP URLs, where space characters ( ) are traditionally encoded as the plus character (+), and a literal + is percent-encoded as %2B.
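
As a rough illustration of that convention, the sketch below round-trips a value through quote_plus and unquote_plus, then parses a small query string with parse_qs, which applies the same plus-to-space rule. The sample value and the parameter names q and lang are invented for the example.

```python
from urllib.parse import parse_qs, quote_plus, unquote_plus

original = "1 + 1 = 2"

# Spaces become "+", while the literal "+" and "=" are percent-encoded.
encoded = quote_plus(original)
print(encoded)            # 1+%2B+1+%3D+2

# unquote_plus reverses both steps.
decoded = unquote_plus(encoded)
print(decoded)            # 1 + 1 = 2

# parse_qs decodes query-string values with the same convention.
query = "q=" + encoded + "&lang=en"
params = parse_qs(query)
print(params)             # {'q': ['1 + 1 = 2'], 'lang': ['en']}
```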

In addition to these, there is unquote_to_bytes, which converts the given encoded string to bytes. It can be used when the encoding is not known or the encoded data is binary. However, there is no unquote_plus_to_bytes; if you need it, you can do:

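from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Turn '+' into a space first, then percent-decode to bytes.
    # Accepts either bytes or str input.
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    return unquote_to_bytes(s)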
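
The following sketch shows why decoding to bytes can matter. It assumes, purely for illustration, that the sender used Latin-1, where %E9 is "é"; that byte is not valid UTF-8 on its own. The sample string is made up, and the plus replacement is inlined so the snippet runs by itself.

```python
from urllib.parse import unquote, unquote_to_bytes

value = "caf%E9+au+lait"

# Same idea as the helper above: replace "+" first, then decode to bytes.
raw = unquote_to_bytes(value.replace("+", " "))
print(raw)                # b'caf\xe9 au lait'

# With the raw bytes in hand, the caller picks the encoding (assumed Latin-1 here).
text = raw.decode("latin-1")
print(text)               # café au lait

# unquote() assumes UTF-8 by default and replaces invalid bytes with U+FFFD.
lossy = unquote("caf%E9")
print(lossy == "caf\ufffd")   # True
```

Working with bytes defers the choice of encoding to the caller, instead of silently losing data to the replacement character.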


More information on whether to use unquote or unquote_plus is available in the StackOverflow question "URL encoding the space character: + or %20".