Original question: http://stackoverflow.com/questions/780334/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Unescape Python Strings From HTTP
Asked by Ian
I've got a string from an HTTP header, but it's been escaped. What function can I use to unescape it?
myemail%40gmail.com -> myemail@gmail.com
Would urllib.unquote() be the way to go?
Answered by Paolo Bergantino
I am pretty sure that urllib's unquote is the common way of doing this.
>>> import urllib
>>> urllib.unquote("myemail%40gmail.com")
'myemail@gmail.com'
There's also unquote_plus:
Like unquote(), but also replaces plus signs by spaces, as required for unquoting HTML form values.
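As a quick illustration of the difference, here is a minimal sketch. The import fallback is my own addition so that the snippet runs on both Python 2 and Python 3, and the second sample value is made up for the example.

```python
try:
    from urllib.parse import unquote, unquote_plus  # Python 3
except ImportError:
    from urllib import unquote, unquote_plus  # Python 2

# The first value is the one from the question; the second is hypothetical.
email = unquote("myemail%40gmail.com")
plain = unquote("first+last%40example.com")
form = unquote_plus("first+last%40example.com")

print(email)  # myemail@gmail.com
print(plain)  # first+last@example.com (plus sign kept)
print(form)   # first last@example.com (plus sign became a space)
```

For a raw header value, unquote is probably the safer default, since a plus sign there usually is a real plus sign.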
Answered by las3rjock
Yes, it appears that urllib.unquote() accomplishes that task. (I tested it against your example on codepad.)
Answered by Antti Haapala
In Python 3, these functions are urllib.parse.unquote and urllib.parse.unquote_plus.
The latter is used, for example, for query strings in HTTP URLs, where space characters ( ) are traditionally encoded as a plus character (+), and a literal + is percent-encoded as %2B.
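To see why the two conventions have to be matched, here is a small round-trip sketch for Python 3. The sample value is made up, and quote_plus is brought in only to produce a query-string-style encoding to decode.

```python
from urllib.parse import quote_plus, unquote, unquote_plus

original = "a+b c"  # hypothetical value with a literal plus and a space

encoded = quote_plus(original)
print(encoded)  # a%2Bb+c

decoded_plus = unquote_plus(encoded)
print(decoded_plus)  # a+b c

decoded_plain = unquote(encoded)
print(decoded_plain)  # a+b+c
```

Only unquote_plus recovers the original here; plain unquote leaves the encoded space as a plus sign, so it can no longer be told apart from the literal one.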
In addition to these, there is unquote_to_bytes, which converts the given encoded string to bytes and can be used when the encoding is not known or the encoded data is binary. However, there is no unquote_plus_to_bytes; if you need it, you can do:
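from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Turn plus signs into spaces first, then percent-decode to bytes.
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    return unquote_to_bytes(s)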
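A short sketch of when the bytes variant helps, assuming Python 3. The sample strings are made up: %E9 is 'é' in Latin-1 but is not valid UTF-8 on its own, so decoding has to be deferred until the encoding is known.

```python
from urllib.parse import unquote, unquote_to_bytes

escaped = "caf%E9"  # hypothetical value, percent-encoded from Latin-1

raw = unquote_to_bytes(escaped)
print(raw)  # b'caf\xe9'

# Decode explicitly once the real encoding is known (Latin-1 is assumed here).
as_latin1 = raw.decode("latin-1")  # 'café'

# Plain unquote assumes UTF-8 and substitutes U+FFFD for the invalid byte.
guessed = unquote(escaped)  # 'caf' followed by the replacement character

# Same plus-to-space idea as above, written inline for a form-style value.
spaced = unquote_to_bytes("a+b%FF".replace("+", " "))
print(spaced)  # b'a b\xff'
```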
More information on whether to use unquote or unquote_plus is available at "URL encoding the space character: + or %20".