How does one encode and decode a string with Python for use in a URL?
Note: this page is a side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/875771/
Asked by un33k
I have a string like this:
String A: [ 12234_1_Hello'World_34433_22acb_4554344_accCC44 ]
I would like to encrypt String A so that it can be used in a clean URL, something like this:
String B: [ cYdfkeYss4543423sdfHsaaZ ]
Is there an encode API in Python that, given String A, returns String B? Is there a decode API in Python that, given String B, returns String A?
Answer by
Note that there's a huge difference between encoding and encryption.
If you want to send sensitive data, then don't use the encoding mentioned above ;)
Answer by Daniel Wedlund
One way of doing the encode/decode is to use the base64 package, for example:
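# Python 2 syntax (print statement), as in the original answer
import base64
import sys
encoded = base64.b64encode(sys.stdin.read())  # Base64-encode everything read from stdin
print encoded
decoded = base64.b64decode(encoded)  # reverse the encoding
print decoded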
Is this what you were looking for? For your particular case you get:
input: 12234_1_Hello'World_34433_22acb_4554344_accCC44
encoded: MTIyMzRfMV9IZWxsbydXb3JsZF8zNDQzM18yMmFjYl80NTU0MzQ0X2FjY0NDNDQ=
decoded: 12234_1_Hello'World_34433_22acb_4554344_accCC44
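For readers on Python 3, here is a minimal sketch of the same round trip. It rests on two assumptions that are not in the answer above: the text is UTF-8, and the URL-safe alphabet (`urlsafe_b64encode`) is preferable because standard Base64 output can contain `+` and `/`, which would need escaping in a URL.

```python
import base64

original = "12234_1_Hello'World_34433_22acb_4554344_accCC44"

# In Python 3 the base64 functions work on bytes, so encode the text first.
encoded = base64.urlsafe_b64encode(original.encode("utf-8")).decode("ascii")

# Reverse the steps to recover the original text.
decoded = base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8")

print(encoded)
print(decoded)
```

Note that the result is longer than the input and trivially reversible by anyone, so this is encoding, not encryption.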
Answer by JimG
Are you after encryption, compression, or just urlencoding? The string can be passed after urlencoding, but that will not make it smaller as in your example. Compression might shrink it, but you would still need to urlencode the result.
Do you actually need to hide the string data from the viewer (e.g. sensitive data, should not be viewable by someone reading the URL over your shoulder)?
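To check the compression remark for yourself, something like the following Python 3 sketch could be used. The helper names `pack` and `unpack` are made up for illustration, and zlib plus URL-safe Base64 is just one possible combination; for a string as short as the one in the question, the packed form may well come out longer than the input.

```python
import base64
import zlib

original = "12234_1_Hello'World_34433_22acb_4554344_accCC44"

def pack(text):
    """Compress the text, then make the bytes URL-safe with Base64."""
    compressed = zlib.compress(text.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(compressed).decode("ascii")

def unpack(token):
    """Reverse pack(): Base64-decode, then decompress."""
    compressed = base64.urlsafe_b64decode(token.encode("ascii"))
    return zlib.decompress(compressed).decode("utf-8")

token = pack(original)
print(len(original), len(token))  # compare the lengths before relying on this
print(unpack(token) == original)
```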
Answer by viraptor
To make it really short, just insert a row into the database. Store something like a list of (id auto_increment, url) tuples. Then you can base64 encode the id to get a "proxy url". Decode it by decoding the id and looking up the proper url in the database. Or if you don't mind the identifiers looking sequential, just use the numbers.
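A rough Python 3 sketch of this idea, assuming SQLite as the database. The table name `urls` and the helpers `shorten` and `expand` are hypothetical, chosen only for the example.

```python
import base64
import sqlite3

# In-memory database for the sketch; a real application would use a file or a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT NOT NULL)"
)

def shorten(url):
    """Store the URL and return its row id as a URL-safe Base64 token."""
    cursor = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    conn.commit()
    raw = str(cursor.lastrowid).encode("ascii")
    # Strip the '=' padding so the token stays clean in a URL.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")

def expand(token):
    """Decode the token back to a row id and look the URL up."""
    padded = token + "=" * (-len(token) % 4)  # restore the stripped padding
    row_id = int(base64.urlsafe_b64decode(padded.encode("ascii")).decode("ascii"))
    row = conn.execute("SELECT url FROM urls WHERE id = ?", (row_id,)).fetchone()
    return row[0] if row else None

token = shorten("12234_1_Hello'World_34433_22acb_4554344_accCC44")
print(token, expand(token))
```

The token only hides the id cosmetically; anyone can decode it and guess neighbouring ids.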
Answer by dF.
Are you looking to encrypt the string or encode it to remove illegal characters for urls? If the latter, you can use urllib.quote:
>>> import urllib  # Python 2; in Python 3 these functions live in urllib.parse
>>> quoted = urllib.quote("12234_1_Hello'World_34433_22acb_4554344_accCC44")
>>> quoted
'12234_1_Hello%27World_34433_22acb_4554344_accCC44'
>>> urllib.unquote(quoted)
"12234_1_Hello'World_34433_22acb_4554344_accCC44"
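On Python 3 the same functions are available from `urllib.parse`. A minimal sketch follows; passing `safe=""` is my own choice here so that `/` is escaped too, and is not part of the answer above.

```python
from urllib.parse import quote, unquote

original = "12234_1_Hello'World_34433_22acb_4554344_accCC44"

# Percent-encode every character that is not unreserved in a URL.
quoted = quote(original, safe="")
print(quoted)

# unquote() reverses the percent-encoding.
print(unquote(quoted))
```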
Answer by Brian Ramsay
The base64 module has provided encoding and decoding of strings to and from different bases since Python 2.4.
In your example, you would do the following:
import base64
string_b = base64.b64encode(string_a)
string_a = base64.b64decode(string_b)
For full API: http://docs.python.org/library/base64.html
Answer by S.Lott
It's hard to reduce the size of a string and preserve arbitrary content.
You have to restrict the data to something you can usefully compress.
Your alternative is to do the following.
Save "all the arguments in the URL" in a database row.
Assign a GUID key to this collection of arguments.
Then provide that shortened GUID key.
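The three steps above might look roughly like this in Python 3. A plain dict stands in for the database row storage, and `save_arguments` / `load_arguments` are hypothetical names used only for this sketch.

```python
import uuid

# Hypothetical in-memory store standing in for the database table.
saved_arguments = {}

def save_arguments(arguments):
    """Save the arguments and return the GUID key assigned to them."""
    key = uuid.uuid4().hex  # 32 hex characters, safe to put in a URL
    saved_arguments[key] = arguments
    return key

def load_arguments(key):
    """Return the arguments saved under the key, or None if unknown."""
    return saved_arguments.get(key)

key = save_arguments("12234_1_Hello'World_34433_22acb_4554344_accCC44")
print(key, load_arguments(key))
```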
Answer by chradcliffe
Another method that would also shorten the string would be to calculate the md5/sha1 hash of the string (concatenated with a seed if you wished):
>>> import hashlib
>>> hashlib.sha1("12234_1_Hello'World_34433_22acb_4554344_accCC44").hexdigest()
'e1153227558aadc00a2e90b5013fdd6b0804fdfb'
In theory you should get a set of strings with very few collisions and with a fixed length. The hashlib library has an array of different hash functions you can use in this manner, with different output sizes.
Edit: You also said that you needed a reversible string, so this wouldn't work for that. As far as I know, however, many web platforms that use clean URLs like the ones you seem to want use a hash function to compute a shortened URL, and then store that URL along with the page's other data to provide the reverse lookup.
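A small Python 3 sketch of the hash-plus-lookup approach described in the edit. The dict `lookup`, the helpers `short_key` and `resolve`, and the truncation to 10 hex characters are all assumptions made for illustration; truncating the digest raises the collision risk, so the sketch checks for it.

```python
import hashlib

# Hypothetical store mapping short keys back to the original strings.
lookup = {}

def short_key(text, length=10):
    """Return a fixed-length key derived from the SHA-1 of the text."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    key = digest[:length]
    # setdefault keeps whatever was stored first under this key.
    existing = lookup.setdefault(key, text)
    if existing != text:
        raise ValueError("hash collision for key " + key)
    return key

def resolve(key):
    """Reverse lookup: the hash itself cannot be inverted, so use the store."""
    return lookup.get(key)

key = short_key("12234_1_Hello'World_34433_22acb_4554344_accCC44")
print(key, resolve(key))
```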