How to convert the Python 2 unicode() function into correct Python 3.x syntax
Source: http://stackoverflow.com/questions/38697037/ (Stack Overflow, licensed under CC BY-SA 4.0; reuse must follow the same license and credit the original authors)
Asked by guettli
I enabled the compatibility check in my Python IDE and now I realize that the inherited Python 2.7 code has a lot of calls to unicode(), which are not allowed in Python 3.x.
I looked at the Python 2 docs and found no hint on how to upgrade.
I don't want to switch to Python3 now, but maybe in the future.
The code contains about 500 calls to unicode()
How to proceed?
Update
The comment by user vaultah recommending the pyporting guide has received several upvotes.
My current solution is this (thanks to Peter Brittain):
from builtins import str
... I could not find this hint in the pyporting docs.....
Answered by Peter Brittain
As has already been pointed out in the comments, there is already advice on porting from 2 to 3.
Having recently had to port some of my own code from 2 to 3 and maintain compatibility for each for now, I wholeheartedly recommend using python-future, which provides a great tool to help update your code (futurize) as well as clear guidance for how to write cross-compatible code.
In your specific case, I would simply convert all calls to unicode to use str and then import str from builtins. Any IDE worth its salt these days will do that global search and replace in one operation.
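A minimal sketch of what the code looks like after that search and replace. It assumes either Python 3, where builtins is in the standard library, or Python 2 with the third-party future package installed, which supplies a builtins module there. The helper name format_label is hypothetical and only stands in for any call site that used to read unicode(value).

```python
# On Python 3, builtins is part of the standard library.
# On Python 2, this import assumes the third-party "future" package is installed.
from builtins import str


def format_label(value):
    """Hypothetical call site: this line was `unicode(value)` before the replace."""
    return str(value)


label = format_label(42)
print(label)
```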
Of course, that's the sort of thing futurize should catch too, if you just want to use automatic conversion (and to look for other potential issues in your code).
Answered by Quint
You can test whether there is such a function as unicode() in the version of Python that you're running. If not, you can create a unicode() alias for the str() function, which does in Python 3 what unicode() did in Python 2, as all strings are Unicode in Python 3.
# Python 3 compatibility hack
try:
    unicode('')
except NameError:
    unicode = str
Note that a more complete port is probably a better idea; see the porting guide for details.
Answered by Quint
Short answer: Replace all unicode calls with str calls.
Long answer: In Python 3, the separate unicode type was dropped because Unicode is so common that str itself now holds Unicode text. The following solution should work if you are only using Python 3:
unicode = str
# the rest of your code goes here
If you are using it with both Python 2 and Python 3, use this instead:
import sys
if sys.version_info.major == 3:
    unicode = str
# the rest of your code goes here
The other way: run this on the command line:
$ 2to3 package -w
Answered by Gary Wisniewski
First, as a strategy, I would take a small part of your program and try to port it. The number of unicode calls you are describing suggests to me that your application cares about string representations more than most, and each use case is often different.
The important consideration is that all strings are Unicode in Python 3. If you are using the str type to store "bytes" (for example, if they are read from a file), then you should be aware that those will not be bytes in Python 3 but will be Unicode characters to begin with.
Let's look at a few cases.
First, if you do not have any non-ASCII characters at all and really are not using the Unicode character set, it is easy. Chances are you can simply change the unicode() function to str(). That will ensure that any object passed as an argument is properly converted. However, it is wishful thinking to assume it's that easy.
Most likely, you'll need to look at the argument to unicode() to see what it is, and determine how to treat it.
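One way to make that inspection explicit is a small helper that branches on the argument's type. This is only a sketch written for Python 3; the name to_text and the UTF-8 default are assumptions, not something from the original code base. It matters because in Python 3 a blind replacement turns unicode(some_bytes) into str(some_bytes), which returns the repr (such as "b'abc'") instead of decoded text.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical stand-in for unicode(): decode bytes, stringify everything else."""
    if isinstance(value, bytes):
        # Bytes must be decoded explicitly; str(value) would give "b'...'" instead.
        return value.decode(encoding)
    return str(value)


print(to_text(b"caf\xc3\xa9"))  # bytes holding UTF-8 data
print(to_text(3.5))             # any other object goes through str()
```

Call sites whose argument is known to be text or a number can keep a plain str(); the helper is mainly useful where the argument may be raw bytes.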
For example, if you are reading UTF-8 characters from a file in Python 2 and converting them to Unicode, your code would look like this:
data = open('somefile', 'r').read()
udata = unicode(data, 'UTF-8')  # without the encoding argument, Python 2 assumes its default (normally ASCII)
However, in Python 3, read() returns Unicode data to begin with, and the Unicode decoding must be specified when opening the file:
udata = open('somefile', 'r', encoding='UTF-8').read()
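If the code has to keep running on Python 2.7 for a while, one option is io.open, which accepts the encoding argument on Python 2.6+ and is the same function as the built-in open on Python 3. The sketch below writes its own temporary file so that it is self-contained; the file contents are made up for illustration.

```python
import io
import os
import tempfile

# Create a small UTF-8 file so the example does not depend on existing data.
path = os.path.join(tempfile.mkdtemp(), "somefile")
with io.open(path, "w", encoding="UTF-8") as handle:
    handle.write(u"na\u00efve caf\u00e9")

# Reading through io.open returns text (unicode on Python 2, str on Python 3),
# so no separate unicode() call is needed afterwards.
with io.open(path, "r", encoding="UTF-8") as handle:
    udata = handle.read()

print(len(udata))
```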
As you can see, how to transform unicode() when porting may depend heavily on how and why the application is doing Unicode conversions, where the data has come from, and where it is going.
Python 3 brings greater clarity to string representations, which is welcome, but can make porting daunting. For example, Python 3 has a proper bytes type, and you convert byte data to Unicode like this:
udata = bytedata.decode('UTF-8')
or convert Unicode data to bytes using the opposite transform:
bytedata = udata.encode('UTF-8')
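A short round trip through both transforms, as a sketch with a made-up sample string. It shows that the text and its UTF-8 bytes are different objects with different lengths, which is why Python 3 refuses to mix them implicitly.

```python
udata = u"Gr\u00fc\u00dfe"              # text: 5 characters
bytedata = udata.encode("UTF-8")        # text -> bytes
roundtrip = bytedata.decode("UTF-8")    # bytes -> text

# The two non-ASCII characters take two bytes each in UTF-8.
print(len(udata), len(bytedata))        # 5 7
print(roundtrip == udata)               # True
```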
I hope this at least helps determine a strategy.