Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1818263/

Python unicode: how to test against unicode string

Tags: python, unicode

Asked by jrara

I have a script like this:

#!/Python26/
# -*- coding: utf-8 -*-

import sys
import xlrd
import xlwt

argset = set(sys.argv[1:])

#----------- import ----------------
wb = xlrd.open_workbook("excelfile.xls")

#----------- script ----------------
#Get the first sheet either by name
sh = wb.sheet_by_name(u'Data')

hlo = []

for i in range(len(sh.col_values(8))):
    if sh.cell(i, 1).value in argset:
        if sh.cell(i, 8).value == '':
            continue
        hlo.append(sh.cell(i, 8).value)

excelfile.xls contains Unicode strings, and I want to test against these strings from the command line:

C:\>python pythonscript.py p??ty?
pythonscript.py:34: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if sh.cell(i, 1).value in argset:

How should I modify my code for Unicode?

Answered by Vijay Mathew

Python has a sequence type called unicode which will be useful here. These links contain more information to help you regarding this:
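
As a rough sketch of how the unicode type could be applied to the script in the question: decode the command-line arguments to text before building the set, so that both sides of the `in` test have the same type. The helper names `to_text` and `build_argset` are hypothetical, and the default encoding is an assumption (on Python 2 under Windows the arguments are usually byte strings in the ANSI code page; on Python 3 they are already text and pass through unchanged).

```python
# -*- coding: utf-8 -*-
import sys


def to_text(value, encoding):
    """Decode byte strings to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_argset(args, encoding=None):
    """Return the arguments as a set of text (unicode) strings."""
    if encoding is None:
        # Assumption: the arguments arrive in the filesystem/ANSI encoding.
        encoding = sys.getfilesystemencoding()
    return set(to_text(arg, encoding) for arg in args)


# Drop-in replacement for: argset = set(sys.argv[1:])
argset = build_argset(sys.argv[1:])
```

With `argset` built this way, the existing test `sh.cell(i, 1).value in argset` compares text with text, which should avoid the UnicodeWarning.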

Answered by Kelmer

Try encoding the Excel unicode value to a byte string using cp1252 (the default Windows ANSI code page for Western European locales; it is not a Unicode encoding) and then testing. I know a lot of people don't recommend this, but it sometimes solves my problems.

Pseudocode: if sh.cell(i, 1).value.encode('cp1252') in argset: ...
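
The pseudocode above could be fleshed out roughly as follows. This is only a sketch: `matches_argset` is a hypothetical helper, and it assumes the command-line arguments are byte strings in cp1252 (as is typical on Python 2 under a Western European Windows locale). Cells that are not text, or that contain characters cp1252 cannot represent, are treated as non-matching rather than raising an error.

```python
def matches_argset(cell_value, argset, encoding='cp1252'):
    """Return True if the encoded cell text is one of the byte-string args."""
    # xlrd returns numbers as floats; only text cells can be encoded.
    if not isinstance(cell_value, type(u'')):
        return False
    try:
        encoded = cell_value.encode(encoding)
    except UnicodeEncodeError:
        # The cell holds a character the chosen code page cannot represent.
        return False
    return encoded in argset
```

In the loop from the question, the test would then read `if matches_argset(sh.cell(i, 1).value, argset):`.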

Br.
