Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

Question

Asked by user2181913

I am using hfcca to calculate cyclomatic complexity for C++ code. hfcca is a simple Python script (https://code.google.com/p/headerfile-free-cyclomatic-complexity-analyzer/). When I run the script to generate the output as an XML file, I get the following error:

Traceback (most recent call last):
    File "./hfcca.py", line 802, in <module>
    main(sys.argv[1:])
    File "./hfcca.py", line 798, in main
    print(xml_output([f for f in r], options))
    File "./hfcca.py", line 798, in <listcomp>
    print(xml_output([f for f in r], options))
    File "/x/home06/smanchukonda/PREFIX/lib/python3.3/multiprocessing/pool.py", line 652, in next
    raise value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

Please help me with this.

Answer 1

Answered by monk

It looks like the file contains characters encoded in latin1 whose bytes are not valid utf8. The file utility can help you figure out which encoding a file should be treated as, e.g.:

monk@monk-VirtualBox:~$ file foo.txt 
foo.txt: UTF-8 Unicode text
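If file does not give a clear answer, the offending byte can also be located from Python. This is only a sketch: the helper name find_invalid_utf8 and the sample bytes are made up for illustration, and it relies on the start attribute that UnicodeDecodeError carries.

```python
def find_invalid_utf8(data, context=10):
    """Return (position, surrounding bytes) for the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first undecodable byte.
        start = max(exc.start - context, 0)
        return exc.start, data[start:exc.start + context]
    return None


# Hypothetical sample: a latin1-encoded 0xe2 followed by a plain space.
sample = b"caf\xe2 au lait"
print(find_invalid_utf8(sample))
```

For a real source file, read it in binary mode first, e.g. open(path, "rb").read(), and pass the bytes to the helper.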

Here's what the bytes mean in latin1:

>>> b'\xe2'.decode('latin1')
'â'
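For comparison, a small sketch of why the same byte fails as utf8: 0xe2 starts a three-byte UTF-8 sequence, so the decoder expects two continuation bytes after it. The byte strings here are illustrative and not taken from the original file.

```python
# 0xe2 followed by two continuation bytes is a valid three-byte sequence.
valid = b"\xe2\x80\x99".decode("utf-8")

# 0xe2 followed by an ordinary ASCII byte is not.
try:
    b"\xe2 ".decode("utf-8")
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason

print(repr(valid), reason)
```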

The easiest fix is probably to convert the files to utf8.
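A minimal sketch of such a conversion in Python, assuming the files really are latin1 (check first, since latin1 will decode any byte sequence without complaint). The function name convert_to_utf8 and the demo file are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(path, source_encoding="latin-1"):
    """Rewrite a file in place as UTF-8, assuming it is currently in source_encoding."""
    p = Path(path)
    text = p.read_bytes().decode(source_encoding)
    # Write bytes rather than text so line endings are left untouched.
    p.write_bytes(text.encode("utf-8"))


# Demo on a throwaway file containing a latin1-encoded e-acute (0xe9).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.cpp"
    demo.write_bytes(b"// caf\xe9\n")
    convert_to_utf8(demo)
    converted = demo.read_bytes()

print(converted)
```

Because the conversion overwrites the file, keep a backup or work on a copy until the result has been checked.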

Answer 2

Answered by Biashara Employers

I had the same problem when rendering Markup("""yyyyyy"""), but I solved it using an online tool which removed the 'bad' characters: https://pteo.paranoiaworks.mobi/diacriticsremover/

It is a nice tool and works even offline.
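Something similar can be sketched locally with the standard library. This is not necessarily how the linked tool works; strip_diacritics is a made-up name, and the text must already be decoded correctly before accents can be stripped.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks so accented letters fall back to their base letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("caf\u00e9 \u00e2"))
```

Note that this changes the content of the text, so it only suits cases where losing the accents is acceptable.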
