Why does my Python code print the extra characters "???" when reading from a text file?

Question

Asked by vrkratheesh

try:
    data=open('info.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except ValueError:
            print(each_line)
    data.close()
except IOError:
     print("File is missing")

When printing the file line by line, the code adds three unwanted characters at the very start of the output, namely "???".

Actual output:

???Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Expected output:

Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Answer 1

Accepted answer by senshin

I can't find a duplicate of this for Python 3, which handles encodings differently from Python 2. So here's the answer: instead of opening the file with the default encoding (which is platform-dependent and not necessarily 'utf-8'), use 'utf-8-sig', which expects and strips off the UTF-8 Byte Order Mark (BOM). The BOM is what shows up as ???.

That is, instead of

data = open('info.txt')

Do

data = open('info.txt', encoding='utf-8-sig')
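
To see the difference between the two codecs, here is a minimal sketch. The temporary file and the first_line helper are made up for illustration; the file is written with a leading UTF-8 BOM (bytes EF BB BF) to mimic what the asker's info.txt probably contains.

```python
import os
import tempfile

# Hypothetical sample file: a UTF-8 BOM followed by one line of dialogue.
path = os.path.join(tempfile.mkdtemp(), "info.txt")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfMan: Is this the right room for an argument?\n")

def first_line(path, encoding):
    """Read the first line of the file using the given text encoding."""
    with open(path, encoding=encoding) as f:
        return f.readline()

with_bom = first_line(path, "utf-8")         # BOM is kept as U+FEFF
without_bom = first_line(path, "utf-8-sig")  # BOM is stripped by the codec

print(with_bom.startswith("\ufeff"))
print(without_bom.startswith("Man"))
```

With plain 'utf-8' the BOM survives as a single U+FEFF character at the start of the first line. Seeing three stray characters, as in the question, would be consistent with the file being decoded by a single-byte locale encoding instead, though that is a guess about the asker's setup.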

Note that if you're on Python 2, you should see e.g. "Python, Encoding output to UTF-8" and "Convert UTF-8 with BOM to UTF-8 with no BOM in Python". You'll need to do some shenanigans with codecs or with str.decode for this to work right in Python 2. But in Python 3, all you need to do is set the encoding= parameter when you open the file.
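
One way the codecs route could look is sketched below. This is not taken from the linked questions; strip_utf8_bom is a hypothetical helper that works on raw bytes, relying only on the codecs.BOM_UTF8 constant and bytes.decode, so it should behave the same on Python 2 and Python 3.

```python
import codecs

def strip_utf8_bom(raw):
    """Decode raw bytes as UTF-8, dropping a leading BOM if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.decode("utf-8")

# Hypothetical input: the bytes of a line read from a file opened in 'rb' mode.
text = strip_utf8_bom(b"\xef\xbb\xbfMan: No you haven't!\n")
print(text.startswith("Man"))
```

Input without a BOM passes through unchanged, so the helper is safe to apply to every file.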

Answer 2

Answered by gavin

I had a very similar problem when dealing with Excel CSV files. Initially I had saved my file from the drop-down choices as a "CSV UTF-8 (Comma delimited)" file. Then I saved it as just a "CSV (Comma delimited)" file and all was well. Perhaps there is a similar issue with a .txt file.
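
If re-saving the file is not an option, the same 'utf-8-sig' codec can be passed when reading the CSV. The sketch below assumes the Excel export begins with a UTF-8 BOM (which, as far as I know, is what the "CSV UTF-8" option produces); the file name and contents are invented for the example.

```python
import csv
import os
import tempfile

# Hypothetical Excel-style export: BOM, header row, one data row.
path = os.path.join(tempfile.mkdtemp(), "export.csv")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfrole,line\r\nMan,No you haven't!\r\n")

# newline="" is what the csv module documentation recommends for file objects.
with open(path, encoding="utf-8-sig", newline="") as f:
    rows = list(csv.reader(f))

print(rows[0])  # the first header is 'role', with no BOM attached
```

Without encoding="utf-8-sig", the BOM would end up glued to the first header name, which matches the "only the first line is affected" symptom.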

Answer 3

Answered by Giovanni

When I had this happen, it only affected the very first line of my CSV, both when reading and when writing. For what I was doing, I just made a "sacrificial" entry at the first location so that those characters would get added to my sacrificial entry and not to any of the ones I cared about. Definitely not a robust solution, but it was quick and worked for my purposes.
