Original question: http://stackoverflow.com/questions/21689365/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python 3 TypeError: must be str, not bytes with sys.stdout.write()
Asked by Michael IV
I was looking for a way to run an external process from a Python script and print its stdout messages during execution.
The code below works, but prints no stdout output during runtime. When it exits I get the following error:
sys.stdout.write(nextline)
TypeError: must be str, not bytes
import subprocess
import sys

p = subprocess.Popen(["demo.exe"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Poll process for new output until finished
while True:
    nextline = p.stdout.readline()
    if nextline == '' and p.poll() != None:
        break
    sys.stdout.write(nextline)
    sys.stdout.flush()

output = p.communicate()[0]
exitCode = p.returncode
I am using Python 3.3.2.
Accepted answer by Martin Tournoij
Python 3 handles strings a bit differently. Originally there was just one type for
strings: str. When Unicode gained traction in the '90s, the new unicode type
was added to handle Unicode without breaking pre-existing code [1]. This is
effectively the same as str but with multibyte support.
In Python 3 there are two different types:
- The bytes type. This is just a sequence of bytes; Python doesn't know anything about how to interpret it as characters.
- The str type. This is a sequence of characters (Unicode code points). It is still stored as bytes internally, but Python knows how to interpret those bytes as characters.
- The separate unicode type was dropped. str now supports Unicode.
In Python 2 implicitly assuming an encoding could cause a lot of problems; you
could end up using the wrong encoding, or the data may not have an encoding at
all (e.g. it's a PNG image).
Explicitly telling Python which encoding to use (or explicitly telling it to
guess) is often a lot better and much more in line with the "Python philosophy"
of "explicit is better than implicit".
This change is incompatible with Python 2, as many return values have changed,
leading to subtle problems like this one; it's probably the main reason why
Python 3 adoption has been so slow. Since Python doesn't have static typing [2],
it's impossible to change this automatically with a script (such as the bundled
2to3).
- You can convert str to bytes with bytes('H€llo', 'utf-8'); this should produce b'H\xe2\x82\xacllo'. Note how one character was converted to three bytes.
- You can convert bytes to str with b'H\xe2\x82\xacllo'.decode('utf-8').
Of course, UTF-8 may not be the correct character set in your case, so be sure to use the correct one.
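To make the two conversions and the "wrong character set" warning concrete, here is a small sketch. The variable names are mine, not from the original answer, and Latin-1 and ASCII are just two arbitrary examples of a mismatched codec:

```python
text = "H\u20acllo"            # 'H€llo', written with an escape so the file stays ASCII
data = text.encode("utf-8")    # same result as bytes(text, 'utf-8')

assert data == b"H\xe2\x82\xacllo"
assert (len(text), len(data)) == (5, 7)   # the euro sign takes three bytes
assert data.decode("utf-8") == text       # the right codec round-trips cleanly

# Latin-1 maps every byte to some character, so the wrong codec raises nothing
# and silently returns garbage of a different length.
garbled = data.decode("latin-1")
assert garbled != text and len(garbled) == 7

# ASCII has no mapping for bytes >= 0x80, so here the mistake is at least loud.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Whether a wrong codec fails loudly or quietly depends on the codec, which is why guessing is risky.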
In your specific piece of code, nextline is of type bytes, not str;
reading stdout and stdin from subprocess changed in Python 3 from str to
bytes. This is because Python can't be sure which encoding this uses. It
probably uses the same as sys.stdin.encoding (the encoding of your system),
but it can't be sure.
You need to replace:
sys.stdout.write(nextline)
with:
sys.stdout.write(nextline.decode('utf-8'))
or maybe:
sys.stdout.write(nextline.decode(sys.stdout.encoding))
You will also need to change if nextline == '' to if nextline == b'', since:
>>> '' == b''
False
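Putting both changes together, the loop from the question could look roughly like the sketch below. Several things here are my assumptions rather than part of the original answer: the helper name stream_output, the UTF-8 default, errors="replace", merging stderr into stdout (so an unread stderr pipe cannot fill up and stall the child), and using a tiny Python child process in place of demo.exe so the example runs anywhere.

```python
import subprocess
import sys


def stream_output(cmd, encoding="utf-8"):
    """Run cmd, echo its output line by line, return (exit_code, lines)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    lines = []
    # readline() returns b'' (bytes, not '') only once the pipe reaches EOF,
    # so the sentinel has to be a bytes literal.
    for raw in iter(proc.stdout.readline, b""):
        line = raw.decode(encoding, errors="replace")  # bytes -> str
        sys.stdout.write(line)
        sys.stdout.flush()
        lines.append(line)
    proc.stdout.close()
    return proc.wait(), lines


# Stand-in for demo.exe: a child interpreter that prints two lines.
child = [sys.executable, "-c", "print('hello'); print('done')"]
exit_code, lines = stream_output(child)
```

If the child buffers its own output, lines may still arrive in bursts; that is decided on the child's side and this loop cannot change it.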
Also see the Python 3 ChangeLog, PEP 358, and PEP 3112.
[1] There are some neat tricks you can do with ASCII that you can't do with multibyte character sets; the most famous examples are "xor with space to switch case" (e.g. chr(ord('a') ^ ord(' ')) == 'A') and "set 6th bit to make a control character" (e.g. ord('\t') + ord('@') == ord('I')). ASCII was designed in a time when manipulating individual bits was an operation with a non-negligible performance impact.
[2] Yes, you can use function annotations, but it's a comparatively new feature and little used.
Answer by Wooble
While the accepted answer will work fine if the bytes you have from your subprocess are encoded using sys.stdout.encoding (or a compatible encoding, such as when reading from a tool that outputs ASCII while your stdout uses UTF-8), the correct way to write arbitrary bytes to stdout is:
sys.stdout.buffer.write(some_bytes_object)
This will just output the bytes as-is, without trying to treat them as text-in-some-encoding.
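As a rough illustration of the pass-through approach, the sketch below copies a child's output without ever decoding it. The helper copy_raw_output and its sink parameter are hypothetical names of mine; the demo writes into an io.BytesIO so the result can be inspected, and the child deliberately emits bytes that are not valid UTF-8.

```python
import io
import subprocess
import sys


def copy_raw_output(cmd, sink=None):
    """Copy the child's stdout bytes to sink unchanged; return (exit_code, byte_count)."""
    if sink is None:
        sink = sys.stdout.buffer  # the binary layer underneath sys.stdout
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    total = 0
    for chunk in iter(proc.stdout.readline, b""):
        sink.write(chunk)  # no decode step, so no encoding has to be guessed
        sink.flush()
        total += len(chunk)
    proc.stdout.close()
    return proc.wait(), total


# The child writes six raw bytes, two of which are invalid as UTF-8.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'\\x89PNG\\xff\\n')"]
captured = io.BytesIO()
exit_code, total = copy_raw_output(child, sink=captured)
```

Calling copy_raw_output(child) without a sink sends the same bytes to sys.stdout.buffer, which is the call this answer recommends.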

