Python 3 TypeError: must be str, not bytes with sys.stdout.write()

Question

Asked by Michael IV

I was looking for a way to run an external process from a Python script and print its stdout messages during execution.
The code below works, but prints no stdout output during runtime. When it exits I get the following error:

sys.stdout.write(nextline) TypeError: must be str, not bytes

p = subprocess.Popen(["demo.exe"],stdout = subprocess.PIPE, stderr= subprocess.PIPE)    
# Poll process for new output until finished
while True:
    nextline = p.stdout.readline()
    if nextline == '' and p.poll() != None:
        break
    sys.stdout.write(nextline)
    sys.stdout.flush()

output = p.communicate()[0]
exitCode = p.returncode

I am using python 3.3.2

Answer 1

Accepted answer by Martin Tournoij

Python 3 handles strings a bit differently. Originally there was just one type for strings: str. When Unicode gained traction in the '90s, the new unicode type was added to handle Unicode without breaking pre-existing code¹. This is effectively the same as str but with multibyte support.

In Python 3 there are two different types:

The bytes type. This is just a sequence of bytes; Python doesn't know anything about how to interpret it as characters.
The str type. This is also stored as a sequence of bytes internally, but Python knows how to interpret those bytes as characters.
The separate unicode type was dropped. str now supports Unicode.
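
The difference between the two types can be seen in a short session. This is a minimal sketch; the variable names are made up for illustration and are not part of the original answer.

```python
text = "H\u20acllo"            # str: five characters ("\u20ac" is the euro sign)
raw = text.encode("utf-8")     # bytes: the same text as seven UTF-8 bytes

text_len = len(text)    # len() on str counts characters
raw_len = len(raw)      # len() on bytes counts bytes
first_char = text[0]    # indexing str gives a one-character str
first_byte = raw[0]     # indexing bytes gives an int (72 is ASCII "H")

# Mixing the two types raises TypeError instead of converting implicitly.
try:
    text + raw
    mixing_failed = False
except TypeError:
    mixing_failed = True
```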

In Python 2 implicitly assuming an encoding could cause a lot of problems; you could end up using the wrong encoding, or the data may not have an encoding at all (e.g. it's a PNG image).
Explicitly telling Python which encoding to use (or explicitly telling it to guess) is often a lot better and much more in line with the "Python philosophy" of "explicit is better than implicit".

This change is incompatible with Python 2 as many return values have changed, leading to subtle problems like this one; it's probably the main reason why Python 3 adoption has been so slow. Since Python doesn't have static typing², it's impossible to change this automatically with a script (such as the bundled 2to3).

You can convert str to bytes with bytes('H€llo', 'utf-8'); this should produce b'H\xe2\x82\xacllo'. Note how one character was converted to three bytes.
You can convert bytes to str with b'H\xe2\x82\xacllo'.decode('utf-8').

Of course, UTF-8 may not be the correct character set in your case, so be sure to use the correct one.
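
The two conversions described above, and what can happen when the wrong encoding is chosen, might look like this. It is only a sketch: latin-1 stands in for "some wrong encoding", and the variable names are illustrative.

```python
text = "H\u20acllo"  # "\u20ac" is the euro sign

encoded = text.encode("utf-8")      # same result as bytes(text, "utf-8")
decoded = encoded.decode("utf-8")   # back to the original str

# Decoding with the wrong codec does not always fail; it can silently
# produce different characters instead.
mojibake = encoded.decode("latin-1")

# When the bytes are invalid for the codec, decode() raises
# UnicodeDecodeError unless an error handler such as "replace" is given.
lenient = b"\xff".decode("utf-8", errors="replace")
```

Passing errors="replace" trades an exception for a U+FFFD replacement character, which may or may not be acceptable depending on what the output is used for.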

In your specific piece of code, nextline is of type bytes, not str; reading stdout and stdin from subprocess changed in Python 3 from str to bytes. This is because Python can't be sure which encoding this uses. It probably uses the same as sys.stdin.encoding (the encoding of your system), but it can't be sure.

You need to replace:

sys.stdout.write(nextline)

with:

sys.stdout.write(nextline.decode('utf-8'))

or maybe:

sys.stdout.write(nextline.decode(sys.stdout.encoding))

You will also need to modify if nextline == '' to if nextline == b'', since:

>>> '' == b''
False
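
Putting both changes together, the polling loop from the question might end up looking roughly like the sketch below. Several things here are assumptions rather than part of the original code: demo.exe is replaced by a child Python process so the example runs anywhere, stderr is merged into stdout so an unread stderr pipe cannot fill up, UTF-8 is used as a fallback when sys.stdout reports no encoding, and errors="replace" is chosen so a bad byte does not abort the loop.

```python
import subprocess
import sys

# Stand-in for demo.exe: a child Python process that prints two lines.
command = [sys.executable, "-c", "print('first'); print('second')"]

p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# Assumption: fall back to UTF-8 if stdout reports no encoding.
encoding = sys.stdout.encoding or "utf-8"

lines = []
# Poll process for new output until finished
while True:
    nextline = p.stdout.readline()                 # bytes, not str
    if nextline == b"" and p.poll() is not None:   # compare against b"", not ""
        break
    decoded = nextline.decode(encoding, errors="replace")
    lines.append(decoded)
    sys.stdout.write(decoded)
    sys.stdout.flush()

exit_code = p.wait()
```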

Also see the Python 3 ChangeLog, PEP 358, and PEP 3112.

¹There are some neat tricks you can do with ASCII that you can't do with multibyte character sets; the most famous example is the "xor with space to switch case" (e.g. chr(ord('a') ^ ord(' ')) == 'A') and "set 6th bit to make a control character" (e.g. ord('\t') + ord('@') == ord('I')). ASCII was designed in a time when manipulating individual bits was an operation with a non-negligible performance impact.

²Yes, you can use function annotations, but it's a comparatively new feature and little used.

Answer 2

Answered by Wooble

While the accepted answer will work fine if the bytes you have from your subprocess are encoded using sys.stdout.encoding (or a compatible encoding, like reading from a tool that outputs ASCII and your stdout uses UTF-8), the correct way to write arbitrary bytes to stdout is:

sys.stdout.buffer.write(some_bytes_object)

This will just output the bytes as-is, without trying to treat them as text-in-some-encoding.
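
As a rough illustration of the same idea, the sketch below wraps the call in a small helper. write_raw is a hypothetical name, not a standard library function, and the in-memory stream is only a stand-in for stdout so the result can be inspected. It assumes the text stream exposes a .buffer attribute, which the real sys.stdout normally does but a replacement such as io.StringIO does not.

```python
import io
import sys


def write_raw(data, stream=None):
    """Write bytes to the binary layer under a text stream (hypothetical helper)."""
    stream = sys.stdout if stream is None else stream
    stream.flush()                     # keep ordering with earlier text writes
    count = stream.buffer.write(data)  # bytes pass through without decoding
    stream.buffer.flush()
    return count


# Demonstration on an in-memory stand-in for stdout.
sink = io.BytesIO()
fake_stdout = io.TextIOWrapper(sink, encoding="ascii")
count = write_raw(b"H\xe2\x82\xacllo\n", fake_stdout)
```

Note that the bytes reach the sink unchanged even though the text layer is declared as ASCII. In the question's loop, the equivalent would be calling write_raw(nextline) in place of sys.stdout.write(nextline).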
