Python - converting sock.recv to string

Note: this page is a bilingual (Chinese/English) translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/13979764/

Date: 2020-08-18 10:05:18  Source: igfitidea

Tags: python, struct, python-3.x, recv

Asked by coffeemonitor

I'm digging around with python and networking.

while True:
   data = sock.recv(10240)

This is definitely listening. But it seems to need to be converted to a text string.

I've seen some people using struct.unpack(), but I'm not sure exactly how it works. What's the way to convert?

Accepted answer by abarnert

What you get back from recv is a bytes string:

Receive data from the socket. The return value is a bytes object representing the data received.

In Python 3.x, to convert a bytes string into a Unicode text str string, you have to know what character set the string is encoded with, so you can call decode. For example, if it's UTF-8:

stringdata = data.decode('utf-8')

(In Python 2.x, bytes is the same thing as str, so you've already got a string. But if you want a Unicode text unicode string, the decode call is the same as in 3.x.)
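
As a rough illustration of the decode step in Python 3, the sketch below uses a hard-coded bytes value as a stand-in for what sock.recv() might return (an assumption; no real socket is involved). It also shows two things the answer does not spell out: what happens when the wrong codec is used, and how the standard codecs module can decode chunks that split a multi-byte character.

```python
import codecs

# Stand-in for bytes returned by sock.recv(); the value is made up for this sketch.
data = "héllo".encode("utf-8")

# Plain decode when the whole message is available and the encoding is known.
text = data.decode("utf-8")

# Decoding with the wrong codec raises UnicodeDecodeError; errors="replace"
# substitutes U+FFFD for each undecodable byte instead of raising.
try:
    data.decode("ascii")
except UnicodeDecodeError:
    text_lossy = data.decode("ascii", errors="replace")

# If a chunk boundary falls inside a multi-byte character, an incremental
# decoder holds the partial bytes until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-8")()
chunks = [data[:2], data[2:]]  # the split lands in the middle of "é"
rebuilt = "".join(decoder.decode(chunk) for chunk in chunks)
rebuilt += decoder.decode(b"", final=True)  # flush; fails if bytes are left dangling
```

The incremental decoder only matters when text is decoded chunk by chunk; if complete messages are extracted from a buffer first, a plain decode on each message is enough.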

The reason people often use struct is that the data isn't just 8-bit or Unicode text, but some other format. For example, you might send each message as a "netstring": a length (as a string of ASCII digits) followed by a : separator, then length bytes of UTF-8, then a , (for example, b"3:Abc,"). (There are variants on the format, but this is the Bernstein standard netstring.)
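
To make both ideas concrete, here is a minimal sketch. The first part unpacks a made-up binary header with struct (the "!HI" layout, a 2-byte message type plus a 4-byte size, is purely an assumption for illustration). The second part encodes and decodes the netstring format described above; encode_netstring and decode_netstring are hypothetical helper names, not part of any library.

```python
import struct

# Hypothetical binary header: 2-byte message type + 4-byte payload size,
# in network byte order. This layout is invented for the example.
header = struct.pack("!HI", 7, 1024)
msg_type, size = struct.unpack("!HI", header)


def encode_netstring(text: str) -> bytes:
    """Encode text as <length>:<utf-8 bytes>, where length counts bytes."""
    payload = text.encode("utf-8")
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf: bytes):
    """Return (text, remaining_bytes), or (None, buf) if buf is incomplete."""
    sep = buf.find(b":")
    if sep == -1:
        return None, buf  # length prefix not fully received yet
    length = int(buf[:sep].decode("ascii"))
    end = sep + 1 + length
    if len(buf) < end + 1:
        return None, buf  # payload or trailing comma still missing
    if buf[end:end + 1] != b",":
        raise ValueError("malformed netstring: missing trailing comma")
    return buf[sep + 1:end].decode("utf-8"), buf[end + 1:]


encoded = encode_netstring("Abc")
decoded, rest = decode_netstring(b"3:Abc,5:he")
incomplete = decode_netstring(b"3:Ab")
```

Here encoded is b"3:Abc,", the second call yields "Abc" with b"5:he" left over for the next message, and the last call reports that more data is needed.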

The reason people use netstrings, or other similar techniques, is that you need some way to delimit messages when you're using TCP. Each recv could give you half of what the other side passed to send, or it could give you 3 sends and part of the 4th. So, you have to accumulate a buffer of recv data, and then pull the messages out of it. And you need some way to tell when one message ends and the next begins. If you're just sending plain text messages without any newlines, you can just use newlines as a delimiter. Otherwise, you'll have to come up with something else: maybe netstrings, or using \0 as a delimiter, or using newlines as a delimiter but escaping actual newlines within the data, or using some self-delimited structured format like JSON.
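
The buffering idea can be sketched for the simplest case, newline-delimited UTF-8 text. The generator name iter_lines and its parameters are hypothetical, and socket.socketpair() is used only so the example runs without a real network peer; the tiny bufsize is chosen to force messages to arrive in pieces.

```python
import socket


def iter_lines(sock, bufsize=4096, encoding="utf-8"):
    """Yield complete newline-delimited messages read from sock as str."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the peer closed the connection
            break
        buffer += chunk
        # Pull out every complete message currently in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode(encoding)
    if buffer:  # trailing data that was not newline-terminated
        yield buffer.decode(encoding)


# Demo with a connected pair of sockets standing in for client and server.
sender, receiver = socket.socketpair()
sender.sendall(b"first\nsec")  # the second message is split across two sends
sender.sendall(b"ond\nthird")
sender.close()

messages = list(iter_lines(receiver, bufsize=4))
receiver.close()
```

Decoding happens only after a complete line has been extracted, so a multi-byte character split across recv calls is never decoded in halves.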

Answer by Joshua D. Boyd

In Python 2.7.x and before, data is already a string. In Python 3.x, data is a bytes object. To convert bytes to a string, use the decode() method. decode() takes an encoding argument such as 'utf-8' (which is also the default in Python 3).
