Python Socket Receive Large Amount of Data

Question

Asked by user2585107

When I try to receive larger amounts of data, it gets cut off and I have to press Enter to get the rest. At first I was able to increase the amount a little, but it still won't receive all of it. As you can see, I have increased the buffer size passed to conn.recv(), but it still cuts the data off at a certain point, and I have to press Enter at my raw_input prompt to receive the rest. Is there any way I can get all of the data at once? Here's the code.

# Python 2
import socket

port = 7777
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('0.0.0.0', port))
sock.listen(1)
print ("Listening on port: "+str(port))
while 1:
    conn, sock_addr = sock.accept()
    print "accepted connection from", sock_addr
    while 1:
        command = raw_input('shell> ')
        conn.send(command)
        # A single recv() returns at most 8000 bytes, and possibly fewer
        data = conn.recv(8000)
        if not data: break
        print data,
    conn.close()

Answer 1

Accepted answer by Adam Rosenfield

TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving exactly the data sent; the data might arrive piecemeal, split across multiple recv() calls, due to packet fragmentation.

You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.

One simple way of sending a message is to prefix each message with its length. Then, to read a message, you first read the length, then you read that many bytes. Here's how you might do that:

import struct

def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)

def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)

def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced at the network level.
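To see the framing at work without a second machine, here is a minimal sketch that pushes one large message through a local socket pair. It repeats the three functions so that it runs on its own; the `demo` function and the 1,000,000-byte payload are assumptions made for illustration, and Python 3 is assumed.

```python
import socket
import struct
import threading

def send_msg(sock, msg):
    # 4-byte big-endian length prefix, then the payload
    sock.sendall(struct.pack('>I', len(msg)) + msg)

def recvall(sock, n):
    # Keep calling recv() until exactly n bytes have arrived (None on EOF)
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

def recv_msg(sock):
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    return recvall(sock, msglen)

def demo():
    # Two connected stream sockets in one process (illustrative setup)
    a, b = socket.socketpair()
    payload = b'x' * 1000000
    # Send from a thread: sendall() blocks until the reader drains the buffer
    sender = threading.Thread(target=send_msg, args=(a, payload))
    sender.start()
    received = recv_msg(b)
    sender.join()
    a.close()
    b.close()
    return received == payload, len(received)

if __name__ == "__main__":
    print(demo())
```

A payload this large will typically take many recv() calls to arrive, yet recv_msg still hands back one complete message.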

Answer 2

Answer by Jeremy Friesner

You may need to call conn.recv() multiple times to receive all the data. Calling it a single time is not guaranteed to bring in all the data that was sent, because TCP streams don't maintain frame boundaries (i.e. they only work as a stream of raw bytes, not a structured stream of messages).

See this answer for another description of the issue.

Note that this means you need some way of knowing when you have received all of the data. If the sender will always send exactly 8000 bytes, you could count the number of bytes you have received so far and subtract that from 8000 to know how many are left to receive. If the data is variable-sized, there are various other methods that can be used, such as having the sender send a number-of-bytes header before sending the message, or, if ASCII text is being sent, looking for a newline or NUL character.
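The newline-delimiter option can be sketched roughly as follows. `LineReader` and `demo` are hypothetical names introduced here, not part of any library, and the sketch assumes Python 3 and that a newline never appears inside a message.

```python
import socket

class LineReader:
    """Hypothetical helper: buffers recv() output and returns one line at a time."""

    def __init__(self, sock, bufsize=4096):
        self.sock = sock
        self.bufsize = bufsize
        self.buffer = bytearray()

    def readline(self):
        # Loop until the buffer holds a complete line or the peer closes
        while True:
            idx = self.buffer.find(b'\n')
            if idx != -1:
                line = bytes(self.buffer[:idx])
                del self.buffer[:idx + 1]  # drop the line and its delimiter
                return line
            chunk = self.sock.recv(self.bufsize)
            if not chunk:
                return None  # EOF before a complete line arrived
            self.buffer.extend(chunk)

def demo():
    a, b = socket.socketpair()
    # The second message is deliberately split across two sends
    a.sendall(b'first\nsec')
    a.sendall(b'ond\n')
    a.close()
    reader = LineReader(b)
    lines = [reader.readline(), reader.readline(), reader.readline()]
    b.close()
    return lines

if __name__ == "__main__":
    print(demo())
```

The reader keeps any bytes that arrive after a delimiter, so a recv() that happens to return parts of two messages does not lose data.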

Answer 3

Answer by JadedTuna

You can use it as: data = recvall(sock)

def recvall(sock):
    BUFF_SIZE = 4096 # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # either 0 or end of data
            break
    return data

Answer 4

Answer by sjMoquin

Modifying Adam Rosenfield's code:

def send_msg(sock, msg):
    # msg is bytes; the header is the payload length in ASCII digits, then ":"
    # (len() is used because sys.getsizeof() reports object size, not payload length)
    package = str(len(msg)).encode() + b":" + msg
    sock.sendall(package)

def recv_msg(sock):
    header = b""
    while b":" not in header:
        chunk = sock.recv(2)  # Small reads, so we never run past this message
        if not chunk:
            return None  # Connection closed before a full header arrived
        header += chunk

    size_of_package, separator, message = header.partition(b":")
    # Anything after ":" already belongs to the message, so subtract it
    remaining = int(size_of_package) - len(message)
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            return None  # Connection closed mid-message
        message += chunk
        remaining -= len(chunk)
    return message

I would, however, strongly encourage using the original approach.

Answer 5

Answer by yoniLavi

A variation using a generator function (which I consider more Pythonic):

def recvall(sock, buffer_size=4096):
    buf = sock.recv(buffer_size)
    while buf:
        yield buf
        if len(buf) < buffer_size: break
        buf = sock.recv(buffer_size)
# ...
with socket.create_connection((host, port)) as sock:
    sock.sendall(command)
    response = b''.join(recvall(sock))

Answer 6

Answer by Mina Gabriel

The accepted answer is fine, but it will be really slow with big files: string is an immutable class, which means a new object is created every time you use the + sign. Using a list as a stack structure will be more efficient.

This should work better:

fragments = []
while True:
    chunk = s.recv(10000)
    if not chunk:
        # recv() returns an empty result once the peer has closed the connection
        break
    fragments.append(chunk)

print(b"".join(fragments))
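A loop that runs until recv() returns nothing only finishes once the sender closes the connection or shuts down its sending side. The following is a minimal sketch of that pairing, assuming Python 3; `recv_until_eof`, `send_then_half_close`, and `demo` are illustrative names, and the payload size is chosen as an exact multiple of the buffer size to show that the loop does not depend on a short final read.

```python
import socket
import threading

def recv_until_eof(sock, bufsize=4096):
    # Collect chunks in a list and join once at the end
    fragments = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break  # peer finished sending
        fragments.append(chunk)
    return b''.join(fragments)

def send_then_half_close(sock, payload):
    sock.sendall(payload)
    # Signal "no more data" while still allowing this side to read a reply
    sock.shutdown(socket.SHUT_WR)

def demo():
    a, b = socket.socketpair()
    payload = b'y' * (4096 * 3)
    sender = threading.Thread(target=send_then_half_close, args=(a, payload))
    sender.start()
    received = recv_until_eof(b)
    sender.join()
    a.close()
    b.close()
    return received == payload

if __name__ == "__main__":
    print(demo())
```

This only suits protocols with one message per direction per connection; for several messages on one connection, a length prefix or delimiter is still needed.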

Answer 7

Answer by John Albert

You can do it using serialization:

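import json

def recvall(conn):
    data = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            raise ConnectionError("connection closed before a complete JSON document arrived")
        data += chunk  # Accumulate; a single recv() may hold only part of the document
        try:
            return json.loads(data)
        except ValueError:
            # Not yet a complete JSON document; keep reading
            continue

def sendall(conn, data):
    conn.sendall(json.dumps(data).encode())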

NOTE: If you want to share a file using the code above, you need to encode it to base64 before sending and decode it after receiving.

Answer 8

Answer by Jacob Stern

Most of the answers describe some sort of recvall() method. In case your bottleneck when receiving data is building the byte array in a loop, I benchmarked three approaches to allocating the received data in the recvall() method:

Byte string method:

arr = b''
while len(arr) < msg_len:
    arr += sock.recv(max_msg_size)

List method:

fragments = []
while True: 
    chunk = sock.recv(max_msg_size)
    if not chunk: 
        break
    fragments.append(chunk)
arr = b''.join(fragments)

Pre-allocated bytearray method:

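arr = bytearray(msg_len)
pos = 0
while pos < msg_len:
    # Never ask for more than what is left, so the array keeps its size
    chunk = sock.recv(min(max_msg_size, msg_len - pos))
    if not chunk:
        break  # connection closed early
    arr[pos:pos+len(chunk)] = chunk
    # recv() may return fewer bytes than requested, so advance by the actual count
    pos += len(chunk)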
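A related variant of the pre-allocated approach lets the socket write straight into the buffer through socket.recv_into(), which avoids creating an intermediate bytes object per chunk. This is a sketch assuming Python 3 and a known message length; `recv_exact_into` and `demo` are illustrative names, and it was not part of the benchmark.

```python
import socket
import threading

def recv_exact_into(sock, msg_len):
    arr = bytearray(msg_len)
    view = memoryview(arr)  # slicing a memoryview does not copy
    pos = 0
    while pos < msg_len:
        # Fills at most the remaining space and returns the number of bytes written
        n = sock.recv_into(view[pos:])
        if n == 0:
            raise ConnectionError("socket closed after %d of %d bytes" % (pos, msg_len))
        pos += n
    return arr

def demo():
    a, b = socket.socketpair()
    payload = bytes(range(256)) * 1000
    sender = threading.Thread(target=a.sendall, args=(payload,))
    sender.start()
    received = recv_exact_into(b, len(payload))
    sender.join()
    a.close()
    b.close()
    return bytes(received) == payload

if __name__ == "__main__":
    print(demo())
```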

Results: [benchmark figure not preserved]

Answer 9

Answer by vatsug

For anyone else who's looking for an answer in cases where you don't know the length of the data in advance: here's a simple solution that reads 4096 bytes at a time and stops when fewer than 4096 bytes are received. However, it will not work when the total length of the data received is an exact multiple of 4096 bytes; in that case it will call recv() again and hang.

def recvall(sock):
    data = b''
    bufsize = 4096
    while True:
        packet = sock.recv(bufsize)
        data += packet
        if len(packet) < bufsize:
            break
    return data

Answer 10

Answer by tomCLANCC

I think this question has been pretty well answered, but I just wanted to add a method using Python 3.8 and the new assignment expression (walrus operator), since it is stylistically simple.

import socket

host = "127.0.0.1"
port = 31337
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host,port))
s.listen()
con, addr = s.accept()
msg_list = []

while (walrus_msg := con.recv(3)) != b'\r\n':
    msg_list.append(walrus_msg)

print(msg_list)

In this case, up to 3 bytes are received from the socket and immediately assigned to walrus_msg. Once the socket receives a b'\r\n', the loop breaks. Each walrus_msg is added to msg_list, which is printed after the loop breaks. This script is basic but was tested and works with a telnet session.

NOTE: The parentheses around (walrus_msg := con.recv(3)) are needed. Without them, while walrus_msg := con.recv(3) != b'\r\n': binds walrus_msg to True instead of the actual data on the socket.
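The precedence rule can be checked without a socket. In this small sketch (Python 3.8 or later), `data` stands in for the result of a recv() call, and the variable names are made up for illustration.

```python
data = b'abc'  # stand-in for what con.recv(3) might return

# With parentheses: the name is bound to the bytes, which are then compared
if (with_parens := data) != b'\r\n':
    pass

# Without parentheses: the comparison runs first, so the name is bound to its bool result
if without_parens := data != b'\r\n':
    pass

print(with_parens)     # b'abc'
print(without_parens)  # True
```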
