Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me). Original StackOverflow address: http://stackoverflow.com/questions/3433520/

What can I do to avoid TCP Zero Window/ TCP Window Full on the receiver side?

Tags: c++, tcp, cross-platform, network-programming

Asked by rkellerm

I have a small application which sends files over the network to an agent located on a Windows OS.

When this application runs on Windows, everything works fine: the communication is OK and all the files are copied successfully.

But when this application runs on Linux (RedHat 5.3; the receiver is still Windows), the Wireshark network trace shows TCP Zero Window and TCP Window Full messages every 1-2 seconds. The agent then closes the connection after a few minutes.

The Windows and Linux code is almost the same, and pretty simple. The only non-trivial operation is a setsockopt call with SO_SNDBUF and a value of 0xFFFF. Removing this code didn't help.

Can someone please help me with this issue?

EDIT: adding the sending code. It looks like it handles partial writes properly:

int totalSent=0;
while(totalSent != dataLen)
{
    int bytesSent 
        = ::send(_socket,(char *)(data+totalSent), dataLen-totalSent, 0);

    if (bytesSent ==0) {
        return totalSent;
    }
    else if(bytesSent == SOCKET_ERROR){
#ifdef __WIN32
        int errcode = WSAGetLastError();
        if( errcode==WSAEWOULDBLOCK ){
#else
            if ((errno == EWOULDBLOCK) || (errno == EAGAIN)) {
#endif
            }
            else{
                if( !totalSent ) {
                    totalSent = SOCKET_ERROR;
                }
                break;
            }
        }
        else{
            totalSent+=bytesSent;
        }
    }
}

Thanks in advance.

Accepted answer by rkellerm

I tried disabling Nagle's algorithm (with TCP_NODELAY), and somehow it helped. The transfer rate is much higher, and the TCP window no longer fills up or resets. The strange thing is that changing the window size had no impact.

Thank you.

Answer by Robert S. Barnes

Without seeing your code, I'll have to guess.

You get a Zero Window in TCP when there is no room left in the receiver's recv buffer.

There are a number of ways this can occur. One common cause is sending over a LAN or other relatively fast network connection when one computer is significantly faster than the other. As an extreme example, say you've got a 3 GHz computer sending as fast as possible over Gigabit Ethernet to another machine running a 1 GHz CPU. Since the sender can send much faster than the receiver is able to read, the receiver's recv buffer fills up, causing the TCP stack to advertise a Zero Window to the sender.

Now this can cause problems on both the sending and receiving sides if they're not both ready to deal with it. On the sending side, the send buffer can fill up, and calls to send will either block or, if you're using non-blocking I/O, fail. On the receiving side, you could be spending so much time on I/O that the application has no chance to process any of its data, giving the appearance of being locked up.

Edit

From some of your answers and code, it sounds like your app is single-threaded and you're trying to do non-blocking sends for some reason. I assume you're setting the socket to non-blocking in some other part of the code.

Generally, I would say that this is not a good idea. Ideally, if you're worried about your app hanging on a send(2), you should set a long timeout on the socket using setsockopt and use a separate thread for the actual sending.

See socket(7):

SO_RCVTIMEO and SO_SNDTIMEO Specify the receiving or sending timeouts until reporting an error. The parameter is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout.

Your main thread can push each file descriptor into a queue, using, say, a boost mutex to guard queue access, and then start 1 to N threads to do the actual sending using blocking I/O with send timeouts.

Your send function should look something like this (assuming you're setting a timeout):

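#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

// Blocking send. A short return means a timeout or error; the caller reads errno.
int doSend(int s, const void *buf, size_t dataLen) {
    size_t totalSent = 0;

    while (totalSent != dataLen)
    {
        ssize_t bytesSent
            = send(s, ((const char *)buf) + totalSent, dataLen - totalSent, MSG_NOSIGNAL);

        if (bytesSent < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry the send
            break;          // timeout or hard error: report what was sent so far
        }

        totalSent += bytesSent;
    }
    return (int)totalSent;
}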

The MSG_NOSIGNAL flag ensures that your application isn't killed by a SIGPIPE when writing to a socket that's been closed or reset by the peer. Sometimes I/O operations are interrupted by signals, and checking for EINTR allows you to restart the send.

Generally, you should call doSend in a loop with chunks of data of TCP_MAXSEG size.

On the receive side you can write a similar blocking recv function using a timeout in a separate thread.

Answer by João Pinto

A common mistake when developing with TCP sockets is making incorrect assumptions about read()/write() behavior.

When you perform a read/write operation, you must check the return value: the call may not have read or written the requested number of bytes. You usually need a loop to keep track and make sure the entire data was transferred.

Answer by janm

The most likely problem is that you have a bug in your code where you don't handle partial reads or partial writes correctly. TCP between Linux and Windows is known to work.
