Original question: http://stackoverflow.com/questions/6404008/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors (not me): StackOverFlow
How to detect a TCP socket disconnection (with C Berkeley socket)
Asked by Ayoub M.
I am using a loop to read messages from a C Berkeley socket, but I cannot detect when the socket is disconnected so that I can accept a new connection. Please help.
while(true) {
    bzero(buffer,256);
    n = read(newsockfd,buffer,255);
    printf("%s\n",buffer);
}
Answered by Bruno
The only way you can detect that a socket is connected is by writing to it.
Getting an error on read()/recv() will indicate that the connection is broken, but not getting an error when reading doesn't mean that the connection is up.
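To illustrate the point about writes, here is a minimal sketch of a send wrapper that reports a broken connection. It assumes Linux or another platform that supports the MSG_NOSIGNAL flag; the helper name send_checked and its return convention are hypothetical, not part of the original answer.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: try to send len bytes on fd.
 * Returns 1 if all bytes were handed to the kernel,
 *         0 if the connection is known to be broken,
 *        -1 on any other error (errno is left set by send()). */
int send_checked(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        /* MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE. */
        ssize_t n = send(fd, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry */
            if (errno == EPIPE || errno == ECONNRESET || errno == ETIMEDOUT)
                return 0;               /* peer is gone */
            return -1;                  /* some other failure */
        }
        p += n;
        len -= (size_t)n;
    }
    return 1;
}
```

A return value of 1 only means the data was queued locally. If the peer has vanished without closing, the failure may surface only on a later call, once the kernel gives up retransmitting.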
You may be interested in reading this: http://lkml.indiana.edu/hypermail/linux/kernel/0106.1/1154.html
In addition, using TCP Keep Alive may help distinguish between inactive and broken connections (by sending something at regular intervals even if there's no data to be sent by the application).
(EDIT: Removed incorrect sentence as pointed out by @Damon, thanks.)
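As a rough sketch of how TCP keepalive might be enabled on one socket, assuming Linux (TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux option names; other platforms use different ones). The function name enable_keepalive and the example values are illustrative assumptions.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on keepalive probes for a single TCP socket.
 * idle_s     - seconds of silence before the first probe is sent
 * interval_s - seconds between unanswered probes
 * count      - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 10, 5, 3), for example, an idle connection to a dead peer should be dropped after roughly 10 + 3 * 5 = 25 seconds, at which point a blocked read()/recv() returns an error.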
Answered by user207421
Your problem is that you are completely ignoring the result returned by read(). Your code after read() should look at least like this:
if (n == 0) // peer disconnected
    break;
else if (n == -1) // error
{
    perror("read");
    break;
}
else // received 'n' bytes
{
    printf("%.*s", n, buffer);
}
And accepting a new connection should be done in a separate thread, not dependent on end of stream on this connection.
The bzero() call is pointless; it is only a workaround for the earlier error of ignoring the byte count returned by read().
Answered by cloudrain21
That's because you didn't use a keepalive timeout. On the receiving side, the keepalive socket option is the best solution for detecting a dead connection.
However, if your application keeps writing to the socket, there is more to consider. Even if you have already set the keepalive option on your application socket, you can't detect the dead connection in time while your app keeps writing to it. That is because of TCP retransmission by the kernel TCP stack. tcp_retries1 and tcp_retries2 are kernel parameters that configure the TCP retransmission timeout. It's hard to predict the precise retransmission timeout because it is calculated from the measured RTT. You can see this computation in rfc793 (3.7. Data Communication).
https://www.rfc-editor.org/rfc/rfc793.txt
Each platform has kernel configurations for TCP retransmission.
Linux : tcp_retries1, tcp_retries2 : (exist in /proc/sys/net/ipv4)
http://linux.die.net/man/7/tcp
HPUX : tcp_ip_notify_interval, tcp_ip_abort_interval
http://www.hpuxtips.es/?q=node/53
AIX : rto_low, rto_high, rto_length, rto_limit
http://www-903.ibm.com/kr/event/download/200804_324_swma/socket.pdf
You should set a lower value for tcp_retries2 (default 15) if you want to detect a dead connection earlier, but as I already said, the resulting time is not precise. In addition, you currently can't set those values for a single socket; they are global kernel parameters. There was an attempt to add a TCP retransmission socket option for a single socket (http://patchwork.ozlabs.org/patch/55236/), but I don't think it was merged into the kernel mainline. I can't find definitions for those options in the system header files.
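To check the current global setting from a program, one option is to read the file under /proc/sys/net/ipv4 mentioned above. The sketch below assumes Linux; the helper name read_int_file is hypothetical, and it takes the path as a parameter so it works for tcp_retries1 as well.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a sysctl-style file such as
 * /proc/sys/net/ipv4/tcp_retries2.
 * Returns 0 and stores the number in *value on success, -1 on failure. */
int read_int_file(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;                      /* file missing or not readable */

    int ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A call such as read_int_file("/proc/sys/net/ipv4/tcp_retries2", &retries) would then report the value currently in effect. Changing it requires root privileges and affects every socket on the machine.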
For reference, you can monitor the keepalive timer of your socket through 'netstat --timers', as shown below. https://stackoverflow.com/questions/34914278
netstat -c --timer | grep "192.0.0.1:43245 192.0.68.1:49742"
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (1.92/0/0)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (0.71/0/0)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (9.46/0/1)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (8.30/0/1)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (7.14/0/1)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (5.98/0/1)
tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (4.82/0/1)
In addition, when a keepalive timeout occurs, you may get different return events depending on the platform, so you must not decide that a connection is dead from the return events alone. For example, HP-UX returns a POLLERR event and AIX returns just a POLLIN event when a keepalive timeout occurs. The recv() call made at that point will fail with ETIMEDOUT.
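One portable way to handle this is to ignore which poll() bits were set and let recv() decide what happened. The sketch below shows that approach; the enum conn_state and the function wait_and_classify are hypothetical names introduced only for this example.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

enum conn_state { CONN_IDLE, CONN_DATA, CONN_CLOSED, CONN_ERROR };

/* Hypothetical helper: wait up to timeout_ms for activity on fd, then
 * classify the connection using the result of recv() rather than the
 * poll() event bits. On CONN_DATA, *got holds the number of bytes read. */
enum conn_state wait_and_classify(int fd, char *buf, size_t cap,
                                  size_t *got, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    *got = 0;

    int rc = poll(&p, 1, timeout_ms);
    if (rc == 0)
        return CONN_IDLE;               /* nothing happened in time */
    if (rc < 0)
        return errno == EINTR ? CONN_IDLE : CONN_ERROR;

    /* POLLIN, POLLERR or POLLHUP: do not interpret them, ask recv(). */
    ssize_t n = recv(fd, buf, cap, 0);
    if (n > 0) {
        *got = (size_t)n;
        return CONN_DATA;
    }
    if (n == 0)
        return CONN_CLOSED;             /* orderly shutdown by the peer */
    if (errno == EINTR || errno == EAGAIN)
        return CONN_IDLE;               /* spurious wakeup, try again later */
    return CONN_ERROR;                  /* e.g. ETIMEDOUT after keepalive expiry */
}
```

On CONN_ERROR the caller can inspect errno, which should be ETIMEDOUT when the keepalive probes went unanswered.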
In recent Linux kernel versions (since 2.6.37), you can use the TCP_USER_TIMEOUT option, which works well. This option can be set on a single socket.
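A minimal sketch of setting this option on one socket, assuming Linux with headers that define TCP_USER_TIMEOUT. The wrapper name set_user_timeout is hypothetical; the option itself takes an unsigned int number of milliseconds.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: limit how long transmitted data may stay
 * unacknowledged on this socket before the kernel drops the connection.
 * timeout_ms is in milliseconds; 0 restores the system default.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int set_user_timeout(int fd, unsigned int timeout_ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof timeout_ms);
}
```

For example, set_user_timeout(fd, 30000) asks the kernel to give up after about 30 seconds of unacknowledged data, after which pending reads and writes on that socket fail with ETIMEDOUT.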

