Linux: How to use the SO_KEEPALIVE option properly to detect that the client at the other end is down?

Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/5435098/

Time: 2020-08-05 03:24:28  Source: igfitidea

Tags: c, linux, sockets, keep-alive

Asked by Durin

I was trying to learn how to use the SO_KEEPALIVE option in socket programming in C under Linux.

I created a server socket and used my browser to connect to it. It was successful and I was able to read the GET request, but I got stuck on the usage of SO_KEEPALIVE.

I checked this link [email protected] but I could not find any example which shows how to use it.

As soon as accept() returns the client's connection, I set the SO_KEEPALIVE option to 1 on the client socket. Now I don't know how to check whether the client is down, how to change the time interval between the probes that are sent, and so on.

I mean, how will I get notified that the client is down without reading from or writing to the client socket? I thought I would get some signal when the probes are not answered by the client. How should I write the code after turning the SO_KEEPALIVE option on?

Also, suppose the probes are sent every 3 seconds and the client goes down between two probes: I will not know that the client is down, and I may get SIGPIPE.

Anyway, most importantly, I want to know how to use SO_KEEPALIVE in code.

Thanks a tonne in advance!!!

Accepted answer by bdk

To modify the number of probes or the probe intervals, write values to the /proc filesystem, like this:

 echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
 echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
 echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes

Note that these values are global for all keepalive-enabled sockets on the system. You can also override these settings on a per-socket basis with setsockopt(); see section 4.2 of the document you linked.
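
As a rough illustration of the per-socket override, here is a minimal sketch that uses the Linux-specific TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options described in tcp(7). The helper names enable_keepalive() and keepalive_detect_seconds() are made up for this example, and the detection time it computes is only an approximation for an otherwise idle connection.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive on one TCP socket and override the
 * system-wide timers for that socket only.
 *   idle_s:  seconds of silence before the first probe (TCP_KEEPIDLE)
 *   intvl_s: seconds between unanswered probes        (TCP_KEEPINTVL)
 *   probes:  unanswered probes before giving up       (TCP_KEEPCNT)
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) != 0)
        return -1;
    return 0;
}

/* Hypothetical helper: approximate seconds before an idle, dead peer is
 * noticed (idle period plus all unanswered probe intervals). */
int keepalive_detect_seconds(int idle_s, int intvl_s, int probes)
{
    return idle_s + intvl_s * probes;
}
```

With the /proc values shown above (600, 60, 20), the estimate comes to 600 + 60 * 20 = 1800 seconds, which is one reason to pick much shorter per-socket values, for example enable_keepalive(client_fd, 10, 3, 4) right after accept().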

You can't "check" the status of the socket from userspace with keepalive. Instead, the kernel is simply more aggressive about forcing the remote end to acknowledge packets and about determining whether the connection has gone bad. Once keepalive has determined that the remote end is down, your next read or write on the socket fails with an error (typically ETIMEDOUT), and further writes raise SIGPIPE.
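
Because a write to a dead connection can raise SIGPIPE, whose default action terminates the process, servers usually arrange to get an error return instead. The sketch below does that with the MSG_NOSIGNAL flag of send(); the wrapper name send_nosignal() is an assumption made for this example, not something from the answer.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: send without risking SIGPIPE.
 * Returns the number of bytes sent, or -1 with errno set (for example
 * EPIPE, ECONNRESET or ETIMEDOUT) when the connection is unusable. */
ssize_t send_nosignal(int fd, const void *buf, size_t len)
{
    ssize_t n;

    do {
        /* MSG_NOSIGNAL turns the would-be SIGPIPE into an EPIPE error. */
        n = send(fd, buf, len, MSG_NOSIGNAL);
    } while (n < 0 && errno == EINTR); /* retry if interrupted by a signal */

    return n;
}
```

A caller would treat a -1 return as "the client is gone", close the socket, and carry on serving other clients. Ignoring SIGPIPE process-wide with signal(SIGPIPE, SIG_IGN) is a common alternative when plain write() is used.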

Answer by MarkR

You'll see a dead connection reported in the same way whether or not you enable SO_KEEPALIVE: typically you'll find the socket ready (for example in select() or poll()) and get an error when you read from it.

You can set the keepalive timeout on a per-socket basis under Linux (this may be a Linux-specific feature). I'd recommend this rather than changing the system-wide setting. See the tcp(7) man page for more info.

Finally, if your client is a web browser, it's quite likely that it will close the socket fairly quickly anyway: most browsers only hold keepalive (HTTP 1.1) connections open for a relatively short time (30 seconds, 1 minute, etc.). Of course, if the client machine has disappeared or the network is down (which is what SO_KEEPALIVE is really useful for detecting), then it won't be able to actively close the socket.

Answer by Chuck Kollars

As already discussed, SO_KEEPALIVE makes the kernel more aggressive about continually verifying the connection even when you're not doing anything, but it does not change or enhance the way the information is delivered to you. You'll find out when you actually try to do something (for example "write"), and you'll find out right away, since the kernel is now just reporting the status of a previously set flag rather than having to wait a few seconds (or much longer in some cases) for network activity to fail. The exact same code logic you had for handling the "other side went away unexpectedly" condition will still be used; what changes is the timing (not the method).

Virtually every "practical" sockets program provides some form of non-blocking access to the sockets during the data phase (maybe with select()/poll(), maybe with fcntl()/O_NONBLOCK and EINPROGRESS/EWOULDBLOCK, or, if your kernel supports it, with MSG_DONTWAIT). Assuming this is already done for other reasons, it's trivial (sometimes requiring no code at all) to also find out right away about a dropped connection. But if the data phase does not already provide non-blocking access to the sockets in some way, you won't find out about the dropped connection until the next time you try to do something.
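
To make the MSG_DONTWAIT variant concrete, here is a small sketch of a non-blocking receive that separates "nothing to read yet" from "the connection is gone". The function name try_recv() and its return convention are assumptions chosen for this example.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: non-blocking check of a connected stream socket.
 * cap must be greater than 0.
 * Returns  1 if data was available (*n bytes copied into buf),
 *          0 if the connection looks fine but there is nothing to read yet,
 *         -1 if the connection is gone: *n == 0 means an orderly close,
 *            *n < 0 means an error whose reason is in errno. */
int try_recv(int fd, void *buf, size_t cap, ssize_t *n)
{
    *n = recv(fd, buf, cap, MSG_DONTWAIT);

    if (*n > 0)
        return 1;
    if (*n == 0)
        return -1;
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return 0;
    return -1;
}
```

Called periodically from the program's main loop, a helper like this reports a connection that keepalive has already marked as failed without ever blocking the rest of the program.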

(A TCP socket connection without some sort of non-blocking behaviour during the data phase is notoriously fragile: if the wrong packet encounters a network problem, it's very easy for the program to "hang" indefinitely, and there's not a whole lot you can do about it.)

Answer by Alt Eisen

Short answer: add

int flags = 1;
/* sfd is the connected client socket; exit with a failure status on error */
if (setsockopt(sfd, SOL_SOCKET, SO_KEEPALIVE, &flags, sizeof(flags)) != 0) {
    perror("ERROR: setsockopt(), SO_KEEPALIVE");
    exit(EXIT_FAILURE);
}

on the server side, and a blocked read() will return with an error once the keepalive probes determine that the client is down.

A full explanation can be found here.
