Detect socket hangup without sending or receiving? (Linux)
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/5686490/
Asked by Matt Joiner
I'm writing a TCP server that can take 15 seconds or more to begin generating the body of a response to certain requests. Some clients like to close the connection at their end if the response takes more than a few seconds to complete.
Since generating the response is very CPU-intensive, I'd prefer to halt the task the instant the client closes the connection. At present, I don't find this out until I send the first payload and receive various hang-up errors.
How can I detect that the peer has closed the connection without sending or receiving any data? That means for recv that all data remains in the kernel, or for send that no data is actually transmitted.
Accepted answer by asc99c
I've had a recurring problem communicating with equipment that had separate TCP links for send and receive. The basic problem is that the TCP stack doesn't generally tell you a socket is closed when you're just trying to read - you have to try and write to get told the other end of the link was dropped. Partly, that is just how TCP was designed (reading is passive).
I'm guessing Blair's answer works in the cases where the socket has been shut down nicely at the other end (i.e. they have sent the proper disconnection messages), but not in the case where the other end has impolitely just stopped listening.
Is there a fairly fixed-format header at the start of your message (e.g. an XML doctype) that you could send before the whole response is ready? Could you also get away with sending some extra spaces at certain points in the message, just some null data that you can output to be sure the socket is still open?
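The padding idea above could look roughly like the following in Python 3. This is a minimal sketch, not the answerer's code: the function names and the single-space padding are assumptions, and it only makes sense if the client tolerates extra whitespace ahead of the real response. Note that on a real TCP connection the first write after the peer has gone away often succeeds, and the error only appears on a later write.

```python
import socket


def peer_accepts_padding(sock, padding=b" "):
    """Send harmless padding; return False if the kernel reports the peer gone."""
    try:
        sock.sendall(padding)
        return True
    except (BrokenPipeError, ConnectionResetError):
        return False


def generate_response(sock, work_steps):
    """Run each step of a long job, sending padding before every step.

    Returns the list of step results, or None if the peer went away midway.
    `work_steps` is a hypothetical list of zero-argument callables.
    """
    results = []
    for step in work_steps:
        if not peer_accepts_padding(sock):
            return None  # stop burning CPU for a client that has left
        results.append(step())
    return results
```

Splitting the expensive work into steps is what makes the check useful: the job can only be abandoned at the points where the padding is written.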
Answer by Blair
The select module contains what you'll need. If you only need Linux support and have a sufficiently recent kernel, select.epoll() should give you the information you need. Most Unix systems will support select.poll().
If you need cross-platform support, the standard way is to use select.select() to check if the socket is marked as having data available to read. If it is, but recv() returns zero bytes, the other end has hung up.
I've always found Beej's Guide to Network Programming good (note it is written for C, but is generally applicable to standard socket operations), while the Socket Programming How-To has a decent Python overview.
Edit: The following is an example of how a simple server could be written to queue incoming commands but quit processing as soon as it finds the connection has been closed at the remote end.
import select
import socket
import time

# Create the server.
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind((socket.gethostname(), 7557))
serversocket.listen(1)

# Wait for an incoming connection.
clientsocket, address = serversocket.accept()
print('Connection from', address[0])

# Control variables.
queue = []
cancelled = False

while True:
    # If nothing queued, wait for incoming request.
    if not queue:
        queue.append(clientsocket.recv(1024))

    # Receive data of length zero ==> connection closed.
    if len(queue[0]) == 0:
        break

    # Get the next request and remove the trailing line ending.
    request = queue.pop(0).decode(errors='replace').strip()
    print('Starting request', request)

    # Main processing loop.
    for i in range(15):
        # Do some of the processing.
        time.sleep(1.0)

        # See if the socket is marked as having data ready.
        r, w, e = select.select((clientsocket,), (), (), 0)
        if r:
            data = clientsocket.recv(1024)

            # Length of zero ==> connection closed.
            if len(data) == 0:
                cancelled = True
                break

            # Add this request to the queue.
            queue.append(data)
            print('Queueing request', data.decode(errors='replace').strip())

    # Request was cancelled.
    if cancelled:
        print('Request cancelled.')
        break

    # Done with this request.
    print('Request finished.')

# If we got here, the connection was closed.
print('Connection closed.')
serversocket.close()
To use it, run the script and, in another terminal, telnet to localhost, port 7557. Here is the output from an example run in which I queued three requests but closed the connection while the third was being processed:
Connection from 127.0.0.1
Starting request 1
Queueing request 2
Queueing request 3
Request finished.
Starting request 2
Request finished.
Starting request 3
Request cancelled.
Connection closed.
epoll alternative
Another edit: I've worked up another example using select.epoll to monitor events. I don't think it offers much over the original example, as I cannot see a way to receive an event when the remote end hangs up. You still have to monitor the data-received event and check for zero-length messages (again, I'd love to be proved wrong on this statement).
import select
import socket
import time

port = 7557

# Create the server.
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind((socket.gethostname(), port))
serversocket.listen(1)
serverfd = serversocket.fileno()
print("Listening on", socket.gethostname(), "port", port)

# Make the socket non-blocking.
serversocket.setblocking(False)

# Initialise the list of clients.
clients = {}

# Create an epoll object and register our interest in read events on the server
# socket.
ep = select.epoll()
ep.register(serverfd, select.EPOLLIN)

while True:
    # Check for events.
    events = ep.poll(0)
    for fd, event in events:
        # New connection to server.
        if fd == serverfd and event & select.EPOLLIN:
            # Accept the connection.
            connection, address = serversocket.accept()
            connection.setblocking(False)

            # We want input notifications.
            ep.register(connection.fileno(), select.EPOLLIN)

            # Store some information about this client.
            clients[connection.fileno()] = {
                'delay': 0.0,
                'input': b"",
                'response': b"",
                'connection': connection,
                'address': address,
            }

            # Done.
            print("Accepted connection from", address)

        # A socket was closed on our end.
        elif event & select.EPOLLHUP:
            print("Closed connection to", clients[fd]['address'])
            ep.unregister(fd)
            del clients[fd]

        # Error on a connection.
        elif event & select.EPOLLERR:
            print("Error on connection to", clients[fd]['address'])
            ep.modify(fd, 0)
            clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

        # Incoming data.
        elif event & select.EPOLLIN:
            print("Incoming data from", clients[fd]['address'])
            data = clients[fd]['connection'].recv(1024)

            # Zero length = remote closure.
            if not data:
                print("Remote close on", clients[fd]['address'])
                ep.modify(fd, 0)
                clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

            # Store the input.
            else:
                print(data)
                clients[fd]['input'] += data

        # Run when the client is ready to accept some output. The processing
        # loop registers for this event when the response is complete.
        elif event & select.EPOLLOUT:
            print("Sending output to", clients[fd]['address'])

            # Write as much as we can.
            written = clients[fd]['connection'].send(clients[fd]['response'])

            # Delete what we have already written from the complete response.
            clients[fd]['response'] = clients[fd]['response'][written:]

            # When all of the response is written, shut the connection.
            if not clients[fd]['response']:
                ep.modify(fd, 0)
                clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

    # Processing loop.
    for client in clients.keys():
        clients[client]['delay'] += 0.1

        # When the 'processing' has finished.
        if clients[client]['delay'] >= 15.0:
            # Reverse the input to form the response.
            clients[client]['response'] = clients[client]['input'][::-1]

            # Register for the ready-to-send event. The network loop uses this
            # as the signal to send the response.
            ep.modify(client, select.EPOLLOUT)

    # Processing delay.
    time.sleep(0.1)
Note: This only detects proper shutdowns. If the remote end just stops listening without sending the proper messages, you won't know until you try to write and get an error. Checking for that is left as an exercise for the reader. Also, you probably want to perform some error checking on the overall loop so the server itself shuts down gracefully if something breaks inside it.
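On the open question of whether epoll can report a remote hang-up directly: Linux defines an EPOLLRDHUP event for exactly this (peer closed, or shut down its writing half), and Python exposes it as select.EPOLLRDHUP. The sketch below is not part of the original answer. It assumes a Linux kernel and Python build that provide the flag, and the helper name wait_for_peer_hangup is made up for illustration. Like everything else in this answer, it only sees orderly shutdowns and resets, not a peer that silently disappears.

```python
import select
import socket


def wait_for_peer_hangup(sock, timeout):
    """Return True if the peer closed (or half-closed) within `timeout` seconds.

    Nothing is read from or written to the socket; only epoll is consulted.
    """
    ep = select.epoll()
    try:
        # Only EPOLLRDHUP is requested, so pending data alone does not wake us.
        # EPOLLHUP and EPOLLERR are always reported by the kernel.
        ep.register(sock.fileno(), select.EPOLLRDHUP)
        for _fd, event in ep.poll(timeout):
            if event & (select.EPOLLRDHUP | select.EPOLLHUP | select.EPOLLERR):
                return True
        return False
    finally:
        ep.close()
```

Calling wait_for_peer_hangup(clientsocket, 0) between chunks of processing gives a non-blocking check. In the event loop above, the same flag could instead be OR-ed into the mask passed to ep.register().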
Answer by shodanex
You can select with a timeout of zero, and read with the MSG_PEEK flag.
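A minimal Python 3 sketch of that suggestion, under the assumption that a one-byte peek is enough (the function name peer_has_hung_up is illustrative, not from the answer). Because of MSG_PEEK, any pending data stays in the kernel buffer for a later recv(). If unread data is queued ahead of the close, the peek returns that data, so the hang-up is not seen until the data has been consumed.

```python
import select
import socket


def peer_has_hung_up(sock):
    """Return True if the peer closed the connection in an orderly way.

    Uses a zero-timeout select() and a one-byte MSG_PEEK read, so nothing is
    removed from the socket's receive buffer.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing to report; connection presumed alive
    try:
        # An empty result on a readable socket means end-of-stream.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True  # peer reset the connection
```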
I think you really should explain precisely what you mean by "not reading", and why the other answers are not satisfactory.
Answer by ninjalj
The socket KEEPALIVE option lets you detect this kind of "drop the connection without telling the other end" scenario.
You should set the SO_KEEPALIVE option at SOL_SOCKET level. In Linux, you can modify the timeouts per socket using TCP_KEEPIDLE (seconds before sending keepalive probes), TCP_KEEPCNT (failed keepalive probes before declaring the other end dead) and TCP_KEEPINTVL (interval in seconds between keepalive probes).
In Python:
import socket
...
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 5)
netstat -tanop will show that the socket is in keepalive mode:
tcp 0 0 127.0.0.1:6666 127.0.0.1:43746 ESTABLISHED 15242/python2.6 keepalive (0.76/0/0)
while tcpdump will show the keepalive probes:
01:07:08.143052 IP localhost.6666 > localhost.43746: . ack 1 win 2048 <nop,nop,timestamp 848683438 848683188>
01:07:08.143084 IP localhost.43746 > localhost.6666: . ack 1 win 2050 <nop,nop,timestamp 848683438 848682438>
01:07:09.143050 IP localhost.6666 > localhost.43746: . ack 1 win 2048 <nop,nop,timestamp 848683688 848683438>
01:07:09.143083 IP localhost.43746 > localhost.6666: . ack 1 win 2050 <nop,nop,timestamp 848683688 848682438>
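The same options can be wrapped in a small helper, sketched here for Python 3 on Linux. The helper name and its default values are assumptions taken from the snippet above, and socket.IPPROTO_TCP is used as the option level in place of SOL_TCP. With these settings a dead peer should be declared after roughly idle + interval * count seconds. The failure then surfaces as an error on the socket the next time it is polled or read.

```python
import socket


def enable_keepalive(sock, idle=1, interval=1, count=5):
    """Turn on TCP keepalive probes for `sock` (Linux-specific options).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between probes
    count:    unanswered probes before the peer is declared dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

In a server, this would typically be called on each socket returned by accept(), before the long-running work starts.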
Answer by Jesse Gordon
After struggling with a similar problem I found a solution that works for me, but it does require calling recv() in non-blocking mode and trying to read data, like this:
bytecount=recv(connectionfd,buffer,1000,MSG_NOSIGNAL|MSG_DONTWAIT);
MSG_NOSIGNAL tells it not to terminate the program on error, and MSG_DONTWAIT tells it not to block. In this mode, recv() returns one of three possible types of result:
-1 if there is no data to read, or on other errors.
0 if the other end has hung up nicely.
1 or more if there was some data waiting.
So check the return value: if it is 0, the other end hung up. If it is -1, you have to check the value of errno. If errno is equal to EAGAIN or EWOULDBLOCK, the server's TCP stack still believes the connection is alive.
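The three outcomes described above can be sketched in Python 3 as follows. This is an illustration, not a translation of the C call: Python reports EAGAIN/EWOULDBLOCK as BlockingIOError in place of a -1 return, the function name and the string labels are made up here, and socket.MSG_DONTWAIT is assumed to be available (it is on Linux). Unlike a peek, this call consumes whatever data it reads.

```python
import socket


def poll_connection(sock, bufsize=1000):
    """Classify one non-blocking recv() into the three cases described above.

    Returns ("alive", b"") when nothing is waiting, ("closed", b"") when the
    peer hung up cleanly, and ("data", payload) when bytes were read.
    Other socket errors propagate to the caller as OSError.
    """
    try:
        data = sock.recv(bufsize, socket.MSG_DONTWAIT)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: no data yet, connection still believed alive.
        return ("alive", b"")
    if data == b"":
        return ("closed", b"")
    return ("data", data)
```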
This solution requires you to put the call to recv() into your intensive data processing loop, or somewhere in your code where it gets called 10 times a second or however often you like, so that your program learns when a peer hangs up.
This of course will do no good for a peer who goes away without doing the correct connection shutdown sequence, but any properly implemented tcp client will correctly terminate the connection.
Note also that if the client sends a bunch of data and then hangs up, recv() will probably have to read all of that data out of the buffer before it gets the empty read.