Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/3229860/

Time: 2020-08-03 20:15:09  Source: igfitidea

What is the meaning of SO_REUSEADDR (setsockopt option) - Linux?

linux sockets port ip-address setsockopt

Asked by Ray Templeton

From the man page:

SO_REUSEADDR Specifies that the rules used in validating addresses supplied to bind() should allow reuse of local addresses, if this is supported by the protocol. This option takes an int value. This is a Boolean option.

When should I use it? What does "reuse of local addresses" give me?

Answer by William Briand

SO_REUSEADDR allows your server to bind to an address which is in a TIME_WAIT state.

This socket option tells the kernel that even if this port is busy (in the TIME_WAIT state), it should go ahead and reuse it anyway. If the port is busy but in another state, you will still get an "address already in use" error. The option is useful if your server has been shut down and then restarted right away while sockets are still active on its port.

From unixguide.net
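
The usage described in this answer can be sketched in Python, whose socket module wraps the same setsockopt() and bind() calls. This is only a minimal illustration: the helper names make_listener and second_listener_error and the loopback address are assumptions made for the example, not part of the original answer. The second helper probes the caveat about other states: while the first socket is still listening, a second bind() on the same address is expected to fail even with the option set (this is the behaviour on Linux and the BSDs; Windows treats the option differently).

```python
import errno
import socket


def make_listener(port: int, reuse_addr: bool = True) -> socket.socket:
    """Create a TCP listener on 127.0.0.1 (hypothetical helper for illustration)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Set the option before bind(), so it applies to the address check.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("127.0.0.1", port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock


def second_listener_error(port: int) -> str:
    """Try to listen on a port that already has a listener; return the errno name."""
    try:
        make_listener(port).close()
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    first = make_listener(0)  # port 0 lets the kernel pick a free port
    port = first.getsockname()[1]
    print("second listener on port", port, "->", second_listener_error(port))
    first.close()
```

Setting the option is harmless when no old connections linger, which is why many servers set it unconditionally at startup.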

Answer by Warren Young

TCP's primary design goal is to allow reliable data communication in the face of packet loss, packet reordering, and — key, here — packet duplication.

It's fairly obvious how a TCP/IP network stack deals with all this while the connection is up, but there's an edge case that happens just after the connection closes. What happens if a packet sent right at the end of the conversation is duplicated and delayed, such that the 4-way shutdown packets get to the receiver before the delayed packet? The stack dutifully closes down its connection. Then later, the delayed duplicate packet shows up. What should the stack do?

More importantly, what should it do if a program with open sockets on a given IP address + TCP port combo closes its sockets, and then a brief time later, a program comes along and wants to listen on that same IP address and TCP port number? (Typical case: A program is killed and is quickly restarted.)

There are a couple of choices:

  1. Disallow reuse of that IP/port combo for at least 2 times the maximum time a packet could be in flight. In TCP, this is usually called the 2×MSL delay. You sometimes also see 2×RTT, which is roughly equivalent.

    This is the default behavior of all common TCP/IP stacks. 2×MSL is typically between 30 and 120 seconds, and it shows up in netstat output as the TIME_WAIT period. After that time, the stack assumes that any rogue packets have been dropped en route due to expired TTLs, so that socket leaves the TIME_WAIT state, allowing that IP/port combo to be reused.

  2. Allow the new program to re-bind to that IP/port combo. In stacks with BSD sockets interfaces (essentially all Unixes and Unix-like systems, plus Windows via Winsock), you have to ask for this behavior by setting the SO_REUSEADDR option via setsockopt() before you call bind().

SO_REUSEADDR is most commonly set in network server programs, since a common usage pattern is to make a configuration change and then have to restart the program for the change to take effect. Without SO_REUSEADDR, the bind() call in the restarted program's new instance will fail if there were connections open to the previous instance when you killed it. Those connections will hold the TCP port in the TIME_WAIT state for 30-120 seconds, so you fall into case 1 above.
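
The restart scenario can be reproduced on one machine with a rough sketch like the following. It assumes a Linux-like stack and the loopback interface, and the helper names leave_lingering_connection and try_rebind are hypothetical. The "old server" closes its side of a connection first, so the lingering connection state stays on the server's port. The "new server" then tries to bind to the same port. In each run the old and new instances use the same setting, as a restarted program would.

```python
import errno
import socket
import time

HOST = "127.0.0.1"


def leave_lingering_connection(reuse_addr: bool) -> int:
    """Simulate an old server instance; return the port it was listening on."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((HOST, 0))  # port 0: let the kernel choose a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection((HOST, port), timeout=5)
    conn, _ = listener.accept()
    conn.close()      # the server side closes first, so it keeps the lingering state
    client.recv(1)    # returns b"" once the server's FIN has arrived
    client.close()
    listener.close()  # the old instance is now gone
    time.sleep(0.2)   # give the loopback close handshake time to settle
    return port


def try_rebind(port: int, reuse_addr: bool) -> str:
    """Simulate the restarted instance; return "ok" or the errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((HOST, port))
        sock.listen(1)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    for reuse in (False, True):
        port = leave_lingering_connection(reuse)
        print(f"SO_REUSEADDR={reuse}: rebind -> {try_rebind(port, reuse)}")
```

On Linux the run without the option should report EADDRINUSE and the run with it should succeed. Other platforms may behave differently.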

The risk in setting SO_REUSEADDR is that it creates an ambiguity: the metadata in a TCP packet's headers isn't unique enough for the stack to reliably tell whether a packet is stale, and so should be dropped because it was intended for a now-dead listener, or whether it should be delivered to the new listener's socket.

If you don't see that that is true, here's all the listening machine's TCP/IP stack has to work with per-connection to make that decision:

  1. Local IP: Not unique per-conn. In fact, our problem definition here says we're reusing the local IP, on purpose.

  2. Local TCP port: Ditto.

  3. Remote IP: The machine causing the ambiguity could re-connect, so that doesn't help disambiguate the packet's proper destination.

  4. Remote port: In well-behaved network stacks, the remote port of an outgoing connection isn't reused quickly, but it's only 16 bits, so you've got 30-120 seconds to force the stack to get through a few tens of thousands of choices and reuse the port. Computers could do work that fast back in the 1960s.

    If your answer to that is that the remote stack should do something like TIME_WAIT on its side to disallow ephemeral TCP port reuse, that solution assumes that the remote host is benign. A malicious actor is free to reuse that remote port.

    I suppose the listener's stack could choose to strictly disallow connections from the TCP 4-tuple only, so that during the TIME_WAIT state a given remote host is prevented from reconnecting with the same remote ephemeral port, but I'm not aware of any TCP stack with that particular refinement.

  5. Local and remote TCP sequence numbers: These are also not sufficiently unique that a new remote program couldn't come up with the same values.
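
The remote-port argument in item 4 can be made concrete with a little arithmetic. The sketch below takes the full 16-bit port space as an upper bound (real ephemeral port ranges are smaller, so the required rates are lower still) and computes how many connections per second a host would need to open to cycle through every port within the TIME_WAIT window. The function name is made up for the illustration.

```python
PORT_SPACE = 2 ** 16  # upper bound: every value of a 16-bit port field


def connections_per_second_to_wrap(time_wait_seconds: float,
                                   ports: int = PORT_SPACE) -> float:
    """Connection rate needed to use every port once within the window."""
    return ports / time_wait_seconds


if __name__ == "__main__":
    for window in (30, 60, 120):
        rate = connections_per_second_to_wrap(window)
        print(f"TIME_WAIT of {window:3d} s: about {rate:.0f} connections per second")
```

Even at the short end of the range this is only a couple of thousand connections per second, which supports the point that port reuse within the window is easy to force.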

If we were re-designing TCP today, I think we'd integrate TLS or something like it as a non-optional feature, one effect of which would be to make this sort of inadvertent or malicious connection hijacking impossible. But that requires adding large fields (128 bits and up), which wasn't at all practical back in 1981, when the document for the current version of TCP (RFC 793) was published.

Without such hardening, the ambiguity created by allowing re-binding during TIME_WAIT means that either a) stale data intended for the old listener can be misdelivered to a socket belonging to the new listener, thereby either breaking the listener's protocol or incorrectly injecting stale data into the connection; or b) new data for the new listener's socket can be mistakenly assigned to the old listener's socket and thus inadvertently dropped.
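
Case a) can be illustrated with a deliberately simplified toy model. It is not how any real TCP stack is implemented, and real stacks apply more checks than this. The model assumes that an incoming segment is matched to a connection purely by its 4-tuple and then accepted if its sequence number falls inside the receive window. All names, addresses, and numbers are made up for the example.

```python
from typing import Dict, NamedTuple, Optional

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class FourTuple(NamedTuple):
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


class Connection:
    def __init__(self, label: str, rcv_nxt: int, window: int) -> None:
        self.label = label
        self.rcv_nxt = rcv_nxt  # next sequence number expected
        self.window = window    # size of the receive window in bytes

    def accepts(self, seq: int) -> bool:
        # In-window test using modular arithmetic on the sequence space.
        return (seq - self.rcv_nxt) % SEQ_MOD < self.window


def deliver(table: Dict[FourTuple, Connection], key: FourTuple,
            seq: int) -> Optional[str]:
    """Return the label of the connection that accepts the segment, if any."""
    conn = table.get(key)
    if conn is not None and conn.accepts(seq):
        return conn.label
    return None


if __name__ == "__main__":
    key = FourTuple("192.0.2.1", 8080, "198.51.100.7", 40000)
    # The new listener has accepted a connection that reuses the old 4-tuple.
    table = {key: Connection("new listener", rcv_nxt=1000, window=65535)}
    # A delayed duplicate from the old connection with an in-window sequence number:
    print(deliver(table, key, seq=1500))
    # The same duplicate with a sequence number far outside the window:
    print(deliver(table, key, seq=3_000_000_000))
```

In this model nothing distinguishes the stale segment from fresh data once the 4-tuple and the window both match, which is the ambiguity the answer describes.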

The safe thing to do is wait out the TIME_WAIT period.

Ultimately, it comes down to a choice of costs: wait out the TIME_WAIT period or take on the risk of unwanted data loss or inadvertent data injection.

Many server programs take this risk, deciding that it's better to get the server back up immediately so as to not miss any more incoming connections than necessary.

This is not a universal choice. Many programs, even server programs requiring a restart to apply a settings change, choose instead to leave SO_REUSEADDR alone. The programmer may know these risks and be choosing to leave the default alone, or they may be ignorant of the issues but still get the benefit of a wise default.

Some network programs offer the user a choice among the configuration options, fobbing the responsibility off on the end user or sysadmin.

Answer by Eric

When you create a socket, you don't really own it. The OS (TCP stack) creates it for you and gives you a handle (file descriptor) to access it. When your socket is closed, it takes time for the OS to "fully close it" while it goes through several states. As EJP mentioned in the comments, the longest delay is usually from the TIME_WAIT state. This extra delay is required to handle edge cases at the very end of the termination sequence and to make sure the last termination acknowledgement either got through or had the other side reset itself because of a timeout. Here you can find some extra considerations about this state. The main considerations are as follows:

Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.

If you try to create multiple sockets with the same ip:port pair in quick succession, you get the "address already in use" error because the earlier socket will not have been fully released. Using SO_REUSEADDR gets rid of this error, as it relaxes the bind() check against sockets left over from a previous instance.
