C++: What is the optimal size of a UDP packet for maximum throughput?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/276058/

Time: 2020-08-27 14:16:00  Source: igfitidea

What is the optimal size of a UDP packet for maximum throughput?

Tags: c++, windows, linux, network-programming, udp

Asked by sep

I need to send packets from one host to another over a potentially lossy network. In order to minimize packet latency, I'm not considering TCP/IP. But I wish to maximize the throughput using UDP. What is the optimal size of UDP packet to use?

Here are some of my considerations:

  • The MTU size of the switches in the network is 1500. If I use a large packet, for example 8192, this will cause fragmentation. Loss of one fragment will result in the loss of the entire packet, right?

  • If I use smaller packets, I'll incur the overhead of the UDP and IP headers on every packet.

  • If I use a really large packet, what is the largest that I can use? I read that the largest datagram size is 65507. What is the buffer size I should use to allow me to send such sizes? Would that help to bump up my throughput?

  • What is the typical maximum datagram size supported by the common OSes (e.g., Windows, Linux, etc.)?
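The fragmentation concern in the first bullet can be put into numbers. The following is a minimal sketch, not a definitive model: it assumes IPv4 with a 20-byte header and no options, an 8-byte UDP header, and independent loss per fragment. The function names `fragment_count` and `datagram_loss_probability` are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Largest UDP payload the 16-bit IPv4 length field allows:
// 65535 - 20 (IP header) - 8 (UDP header).
constexpr std::size_t kMaxUdpPayloadIPv4 = 65535 - 20 - 8;  // 65507

// Number of IPv4 fragments needed to carry one UDP datagram over a link
// with the given MTU (assumes a 20-byte IP header without options).
std::size_t fragment_count(std::size_t udp_payload, std::size_t mtu) {
    assert(mtu > 28 && udp_payload <= kMaxUdpPayloadIPv4);
    const std::size_t ip_payload = udp_payload + 8;         // UDP header rides in the first fragment
    const std::size_t per_fragment = ((mtu - 20) / 8) * 8;  // fragment data is a multiple of 8 bytes
    return (ip_payload + per_fragment - 1) / per_fragment;  // ceiling division
}

// Probability that the whole datagram is lost when each fragment is dropped
// independently with probability p (a simplifying assumption).
double datagram_loss_probability(std::size_t fragments, double p) {
    return 1.0 - std::pow(1.0 - p, static_cast<double>(fragments));
}
```

With these assumptions, an 8192-byte payload over a 1500-byte MTU needs 6 fragments, and a 1% per-fragment loss rate turns into roughly a 5.9% loss rate for the datagram, because losing any one fragment discards all of it.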

Updated:

Some of the receivers of the data are embedded systems for which a TCP/IP stack is not implemented.

I know that this place is filled with people who are very adamant about using what's available. But I hope to get better answers than ones that focus on MTU alone.

Answered by CesarB

Alternative answer: be careful to not reinvent the wheel.

TCP is the product of decades of networking experience. There is a reason for every, or almost every, thing it does. It has several algorithms most people do not think about often (congestion control, retransmission, buffer management, dealing with reordered packets, and so on).

If you start reimplementing all the TCP algorithms, you risk ending up with an (paraphrasing Greenspun's Tenth Rule) "ad hoc, informally-specified, bug-ridden, slow implementation of TCP".

If you have not done so yet, it could be a good idea to look at some recent alternatives to TCP/UDP, like SCTP or DCCP. They were designed for niches where neither TCP nor UDP was a good match, precisely to allow people to use an already "debugged" protocol instead of reinventing the wheel for every new application.

Answered by CesarB

The best way to find the ideal datagram size is to do exactly what TCP itself does to find the ideal packet size: Path MTU discovery.

TCP also has a widely used option where both sides tell the other what their MSS (basically, MTU minus headers) is.
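For UDP, one way to lean on the kernel's Path MTU discovery is sketched below. It is Linux-specific and only an illustration: `IP_MTU_DISCOVER`, `IP_PMTUDISC_DO` and `IP_MTU` are Linux socket options, and the helper name `query_path_mtu` is an assumption of this example. The value returned is the kernel's current estimate, which starts at the outgoing interface MTU and is lowered only after ICMP "fragmentation needed" messages arrive.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Returns the kernel's current path-MTU estimate towards ip:port,
// or -1 on any error. Linux-only sketch.
int query_path_mtu(const char* ip, std::uint16_t port) {
    assert(ip != nullptr);
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1) return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    int mode = IP_PMTUDISC_DO;  // always set DF, never fragment locally
    int mtu = -1;
    socklen_t len = sizeof(mtu);
    bool ok =
        setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode)) == 0 &&
        // IP_MTU can only be read on a connected socket.
        connect(fd, reinterpret_cast<sockaddr*>(&dst), sizeof(dst)) == 0 &&
        getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) == 0;
    close(fd);
    return ok ? mtu : -1;
}
```

The usable UDP payload is then the returned MTU minus the IP and UDP headers (28 bytes for IPv4 without options).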

Answered by denis phillips

Another thing to consider is that some network devices don't handle fragmentation very well. We've seen many routers that drop fragmented UDP packets or packets that are too big. The suggestion by CesarB to use Path MTU discovery is a good one.

Maximum throughput is not driven only by the packet size (though this contributes, of course). Minimizing latency and maximizing throughput are often at odds with one another. In TCP you have the Nagle algorithm, which is designed (in part) to increase overall throughput. However, some protocols (e.g., telnet) often disable Nagle (i.e., set the no-delay option, TCP_NODELAY) in order to improve latency.
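For reference, disabling Nagle on a TCP socket looks roughly like this with the POSIX sockets API. This is a small sketch; the helper names `disable_nagle` and `nagle_disabled` are invented for the example.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

// Disables Nagle's algorithm on a TCP socket. Returns true on success.
bool disable_nagle(int tcp_fd) {
    assert(tcp_fd >= 0);
    int on = 1;
    return setsockopt(tcp_fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
}

// Reports whether TCP_NODELAY is currently set on the socket.
bool nagle_disabled(int tcp_fd) {
    int value = 0;
    socklen_t len = sizeof(value);
    return getsockopt(tcp_fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) == 0 &&
           value != 0;
}
```

UDP has no equivalent knob because it never coalesces writes: each send call produces one datagram.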

Do you have some real time constraints for the data? Streaming audio is different than pushing non-realtime data (e.g., logging information) as the former benefits more from low latency while the latter benefits from increased throughput and perhaps reliability. Are there reliability requirements? If you can't miss packets and have to have a protocol to request retransmission, this will reduce overall throughput.

There are a myriad of other factors that go into this and (as was suggested in another response) at some point you get a bad implementation of TCP. That being said, if you want to achieve low latency and can tolerate loss, using UDP with an overall packet size set to the path MTU (be sure to set the payload size to account for headers) is likely the optimal solution (especially if you can ensure that UDP can get from one end to the other).

Answered by Robert S. Barnes

Well, I've got a non-MTU answer for you. Using a connected UDP socket should speed things up for you. There are two reasons to call connect on your UDP socket. The first is efficiency. When you call sendto on an unconnected UDP socket what happens is that the kernel temporarily connects the socket, sends the data and then disconnects it. I read about a study indicating that this takes up nearly 30% of processing time when sending. The other reason to call connect is so that you can get ICMP error messages. On an unconnected UDP socket the kernel doesn't know what application to deliver ICMP errors to and so they just get discarded.
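A connected UDP socket can be set up as in the sketch below (POSIX sockets, IPv4 only; the helper name `make_connected_udp_socket` is just for illustration). After `connect`, the program calls `send` instead of `sendto`, and errors caused by ICMP messages, such as `ECONNREFUSED`, are reported on later calls.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Creates a UDP socket and fixes its destination with connect().
// Returns the file descriptor, or -1 on error. No packet is sent here:
// for UDP, connect() only records the peer address in the kernel.
int make_connected_udp_socket(const char* ip, std::uint16_t port) {
    assert(ip != nullptr);
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1) return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    if (connect(fd, reinterpret_cast<sockaddr*>(&dst), sizeof(dst)) != 0) {
        close(fd);
        return -1;
    }
    return fd;  // use send(fd, buf, len, 0) from here on
}
```

Whether this yields a gain anywhere near the figure quoted above depends on the operating system and kernel version, so it is worth measuring on the target platform.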

Answered by qwerty_ca

Uhh Jason, TCP does not use UDP. TCP uses IP, which is why you often see it referred to as TCP/IP. UDP also uses IP, so UDP is technically UDP/IP. The IP layer handles the transfer of data from end to end (across different networks), which is why it is called the Internet (inter-networking) Protocol. TCP and UDP handle the segmentation of the data itself. The lower layers such as Ethernet or PPP or whatever else you use handle computer-to-computer data transfer (that is, within a single network).

Answered by Syaiful Nizam Yahya

The easiest workaround to find the MTU in C# is to send UDP packets with the DontFragment flag set to true. If it throws an exception, try reducing the packet size. Do this until no exception is thrown. You can start with a packet size of 1500.
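The same step-down idea can be sketched in C++. This is an illustration under stated assumptions, not a port of any particular C# code: `probe_max_payload` and `send_fits` are hypothetical names, and the socket-backed check assumes a connected UDP socket on Linux with `IP_MTU_DISCOVER` set to `IP_PMTUDISC_DO`, where an oversized `send` fails with `EMSGSIZE` instead of throwing.

```cpp
#include <sys/socket.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>

// Steps the payload size down from `start` until `fits(size)` succeeds.
// Returns the first size that fits, or -1 if nothing down to `floor` does.
template <typename Fits>
int probe_max_payload(int start, int floor, int step, Fits&& fits) {
    assert(step > 0 && floor >= 0);
    for (int size = start; size >= floor; size -= step) {
        if (fits(size)) return size;
    }
    return -1;
}

// Socket-backed check. Assumption: `fd` is a connected UDP socket with the
// don't-fragment behaviour enabled, so "too big" shows up as EMSGSIZE.
bool send_fits(int fd, int size) {
    std::vector<char> buf(static_cast<std::size_t>(size), 0);
    if (send(fd, buf.data(), buf.size(), 0) >= 0) return true;
    return errno != EMSGSIZE;  // only EMSGSIZE is treated as a size problem
}
```

A call such as `probe_max_payload(1472, 0, 8, [&](int n) { return send_fits(fd, n); })` starts at the payload that fills a 1500-byte IPv4 packet. The first oversized send towards a remote host may still succeed locally, because the kernel only learns the smaller path MTU once an ICMP reply comes back, so a real probe would pause briefly and retry each size.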

Answered by Malkocoglu

The IP header is >= 20 bytes, but mostly 20, and the UDP header is 8 bytes. This leaves you 1500 - 28 = 1472 bytes for your data. Path MTU discovery finds the smallest MTU on the way to the destination. But this does not necessarily mean that, when you use the smallest MTU, you will get the best possible performance. I think the best way is to do a benchmark. Or maybe you should not care about the smallest MTU on the way at all. A network device may very well use a small MTU and also transfer packets very fast. And its value may very well change in the future. So you cannot discover this once and save it somewhere to use later on; you have to do it periodically. If I were you, I would set the MTU to something like 1440 and benchmark the application...
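The header arithmetic, and what it means for efficiency on the wire, can be written down as a short sketch. The 38 bytes of per-frame Ethernet overhead (14-byte header, 4-byte FCS, 8-byte preamble and start delimiter, 12-byte inter-frame gap) are an assumption about plain untagged Ethernet, and the function names are made up for the example. Padding of very small frames is ignored.

```cpp
#include <cassert>

constexpr int kUdpHeader = 8;

// Largest UDP payload that fits in one packet of size `mtu`.
// ip_header is 20 for IPv4 without options, 40 for IPv6.
constexpr int max_udp_payload(int mtu, int ip_header) {
    return mtu - ip_header - kUdpHeader;
}

// Fraction of the bytes occupying an Ethernet link that are application
// payload, for one unfragmented datagram.
double wire_efficiency(int payload, int ip_header) {
    assert(payload > 0 && ip_header >= 20);
    constexpr int kEthernetOverhead = 38;  // header + FCS + preamble + gap
    return static_cast<double>(payload) /
           (payload + kUdpHeader + ip_header + kEthernetOverhead);
}
```

Under these assumptions a full 1472-byte payload uses about 96% of the link for data, while a 64-byte payload uses just under half, which is the header overhead the question worries about.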

Answered by Tim Howland

Even though the MTU at the switch is 1500, you can have situations (like tunneling through a VPN) that wrap a few extra headers around the packet. You may do better to reduce the packet size slightly and go with 1450 or so.

Can you simulate the network and test performance with different packet sizes?
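As a starting point for such a test, here is a rough sender-side timing loop over the loopback interface. It is only a sketch: `loopback_send_rate` is a hypothetical helper, loopback has no realistic loss or MTU limit, and the rate counts what the kernel accepted for sending, not what a receiver actually read. A real benchmark would run across the target network and count bytes at the receiver.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Sends `count` datagrams of `payload` bytes to a local UDP socket and
// returns the sender-side rate in bytes per second, or -1.0 if setup fails.
double loopback_send_rate(std::size_t payload, int count) {
    assert(payload > 0 && count > 0);
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    bool ok = rx >= 0 && tx >= 0 &&
        bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0 &&
        getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len) == 0 &&
        connect(tx, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0;

    double rate = -1.0;
    if (ok) {
        std::vector<char> buf(payload, 0);
        std::size_t sent = 0;
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < count; ++i) {
            ssize_t n = send(tx, buf.data(), buf.size(), 0);
            if (n > 0) sent += static_cast<std::size_t>(n);
        }
        std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
        rate = dt.count() > 0 ? static_cast<double>(sent) / dt.count() : 0.0;
    }
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return rate;
}
```

Calling it for a range of sizes, for example 512, 1024, 1450, 1472 and 8192 bytes, gives a first impression of how the per-packet cost changes with payload size on a given machine.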

Answered by Tim Howland

The "Stack" is (TCP uses(UDP uses(IPv4 uses (ETHERNET))))... or The "Stack" is (TCP uses(UDP uses(IPv6 uses (ETHERNET))))...

All those headers are added in TCP. IPv6 is just dumb. Every computer does not require its own IP. IPv6 is just undesired packet bloat. You have 65,000+ ports, you will not use them all, ever... Add that to the individual machine MAC address in the ETHERNET header, and you have gazillions of addresses.

Focus on the (UDP uses(IPv4 uses(ETHERNET))) headers, and all will be fine. Your program should be able to "Check" packet size, by receiving a 65,000 byte buffer over UDP, set as all NULL CHR(0), and sending a 65,000 packet of CHR(255) bytes. You can see if your UDP data was lost, because you will never get it. It will be cut short. UDP does not transmit multiple packets. You send one, you get one. You just get less if it can't fit. Or you get nothing, if it gets dropped.
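On the receiving side, a datagram is cut short when the buffer passed to the receive call is smaller than the datagram. With POSIX sockets this can be detected through the `MSG_TRUNC` flag that `recvmsg` reports. The sketch below uses a hypothetical helper, `receive_datagram`, and is not taken from the answer above. Windows reports the same condition differently, with a `WSAEMSGSIZE` error.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cassert>
#include <cstddef>

struct RecvResult {
    ssize_t bytes;   // bytes copied into the buffer, or -1 on error
    bool truncated;  // true if the datagram was larger than the buffer
};

// Receives one datagram into buf and reports whether the kernel had to
// discard the tail because the buffer was too small.
RecvResult receive_datagram(int fd, void* buf, std::size_t cap) {
    assert(buf != nullptr && cap > 0);
    iovec iov{buf, cap};
    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    ssize_t n = recvmsg(fd, &msg, 0);
    return RecvResult{n, n >= 0 && (msg.msg_flags & MSG_TRUNC) != 0};
}
```

A receive buffer of 65,507 bytes is enough for any IPv4 UDP payload, so with a buffer that size the flag should not be set.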

TCP will hold your connections in purgatory until all data is received. It is using UDP packets, and telling the other computer to resend those missing packets. That comes with additional overhead, and causes LAG if any packet is dropped, lost, short, or out of order.

UDP gives you full control. Use UDP if you send "Critical" and "Non-Critical" data, and want to use a reduced packet-order number system that is not dependent on sequential arrival. Only use TCP for WEB or SECURE solid data that requires persistence and 100% completeness. Otherwise, you are just wasting our web bandwidth and adding bloated clutter to the net. The smaller your data stream, the less you will lose along the way. Use TCP, and you will guarantee additional LAG related to all the resending, and bloated headers that are added onto the TCP header, for "Flow control".
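What a packet-number scheme on top of UDP might look like is sketched below. Everything here is hypothetical: the `PacketHeader` layout, the `critical` flag and the `LossTracker` class are inventions for illustration, and the sketch deliberately ignores sequence-number wraparound and does not give credit back when a packet counted as missing arrives late.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical application header placed at the start of each UDP payload.
struct PacketHeader {
    std::uint32_t sequence;  // incremented by the sender for every datagram
    std::uint8_t critical;   // 1 if the receiver should request a resend when lost
};

class LossTracker {
public:
    // Records an arriving sequence number and returns how many packets were
    // skipped since the highest number seen so far. Late or duplicate
    // arrivals return 0.
    std::uint32_t observe(std::uint32_t seq) {
        assert(seq < UINT32_MAX);  // wraparound is not handled in this sketch
        if (!started_) {
            started_ = true;
            next_ = seq + 1;
            return 0;
        }
        if (seq < next_) return 0;  // late or duplicate packet
        std::uint32_t gap = seq - next_;
        next_ = seq + 1;
        missing_ += gap;
        return gap;
    }

    std::uint64_t missing() const { return missing_; }

private:
    bool started_ = false;
    std::uint32_t next_ = 0;
    std::uint64_t missing_ = 0;
};
```

Once resend requests, timers and rate limiting are added on top of this, the design starts to resemble the TCP machinery that the other answers warn against rebuilding.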

Seriously, flow control is not that hard to manage, nor is priority, and missing data detection. TCP offers nothing. That is why it is given away for free. It is not seasoned, it is just blindly stupid and easy. It is an old pair of flip-flops. You need a good pair of sneakers. TCP was, and still is, a hack.
