Original question: http://stackoverflow.com/questions/319732/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Tips / techniques for high-performance C# server sockets
Asked by McKenzieG1
I have a .NET 2.0 server that seems to be running into scaling problems, probably due to poor design of the socket-handling code, and I am looking for guidance on how I might redesign it to improve performance.
Usage scenario: 50-150 clients, with a high rate (up to hundreds per second) of small messages (tens of bytes each) to and from each client. Client connections are long-lived, typically hours. (The server is part of a trading system. The client messages are aggregated into groups to send to an exchange over a smaller number of 'outbound' socket connections, and acknowledgment messages are sent back to the clients as each group is processed by the exchange.) The OS is Windows Server 2003, and the hardware is 2 x 4-core X5355.
Current client socket design: A TcpListener spawns a thread to read each client socket as clients connect. The threads block on Socket.Receive, parsing incoming messages and inserting them into a set of queues for processing by the core server logic. Acknowledgment messages are sent back out over the client sockets using async Socket.BeginSend calls from the threads that talk to the exchange side.
Observed problems: As the client count has grown (now 60-70), we have started to see intermittent delays of up to hundreds of milliseconds while sending and receiving data to/from the clients. (We log timestamps for each acknowledgment message, and we can see occasional long gaps in the timestamp sequence for bunches of acks from the same group that normally go out in a few ms total.)
Overall system CPU usage is low (< 10%), there is plenty of free RAM, and the core logic and the outbound (exchange-facing) side are performing fine, so the problem seems to be isolated to the client-facing socket code. There is ample network bandwidth between the server and clients (gigabit LAN), and we have ruled out network or hardware-layer problems.
Any suggestions or pointers to useful resources would be greatly appreciated. If anyone has any diagnostic or debugging tips for figuring out exactly what is going wrong, those would be great as well.
Note: I have the MSDN Magazine article Winsock: Get Closer to the Wire with High-Performance Sockets in .NET, and I have glanced at the Kodart "XF.Server" component - it looks sketchy at best.
Accepted answer by grepsedawk
A lot of this has to do with many threads running on your system and the kernel giving each of them a time slice. The design is simple, but does not scale well.
You should probably look at using Socket.BeginReceive, whose callback executes on the .NET thread pool (you can configure how many threads the pool uses, for example with ThreadPool.SetMaxThreads), and then pushing onto a queue from the asynchronous callback (which can run on any of the pool's threads). This should give you much higher performance.
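A minimal sketch of the BeginReceive approach described in this answer, assuming a shared Queue<byte[]> guarded by a lock as the hand-off to the core logic. The class name AsyncClientReader and the 4096-byte buffer size are illustrative assumptions rather than part of the original design, and message framing (splitting the byte stream into individual messages) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

// Hypothetical per-connection reader; names and buffer size are illustrative.
public sealed class AsyncClientReader
{
    private readonly Socket _socket;
    private readonly Queue<byte[]> _inbound;           // shared with the core logic
    private readonly byte[] _buffer = new byte[4096];  // reused for every receive

    public AsyncClientReader(Socket socket, Queue<byte[]> inbound)
    {
        _socket = socket;
        _inbound = inbound;
    }

    // Posts the first receive; no dedicated thread blocks on this socket.
    public void Start()
    {
        try
        {
            _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
        }
        catch (SocketException) { }          // connection already broken
        catch (ObjectDisposedException) { }  // socket closed locally
    }

    // Runs on a thread-pool thread when data arrives or the connection ends.
    private void OnReceive(IAsyncResult ar)
    {
        int read;
        try
        {
            read = _socket.EndReceive(ar);
        }
        catch (SocketException) { return; }
        catch (ObjectDisposedException) { return; }

        if (read == 0) return; // remote side closed the connection

        // Copy the bytes out, because _buffer is reused by the next receive.
        byte[] chunk = new byte[read];
        Buffer.BlockCopy(_buffer, 0, chunk, 0, read);
        lock (_inbound)
        {
            _inbound.Enqueue(chunk);
        }

        Start(); // post the next receive
    }
}
```

Each accepted socket would get its own reader, for example `new AsyncClientReader(socket, inbound).Start();`, so no thread is parked per client.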
Answer by mjallday
I don't have an answer but to get more information I'd suggest sprinkling your code with timers and logging avg and max time taken for suspect operations like adding to the queue or opening a socket.
At least that way you will have an idea of what to look at and where to begin.
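One way to do the timing suggested here is a small accumulator built on System.Diagnostics.Stopwatch. This is only a sketch: the OpTimer class and its members are hypothetical names, and where the numbers get logged is left to the caller.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: tracks count, average and maximum duration of one operation.
public sealed class OpTimer
{
    private readonly object _gate = new object();
    private long _count;
    private long _totalTicks;
    private long _maxTicks;

    // Times a single call of the suspect operation.
    public void Measure(Action operation)
    {
        long start = Stopwatch.GetTimestamp();
        try { operation(); }
        finally { Record(Stopwatch.GetTimestamp() - start); }
    }

    // Adds one sample, expressed in Stopwatch ticks.
    public void Record(long elapsedTicks)
    {
        lock (_gate)
        {
            _count++;
            _totalTicks += elapsedTicks;
            if (elapsedTicks > _maxTicks) _maxTicks = elapsedTicks;
        }
    }

    public long Count
    {
        get { lock (_gate) { return _count; } }
    }

    public double AverageMs
    {
        get { lock (_gate) { return _count == 0 ? 0.0 : ToMs(_totalTicks) / _count; } }
    }

    public double MaxMs
    {
        get { lock (_gate) { return ToMs(_maxTicks); } }
    }

    private static double ToMs(long ticks)
    {
        return ticks * 1000.0 / Stopwatch.Frequency;
    }
}
```

Wrapping a suspect call then looks like `enqueueTimer.Measure(() => queue.Enqueue(msg));`, with AverageMs and MaxMs written to the log periodically.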
Answer by Marc Gravell
A thread per client seems like massive overkill, especially given the low overall CPU usage here. Normally you would want a small pool of threads to service all clients, using BeginReceive to wait for work async, then simply despatch the processing to one of the workers (perhaps simply by adding the work to a synchronized queue upon which all the workers are waiting).
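The "synchronized queue upon which all the workers are waiting" could be sketched as below, using Monitor.Wait and Monitor.Pulse. WorkQueue and WorkerPool are hypothetical names, and the worker count is an assumption to be tuned against the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical blocking queue shared by a small, fixed set of workers.
public sealed class WorkQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private bool _closed;

    public void Enqueue(T item)
    {
        lock (_items)
        {
            _items.Enqueue(item);
            Monitor.Pulse(_items); // wake one waiting worker
        }
    }

    // Lets workers exit once the queue has been drained.
    public void Close()
    {
        lock (_items)
        {
            _closed = true;
            Monitor.PulseAll(_items);
        }
    }

    // Blocks until an item is available; returns false once closed and empty.
    public bool TryDequeue(out T item)
    {
        lock (_items)
        {
            while (_items.Count == 0)
            {
                if (_closed) { item = default(T); return false; }
                Monitor.Wait(_items);
            }
            item = _items.Dequeue();
            return true;
        }
    }
}

public static class WorkerPool
{
    // Starts workerCount background threads that all wait on the same queue.
    public static Thread[] Start<T>(WorkQueue<T> queue, int workerCount, Action<T> handler)
    {
        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = new Thread(() =>
            {
                T item;
                while (queue.TryDequeue(out item)) handler(item);
            });
            workers[i].IsBackground = true;
            workers[i].Start();
        }
        return workers;
    }
}
```

The receive callbacks would call Enqueue, and a handful of workers (rather than one thread per client) would do the parsing and processing.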
Answer by Luke Quinane
Socket.BeginConnect and Socket.BeginAccept are definitely useful. I believe they use the ConnectEx and AcceptEx calls in their implementation. These calls wrap the initial connection negotiation and data transfer into one user/kernel transition. Since the initial send/receive buffer is already ready, the kernel can just send it off, either to the remote host or to userspace.
They also have a queue of listeners/connectors ready, which probably gives a bit of a boost by avoiding the latency involved with userspace accepting/receiving a connection and handing it off (and all the user/kernel switching).
To use BeginConnect with a buffer it appears that you have to write the initial data to the socket before connecting.
Answer by John Dibling
I am not a C# guy by any stretch, but for high-performance socket servers the most scalable solution is to use I/O Completion Ports with a number of active threads appropriate for the CPU(s) the process is running on, rather than using the one-thread-per-connection model.
In your case, with an 8-core machine you would want 16 total threads with 8 running concurrently. (The other 8 are basically held in reserve.)
Answer by John Dibling
Socket I/O performance has improved in the .NET 3.5 environment. You can use ReceiveAsync/SendAsync instead of BeginReceive/BeginSend for better performance. Check this out:
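A rough sketch of the ReceiveAsync pattern mentioned here, which requires .NET 3.5 or later. It reuses a single SocketAsyncEventArgs and buffer per connection, so a steady-state receive does not allocate a new IAsyncResult each time. SaeaReader and the onData callback are hypothetical names, and the buffer passed to the callback is only valid for the duration of the call.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical reader built on SocketAsyncEventArgs.
public sealed class SaeaReader
{
    private readonly Socket _socket;
    private readonly SocketAsyncEventArgs _args = new SocketAsyncEventArgs();
    private readonly Action<byte[], int> _onData; // (buffer, bytesReceived)

    public SaeaReader(Socket socket, Action<byte[], int> onData)
    {
        _socket = socket;
        _onData = onData;
        _args.SetBuffer(new byte[4096], 0, 4096); // one buffer, reused for every receive
        _args.Completed += delegate(object sender, SocketAsyncEventArgs e)
        {
            if (Process(e)) Pump();
        };
    }

    public void Start()
    {
        Pump();
    }

    private void Pump()
    {
        try
        {
            // ReceiveAsync returns false when it completed synchronously; in that
            // case Completed is not raised, so the result is handled inline.
            while (!_socket.ReceiveAsync(_args))
            {
                if (!Process(_args)) return;
            }
        }
        catch (ObjectDisposedException) { } // socket closed locally
    }

    // Returns false when the connection has ended or failed.
    private bool Process(SocketAsyncEventArgs e)
    {
        if (e.SocketError != SocketError.Success || e.BytesTransferred == 0) return false;
        _onData(e.Buffer, e.BytesTransferred);
        return true;
    }
}
```

Whether this is measurably faster than BeginReceive for 50-150 clients is something to verify with timing rather than assume.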
Answer by feroze
As others have suggested, the best way to implement this would be to make the client-facing code all asynchronous. Use BeginAccept on the TcpListener so that you don't have to manually spawn a thread. Then use BeginRead()/BeginWrite() on the underlying network stream that you get from the accepted TcpClient.
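The accept-then-read flow described in this paragraph might look roughly like the following, using TcpListener.BeginAcceptTcpClient and NetworkStream.BeginRead. AsyncAcceptServer and the onData hand-off are hypothetical names, and writes, framing and per-client bookkeeping are omitted.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

// Hypothetical fully asynchronous accept/read loop.
public sealed class AsyncAcceptServer
{
    private readonly TcpListener _listener;
    private readonly Action<byte[]> _onData; // hand-off to the core logic

    public AsyncAcceptServer(IPAddress address, int port, Action<byte[]> onData)
    {
        _listener = new TcpListener(address, port);
        _onData = onData;
    }

    public int Port
    {
        get { return ((IPEndPoint)_listener.LocalEndpoint).Port; }
    }

    public void Start()
    {
        _listener.Start();
        BeginAccept();
    }

    public void Stop()
    {
        _listener.Stop();
    }

    private void BeginAccept()
    {
        try { _listener.BeginAcceptTcpClient(OnAccept, null); }
        catch (SocketException) { }
        catch (InvalidOperationException) { } // listener stopped
    }

    private void OnAccept(IAsyncResult ar)
    {
        TcpClient client;
        try { client = _listener.EndAcceptTcpClient(ar); }
        catch (SocketException) { return; }
        catch (InvalidOperationException) { return; } // includes ObjectDisposedException

        BeginAccept(); // keep accepting without a dedicated thread

        NetworkStream stream = client.GetStream();
        byte[] buffer = new byte[4096];
        AsyncCallback onRead = null;
        onRead = delegate(IAsyncResult r)
        {
            try
            {
                int n = stream.EndRead(r);
                if (n == 0) { client.Close(); return; } // client disconnected
                byte[] chunk = new byte[n];
                Buffer.BlockCopy(buffer, 0, chunk, 0, n);
                _onData(chunk);
                stream.BeginRead(buffer, 0, buffer.Length, onRead, null);
            }
            catch (IOException) { client.Close(); }
            catch (ObjectDisposedException) { }
        };
        try { stream.BeginRead(buffer, 0, buffer.Length, onRead, null); }
        catch (IOException) { client.Close(); }
        catch (ObjectDisposedException) { }
    }
}
```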
However, there is one thing I don't understand here. You said that these are long-lived connections and that there is a large number of clients. Assume that the system has reached steady state, with your maximum number of clients (say 70) connected, so you have 70 threads listening for client packets. The system should then still be responsive, unless your application has memory/handle leaks and you are running out of resources so that your server is paging. I would put a timer around the call to Accept() where you kick off a client thread and see how much time that takes. Also, I would start Task Manager and PerfMon, and monitor "Non Paged Pool", "Virtual Memory", and "Handle Count" for the app to see whether the app is in a resource crunch.
While it is true that going async is the right way to go, I am not convinced that it will really solve the underlying problem. I would monitor the app as I suggested and make sure there are no intrinsic problems of leaking memory and handles. In this regard, "BigBlackMan" above was right: you need more instrumentation to proceed. I don't know why he was downvoted.
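Alongside PerfMon, the same figures can be logged from inside the process with System.Diagnostics.Process. This is a sketch only: ResourceSnapshot is a hypothetical helper, and the values reported are whatever the operating system exposes for the current process.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper that formats a one-line resource summary for the log.
public static class ResourceSnapshot
{
    public static string Capture()
    {
        using (Process p = Process.GetCurrentProcess())
        {
            return string.Format(
                "handles={0} nonpagedKB={1} virtualMB={2} privateMB={3} threads={4}",
                p.HandleCount,
                p.NonpagedSystemMemorySize64 / 1024,
                p.VirtualMemorySize64 / (1024 * 1024),
                p.PrivateMemorySize64 / (1024 * 1024),
                p.Threads.Count);
        }
    }
}
```

Logging `ResourceSnapshot.Capture()` once a minute over a few hours would show whether handle or memory counts climb steadily, which is the leak pattern this answer is concerned about.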
Answer by Addys
Random intermittent ~250msec delays might be due to the Nagle algorithm used by TCP. Try disabling that and see what happens.
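Disabling Nagle in .NET is a one-line socket option. The sketch below wraps it in a hypothetical NagleConfig helper; Socket.NoDelay is the real property. Nagle-related stalls typically show up when small writes interact with delayed ACKs on the peer, so this is worth testing on both the client-facing and exchange-facing sockets.

```csharp
using System.Net.Sockets;

// Hypothetical helper; call it on each socket right after accept or connect.
public static class NagleConfig
{
    public static void DisableNagle(Socket socket)
    {
        // Sends small writes immediately instead of coalescing them.
        socket.NoDelay = true;
    }
}
```

The trade-off is more, smaller packets on the wire, which is usually acceptable on a gigabit LAN with messages of tens of bytes.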
Answer by Tom Thorne
One thing I would want to rule out is something as simple as the garbage collector running. If all your messages are on the heap, you are generating 10000 objects a second.
Take a read of Garbage Collection every 100 seconds
The only solution is to keep your messages off the heap.
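One possible reading of "keep your messages off the heap" is to avoid allocating a new object per message, for instance by storing messages as structs in a preallocated ring. The sketch below assumes a made-up ClientMessage layout; strictly speaking the backing array still lives on the heap, but it is allocated once, so steady-state traffic creates no new garbage for the collector to trace.

```csharp
using System;

// Hypothetical fixed-layout message; a struct, so storing it allocates nothing.
public struct ClientMessage
{
    public int ClientId;
    public long Sequence;
    public int Quantity;
}

// Hypothetical bounded ring of messages, allocated once at startup.
public sealed class MessageRing
{
    private readonly ClientMessage[] _slots;
    private readonly object _gate = new object();
    private int _head;   // index of the oldest message
    private int _count;  // number of messages currently stored

    public MessageRing(int capacity)
    {
        _slots = new ClientMessage[capacity];
    }

    // Returns false when the ring is full; the caller decides how to back off.
    public bool TryEnqueue(ClientMessage message)
    {
        lock (_gate)
        {
            if (_count == _slots.Length) return false;
            _slots[(_head + _count) % _slots.Length] = message;
            _count++;
            return true;
        }
    }

    // Returns false when the ring is empty.
    public bool TryDequeue(out ClientMessage message)
    {
        lock (_gate)
        {
            if (_count == 0) { message = default(ClientMessage); return false; }
            message = _slots[_head];
            _head = (_head + 1) % _slots.Length;
            _count--;
            return true;
        }
    }
}
```

Whether garbage collection is actually behind the pauses can be checked first with the ".NET CLR Memory" performance counters before restructuring anything.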
Answer by user1496062
I had the same issue 7 or 8 years ago, with pauses of 100 ms to 1 s; the problem was garbage collection. About 400 MB of 4 GB was in use, but there were a lot of objects.
I ended up storing messages in C++, but you could use the ASP.NET cache (which used to use COM and moved them out of the heap).