Returning Large Results Via a Webservice (C#)

Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow.

Original URL: http://stackoverflow.com/questions/11804/
Asked by lomaxx
I'm working on a web service at the moment and there is the potential that the returned results could be quite large (> 5 MB).
It's perfectly valid for this set of data to be this large, and the web service can be called either synchronously or asynchronously, but I'm wondering what people's thoughts are on the following:
If the connection is lost, the entire resultset will have to be regenerated and sent again. Is there any way I can do any sort of "resume" if the connection is lost or reset?
Is sending a result set this large even appropriate? Would it be better to implement some sort of "paging" where the resultset is generated and stored on the server and the client can then download chunks of the resultset in smaller amounts and re-assemble the set at their end?
Accepted answer by DavidValeri
I have seen all three approaches, paged, store and retrieve, and massive push.
I think the solution to your problem depends to some extent on why your result set is so large and how it is generated. Do your results grow over time, are they calculated all at once and then pushed, do you want to stream them back as soon as you have them?
Paging Approach
In my experience, using a paging approach is appropriate when the client needs quick access to reasonably sized chunks of the result set similar to pages in search results. Considerations here are overall chattiness of your protocol, caching of the entire result set between client page requests, and/or the processing time it takes to generate a page of results.
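As a concrete illustration, a paged web method might look like the following minimal C# sketch. The `ResultPage` type, the `GetResultsPage` name, and the page size are assumptions for illustration, not from the answer:

```csharp
using System;
using System.Linq;

// Illustrative paged contract; the type and member names and the page
// size are assumptions, not part of the original answer.
public class ResultPage
{
    public int PageNumber;   // zero-based index of this page
    public int TotalCount;   // total result count, so the client knows when to stop
    public string[] Items;   // one page worth of results
}

public class ReportService
{
    const int PageSize = 500;

    public ResultPage GetResultsPage(int pageNumber)
    {
        // Stand-in for the real query; a production service would ask the
        // backing store for just this slice instead of materializing everything.
        string[] all = LoadAllResults();

        return new ResultPage
        {
            PageNumber = pageNumber,
            TotalCount = all.Length,
            Items = all.Skip(pageNumber * PageSize).Take(PageSize).ToArray(),
        };
    }

    string[] LoadAllResults() { /* query the backing store */ return new string[0]; }
}
```

The client keeps requesting pages until `PageNumber * PageSize + Items.Length` reaches `TotalCount`; page boundaries also give natural resume points if the connection drops mid-transfer.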
Store and retrieve
Store and retrieve is useful when the results are not random access and the result set grows in size as the query is processed. Issues to consider here are complexity for clients and if you can provide the user with partial results or if you need to calculate all results before returning anything to the client (think sorting of results from distributed search engines).
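One way to sketch store-and-retrieve in C#: the server accumulates results under a token while the query runs, and the client polls for chunks. The token/queue mechanics here are an illustrative assumption; a real service would persist stored results and expire abandoned ones:

```csharp
using System;
using System.Collections.Generic;

// Illustrative store-and-retrieve contract; all names and mechanics
// are assumptions for the sketch.
public class QueryService
{
    // Results accumulate server-side under a token while the query runs.
    static readonly Dictionary<Guid, Queue<string>> Pending =
        new Dictionary<Guid, Queue<string>>();
    static readonly HashSet<Guid> Finished = new HashSet<Guid>();

    public Guid BeginQuery(string criteria)
    {
        var token = Guid.NewGuid();
        Pending[token] = new Queue<string>();
        // Kick off the long-running query here; as rows arrive they are
        // enqueued, and the token is added to Finished when it completes.
        return token;
    }

    // The client polls with its token; partial results can be handed out
    // before the full set has been calculated.
    public string[] FetchNext(Guid token, int maxItems)
    {
        var queue = Pending[token];
        var batch = new List<string>();
        while (batch.Count < maxItems && queue.Count > 0)
            batch.Add(queue.Dequeue());
        return batch.ToArray();
    }

    public bool IsComplete(Guid token)
    {
        return Finished.Contains(token) && Pending[token].Count == 0;
    }
}
```

Note the trade-off the answer mentions: this works for partial results, but if results must be globally sorted before anything is returned, `FetchNext` cannot hand out rows until the whole query has finished.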
Massive Push
The massive push approach is almost certainly flawed. Even if the client needs all of the information and it needs to be pushed in a monolithic result set, I would recommend taking the approach of WS-ReliableMessaging (either directly or through your own simplified version) and chunking your results. By doing this you:
- ensure that the pieces reach the client
- can discard the chunk as soon as you get a receipt from the client
- can reduce the possible issues with memory consumption from having to retain 5MB of XML, DOM, or whatever in memory (assuming that you aren't processing the results in a streaming manner) on the server and client sides.
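The chunk-and-receipt loop described above could be sketched as follows. This is a deliberately simplified stand-in for WS-ReliableMessaging, not its actual API; the interface and chunk size are assumptions:

```csharp
using System;

// Simplified chunked push with per-chunk receipts; a stand-in for
// WS-ReliableMessaging, not its real protocol or API.
public interface IChunkClient
{
    bool SendChunk(int sequence, byte[] data); // true once the client acknowledges receipt
    void SendComplete(int totalChunks);        // lets the client verify reassembly
}

public static class ChunkedPusher
{
    const int ChunkSize = 64 * 1024; // 64 KB per chunk; tune for your transport

    public static void Push(byte[] result, IChunkClient client)
    {
        int sequence = 0;
        for (int offset = 0; offset < result.Length; offset += ChunkSize)
        {
            int size = Math.Min(ChunkSize, result.Length - offset);
            var chunk = new byte[size];
            Buffer.BlockCopy(result, offset, chunk, 0, size);

            // Resend until receipted; real code would bound the retries.
            while (!client.SendChunk(sequence, chunk)) { }

            sequence++; // receipt in hand: this chunk can now be discarded server-side
        }
        client.SendComplete(sequence);
    }
}
```

Only one chunk needs to be held in memory at a time on each side, which is what addresses the 5 MB-in-memory concern above.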
Like others have said though, don't do anything until you know that your result set size, how it is generated, and overall performance are actual issues.
Answered by Leon Bambrick
There's no hard rule against 5 MB as a result set size. Over 400 MB can be hard to send.
You'll automatically get async handlers (since you're using .NET).
implement some sort of "paging" where the resultset is generated and stored on the server and the client can then download chunks of the resultset in smaller amounts and re-assemble the set at their end
That's already happening for you -- it's called TCP/IP ;-) Re-implementing that could be overkill.
Similarly --
entire resultset will have to be regenerated and sent again
If it's MS-SQL, for example, that is generating most of the result set, then re-generating it will take advantage of some implicit caching in SQL Server, and subsequent generations will be quicker.
To some extent you can get away with not worrying about these problems, until they surface as 'real' problems -- because the platform(s) you're using take care of a lot of the performance bottlenecks for you.
Answered by Mike Stone
I somewhat disagree with secretGeek's comment:
That's already happening for you -- it's called TCP/IP ;-) Re-implementing that could be overkill.
There are times when you may want to do just this, but really only from a UI perspective. If you implement some way to either stream the data to the client (via something like a pushlets mechanism), or chunk it into pages as you suggest, you can then load some really small subset on the client and then slowly build up the UI with the full amount of data.
This makes for a slicker, speedier UI (from the user's perspective), but you have to evaluate if the extra effort will be worthwhile... because I don't think it will be an insignificant amount of work.
Answered by Leon Bambrick
So it sounds like you'd be interested in a solution that adds 'starting record number' and 'final record number' parameter to your web method. (or 'page number' and 'results per page')
This shouldn't be too hard if the backing store is SQL Server (or even MySQL), as they have built-in support for row numbering.
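On SQL Server (2005 and later) this maps onto `ROW_NUMBER()`; a sketch, where the `Results` table and its columns are invented for illustration:

```csharp
using System;
using System.Data.SqlClient;

// Paging with SQL Server's ROW_NUMBER(); the Results table and its
// column names are invented for illustration.
public static class PagedFetch
{
    const string PagedQuery = @"
        SELECT Id, Payload
        FROM (
            SELECT Id, Payload,
                   ROW_NUMBER() OVER (ORDER BY Id) AS RowNum
            FROM Results
        ) AS Numbered
        WHERE RowNum BETWEEN @StartRow AND @EndRow;";

    public static void StreamRange(SqlConnection connection,
                                   int startRecord, int finalRecord,
                                   Action<string> emit)
    {
        using (var cmd = new SqlCommand(PagedQuery, connection))
        {
            cmd.Parameters.AddWithValue("@StartRow", startRecord);
            cmd.Parameters.AddWithValue("@EndRow", finalRecord);
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    emit(reader.GetString(1)); // hand each row to the web response
            }
        }
    }
}
```

On MySQL the same slice comes from `LIMIT ... OFFSET ...`. Either way the row range encodes the client's position, so the server holds no per-client state.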
Even so, you should be able to avoid doing any session management on the server, avoid any explicit caching of the result set, and just rely on the backing store's caching to keep your life simple.