Original question: http://stackoverflow.com/questions/1869753/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Maximum size for a SQL Server Query? IN clause? Is there a Better Approach
Asked by BuddyJoe
Possible Duplicate:
T-SQL WHERE col IN (…)
What is the maximum size for a SQL Server query? (# of characters)
Max size for an IN clause? I think I saw something about Oracle having a 1000-item limit, which you could get around by ORing two INs together. Is there a similar issue in SQL Server?
UPDATE: So what would be the best approach if I need to take, say, 1000 GUIDs from another system (a non-relational database) and do a "JOIN in code" against the SQL Server? Is it to submit the list of 1000 GUIDs to an IN clause? Or is there another technique that works more efficiently?
I haven't tested this, but I wonder if I could submit the GUIDs as an XML doc. For example:
<guids>
<guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
<guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>
and then do some kind of XQuery JOIN against the doc and the table. Would that be less efficient than a 1000-item IN clause?
Accepted answer by Remus Rusanu
Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.
Other than that, your query is limited by runtime conditions. It will usually run out of stack space, because x IN (a,b,c) is nothing but x=a OR x=b OR x=c, which creates an expression tree similar to x=a OR (x=b OR (x=c)), so it gets very deep with a large number of ORs. SQL 7 would hit a stack overflow at about 10k values in the IN, but nowadays stacks are much deeper (because of x64), so it can go pretty deep.
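To make the expansion concrete, the two statements below should return the same rows. This is only an illustrative sketch: the table dbo.Items and its id column are made-up names, and the comment merely restates the tree shape described above.

```sql
-- Hypothetical table dbo.Items(id int); names are for illustration only.
SELECT id FROM dbo.Items WHERE id IN (1, 2, 3);

-- Logically the same predicate written as a chain of ORs,
-- i.e. the nested shape id = 1 OR (id = 2 OR (id = 3)).
SELECT id FROM dbo.Items WHERE id = 1 OR id = 2 OR id = 3;
```

With thousands of literals, the second form shows why the expression tree becomes so deep.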
Update
You already found Erland's article on the topic of passing lists/arrays to SQL Server. With SQL 2008 you also have Table Valued Parameters, which allow you to pass an entire DataTable as a single table type parameter and join on it.
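A minimal sketch of the table-valued parameter approach, assuming SQL Server 2008 or later. The names dbo.GuidList, dbo.GetRowsByGuid and dbo.MyTable are hypothetical and not from the original answer.

```sql
-- Hypothetical names throughout: dbo.GuidList, dbo.GetRowsByGuid, dbo.MyTable.
-- GO is the batch separator used by SSMS/sqlcmd; CREATE PROCEDURE must be
-- the first statement in its batch.
CREATE TYPE dbo.GuidList AS TABLE (guid uniqueidentifier NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.GetRowsByGuid
    @guids dbo.GuidList READONLY   -- table-valued parameters must be READONLY
AS
BEGIN
    SELECT t.*
    FROM dbo.MyTable AS t
    JOIN @guids AS g ON t.guid = g.guid;
END;
GO

-- Calling it from T-SQL; a client would instead bind a DataTable to @guids.
DECLARE @list dbo.GuidList;
INSERT INTO @list (guid) VALUES ('809674df-1c22-46eb-bf9a-33dc78beb44a');
INSERT INTO @list (guid) VALUES ('257f537f-9c6b-4f14-a90c-ee613b4287f3');
EXEC dbo.GetRowsByGuid @guids = @list;
```

The primary key on the table type is a design choice made here so the optimizer knows the values are unique; it is not required.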
XML with XPath is another viable solution:
SELECT ...
FROM Table
JOIN (
SELECT x.value(N'.',N'uniqueidentifier') as guid
FROM @values.nodes(N'/guids/guid') t(x)) as guids
ON Table.guid = guids.guid;
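For completeness, here is how the query above might look with @values declared and filled, using the GUIDs from the question. It is a sketch under assumptions: dbo.MyTable is a hypothetical table with a uniqueidentifier column named guid, and in practice @values would more likely arrive as a parameter.

```sql
-- @values is assumed to be an xml variable or parameter; dbo.MyTable is hypothetical.
DECLARE @values xml;
SET @values = N'<guids>
  <guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
  <guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>';

-- Shred the XML into one row per <guid> element and join on it.
SELECT t.*
FROM dbo.MyTable AS t
JOIN (
    SELECT x.value(N'.', N'uniqueidentifier') AS guid
    FROM @values.nodes(N'/guids/guid') AS n(x)
) AS guids
    ON t.guid = guids.guid;
```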
Answer by Andrew
The SQL Server maximums are documented at http://msdn.microsoft.com/en-us/library/ms143432.aspx (this is the 2008 version).
A SQL query can be a varchar(max) but is shown as limited to 65,536 * network packet size. Even then, what is most likely to trip you up is the limit of 2100 parameters per query. If SQL Server chooses to parameterize the literal values in the IN clause, I would think you would hit that limit first, but I haven't tested it.
Edit: Tested it. Even under forced parameterization it survived: I knocked up a quick test and had it executing with 30k items within the IN clause (SQL Server 2005).
At 100k items, it took some time and then failed with:
Msg 8623, Level 16, State 1, Line 1 The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
So 30k is possible, but just because you can do it does not mean you should :)
Edit: Continued due to additional question.
50k worked but 60k failed, so the limit is somewhere in between on my test rig.
In terms of how to do that join of the values without using a large IN clause, personally I would create a temp table, insert the values into that temp table, index it and then use it in a join, giving the optimiser the best opportunity to optimise the join. (Generating the index on the temp table will create stats for it, which will help the optimiser as a general rule, although with only 1000 GUIDs the stats will not be especially useful.)
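The temp table approach described above could be sketched as follows. The names #guids, IX_guids and dbo.MyTable are assumptions for illustration, and the two GUIDs are the ones from the question.

```sql
-- dbo.MyTable and its guid column are hypothetical names.
CREATE TABLE #guids (guid uniqueidentifier NOT NULL);

-- One INSERT per value works on any version; a client would typically
-- batch these or use a bulk-copy API instead.
INSERT INTO #guids (guid) VALUES ('809674df-1c22-46eb-bf9a-33dc78beb44a');
INSERT INTO #guids (guid) VALUES ('257f537f-9c6b-4f14-a90c-ee613b4287f3');

-- Index after loading; building the index also creates statistics on it.
CREATE UNIQUE CLUSTERED INDEX IX_guids ON #guids (guid);

SELECT t.*
FROM dbo.MyTable AS t
JOIN #guids AS g ON t.guid = g.guid;

DROP TABLE #guids;
```

Because the temp table is scoped to the session, the load, the join and the drop must all run on the same connection.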
Answer by gbn
Per batch, 65536 * network packet size, which is 4k by default, so 256 MB.
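The arithmetic can be checked directly, and the server-wide packet size can be read from sys.configurations. This is a rough sketch; note that an individual connection may negotiate a different packet size than the server default.

```sql
-- 65,536 * 4,096 bytes = 268,435,456 bytes = 256 MB
SELECT 65536 * 4096                 AS max_batch_bytes,
       65536 * 4096 / (1024 * 1024) AS max_batch_mb;

-- Server-wide configured packet size in bytes (4096 unless it was changed).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'network packet size (B)';
```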
However, IN will stop working well before that, though the exact limit is not precise.
You end up with memory errors but I can't recall the exact error. A huge IN will be inefficient anyway.
Edit: Remus reminded me: the error is about "stack size"
Answer by DaveE
Can you load the GUIDs into a scratch table and then do a
... WHERE var IN (SELECT guid FROM #scratchtable)