Original source: http://stackoverflow.com/questions/26236352/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Default row order in SELECT query - SQL Server 2008 vs SQL 2012
Asked by Channs
Our team recently upgraded our databases from SQL Server 2008 to SQL Server 2012. One breaking change we noticed was in the default order of rows returned by the SELECT statement, i.e. when an explicit ORDER BY clause is not specified.
As per MSDN, SQL Server 2012 does not guarantee the order of the rows returned unless an ORDER BY clause is specified.
We have 2500+ stored procedures across 5 databases that have SELECT statements without an ORDER BY clause and it will be a sizeable effort to add the ORDER BY clause manually to match the behavior in SQL Server 2008. Is there a setting or faster way of doing this?
The other option, which hasn't been explored, is to downgrade to SQL Server 2008. How difficult would this be?
Answered by Mike D.
You need to go back and add ORDER BY clauses to your code, because without them the order is never guaranteed. You were "lucky" in the past that you always got the same order, but that was not because SQL Server 2008 guaranteed it in any way. It most likely had to do with your indexes or with how the data was stored on disk.
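With thousands of procedures to review, a rough first pass can at least narrow down where to look. The sketch below is an assumption-laden heuristic, not a SQL parser: `needs_review` and the sample procedure bodies are hypothetical, and the definitions are assumed to have been exported as plain text already (in SQL Server they could be read from `sys.sql_modules`, for example). It flags any definition that contains a SELECT but no ORDER BY anywhere, after stripping comments.

```python
import re

# Hypothetical helper names; this is a text heuristic, not a T-SQL parser.
SELECT_RE = re.compile(r"\bSELECT\b", re.IGNORECASE)
ORDER_BY_RE = re.compile(r"\bORDER\s+BY\b", re.IGNORECASE)
COMMENT_RE = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)


def needs_review(definition: str) -> bool:
    """Return True when the text contains a SELECT and no ORDER BY at all."""
    code = COMMENT_RE.sub(" ", definition)  # ignore commented-out SQL
    return bool(SELECT_RE.search(code)) and not ORDER_BY_RE.search(code)


# Made-up procedure bodies, for illustration only.
procedures = {
    "GetOrders": "SELECT OrderId, Total FROM dbo.Orders WHERE CustomerId = @Id",
    "GetCustomers": "SELECT Name FROM dbo.Customers ORDER BY Name",
    "PurgeLogs": "DELETE FROM dbo.Logs WHERE CreatedAt < @Cutoff",
    "GetItems": "SELECT Sku FROM dbo.Items -- ORDER BY Sku",
}

flagged = sorted(name for name, body in procedures.items() if needs_review(body))
print(flagged)
```

The check is deliberately coarse. A procedure with several SELECT statements is not flagged if any one of them has an ORDER BY, and many flagged statements (single-row lookups, INSERT ... SELECT) may not need an ordering at all. Each hit still needs a human decision about which columns define the intended order.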
If you moved to a new host when you upgraded, the difference in hardware configuration alone could have changed the way your queries execute. On top of that, the new server would have recalculated statistics on the tables, and the SQL Server 2012 query optimizer probably does things a bit differently than the one in SQL Server 2008.
It is a fallacy that you can rely on the order of a result set in SQL without explicitly stating the order you want it in. SQL results NEVER have an order you can rely on without using an ORDER BY clause. SQL is built around set theory. Query results are basically sets (or multi-sets).
Itzik Ben-Gan gives a good description of set theory in relation to SQL in his book Microsoft SQL Server 2012 T-SQL Fundamentals:
Set theory, which originated with the mathematician Georg Cantor, is one of the mathematical branches on which the relational model is based. Cantor's definition of a set follows:
By a "set" we mean any collection M into a whole of definite, distinct objects m (which are called the "elements" of M) of our perception or of our thought. - Joseph W. Dauben and Georg Cantor (Princeton University Press, 1990)
After a thorough explanation of the terms in the definition, Itzik goes on to say:
What Cantor's definition of a set leaves out is probably as important as what it includes. Notice that the definition doesn't mention any order among the set elements. The order in which set elements are listed is not important. The formal notation for listing set elements uses curly brackets: {a, b, c}. Because order has no relevance you can express the same set as {b, a, c} or {b, c, a}. Jumping ahead to the set of attributes (called columns in SQL) that make up the header of a relation (called a table in SQL), an element is supposed to be identified by name - not ordinal position. Similarly, consider the set of tuples (called rows by SQL) that make up the body of the relation; an element is identified by its key values - not by position. Many programmers have a hard time adapting to the idea that, with respect to querying tables, there is no order among the rows. In other words, a query against a table can return rows in any order unless you explicitly request that the data be sorted in a specific way, perhaps for presentation purposes.
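The point about sets can be tried out directly. The following is a small Python analogy, not SQL: the `rows` list and its `id`/`name` fields are invented for illustration. It shows that a set compares equal however its elements are listed, that a row is found by its key value rather than its position, and that an explicit sort plays the role of ORDER BY.

```python
# Sets compare equal regardless of the order in which elements are listed.
assert {"a", "b", "c"} == {"b", "a", "c"} == {"b", "c", "a"}

# Toy relation: rows are identified by key value (here "id"), not by position.
rows = [
    {"id": 2, "name": "Bob"},
    {"id": 1, "name": "Alice"},
    {"id": 3, "name": "Carol"},
]
by_key = {row["id"]: row for row in rows}
print(by_key[1]["name"])  # Alice, wherever that row sits in the list

# An explicit sort is the analogue of ORDER BY: the order is requested,
# not assumed from how the rows happen to be stored.
ordered = sorted(rows, key=lambda row: row["id"])
print([row["id"] for row in ordered])  # [1, 2, 3]
```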
But regardless of the academic definition of a set, even the implementation in SQL Server has never guaranteed any order in the results. This MSDN blog post from 2005 by a member of the query optimizer team states that you should not rely on the order from intermediate operations at all.
The reordering rules can and will violate this assumption (and do so when it is inconvenient to you, the developer ;). Please understand that when we reorder operations to find a more efficient plan, we can cause the ordering behavior to change for intermediate nodes in the tree. If you've put an operation in the tree that assumes a particular intermediate ordering, it can break.
This blog post by Conor Cunningham (Architect, SQL Server Core Engine), "No Seatbelt - Expecting Order without ORDER BY", is about SQL Server 2008. He has a table with 20k rows and a single index, and a query against it appears to always return rows in the same order. Adding an ORDER BY to the query doesn't even change the execution plan, so adding one does not make the query more expensive when the optimizer realizes the sort is unnecessary. But once he adds another 20k rows to the table, the query plan suddenly changes: it now uses parallelism and the results are no longer ordered!
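To see why a parallel plan can disturb an order that a serial plan happened to produce, here is a toy model in Python. It is emphatically not how SQL Server distributes or gathers rows: `serial_scan`, `parallel_scan`, the worker count and the batch size are all made up, and a real engine's interleaving depends on thread timing rather than a fixed round-robin. The sketch only illustrates the general idea that each worker reads its own range and the consumer takes whatever batches arrive.

```python
def serial_scan(rows):
    """Toy serial plan: read the index from start to finish."""
    return list(rows)


def parallel_scan(rows, workers=4, batch=3):
    """Toy parallel plan: each worker owns a contiguous range of the index,
    and the consumer takes one batch from each worker in turn."""
    size = max(1, -(-len(rows) // workers))  # ceiling division, at least 1
    ranges = [rows[i:i + size] for i in range(0, len(rows), size)]
    out = []
    offset = 0
    while any(offset < len(r) for r in ranges):
        for r in ranges:
            out.extend(r[offset:offset + batch])
        offset += batch
    return out


index = list(range(1, 21))  # 20 key values, stored in key order

print(serial_scan(index) == index)            # the serial read looks "ordered"
print(parallel_scan(index))                   # same rows, interleaved
print(sorted(parallel_scan(index)) == index)  # an explicit sort restores order
```

Both scans return exactly the same set of rows. Only the explicit sort, the stand-in for ORDER BY here, makes the order something a caller can depend on.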
The hard part here is that there is no reasonable way for any external user to know when a plan will change. The space of all plans is huge and hurts your head to ponder. SQL Server's optimizer will change plans, even for simple queries, if enough of the parameters change. You may get lucky and not have a plan change, or you can just not think about this problem and add an ORDER BY.
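The "just add an ORDER BY" advice can be checked with a small experiment. This sketch uses Python's built-in `sqlite3` module as a stand-in, so it demonstrates standard SQL behaviour rather than anything specific to SQL Server; the `orders` table, its columns and the `load` helper are invented for illustration. Two tables receive the same rows in opposite insertion order, and the query with an explicit ORDER BY returns the same sequence from both.

```python
import sqlite3


def load(rows):
    """Create an in-memory table and insert the rows in the given order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany("INSERT INTO orders (id, customer) VALUES (?, ?)", rows)
    return conn


rows = [(3, "Carol"), (1, "Alice"), (4, "Alice"), (2, "Bob")]
first = load(rows)
second = load(list(reversed(rows)))

# Sorting by customer alone would leave the two "Alice" rows tied, and the
# order within a tie is unspecified. Adding the unique id as a tiebreaker
# makes the requested order fully determined.
query = "SELECT id, customer FROM orders ORDER BY customer, id"
result_a = first.execute(query).fetchall()
result_b = second.execute(query).fetchall()

print(result_a == result_b)
print(result_a)
```

When adding ORDER BY clauses to existing procedures, it is worth ending the sort list with a unique column for this reason. Otherwise rows that compare equal on the sort columns can still come back in a different order from one run to the next.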
If you need more convincing just read these posts: