SQL 选择 COUNT(*) 的子查询而不运行它两次
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/581440/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Select COUNT(*) of subquery without running it twice
提问by
I've got a procedure to return a result set which is limited by page number and some other stuff. As an OUTPUT parameter I need to return a total amount of selected rows according to the parameters except the page number. So I have something like that:
我有一个返回结果集的过程,该结果集受页码和其他一些东西的限制。作为 OUTPUT 参数,我需要根据除页码之外的参数返回选定行的总数。所以我有这样的事情:
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position
FROM Items
WHERE Row2 = @Row2)
SELECT Id, Row1, Row2
FROM SelectedItems
WHERE Position BETWEEN @From AND @To
And then I need to set the OUTPUT parameter to the number of rows in the innerquery. I can just copy the query and count it, but this query could returns thousands of rows (and will be more in the future), so I am looking for method to do that with a good performance. I was thinking about table variables, is it a good idea? Or any other suggestions?
然后我需要将 OUTPUT 参数设置为innerquery中的行数。我可以只复制查询并对其进行计数,但此查询可能返回数千行(将来还会更多),因此我正在寻找具有良好性能的方法。我在考虑表变量,这是个好主意吗?或者有什么其他建议?
To be more specific, it's the Microsoft SQL Server 2008.
更具体地说,它是 Microsoft SQL Server 2008。
Thank you, Jan
谢谢你,简
采纳答案by badbod99
You can count the total rows as a separate column in your main query using COUNT(*). Like this:
您可以使用 COUNT(*) 将总行数作为主查询中的单独列进行计数。像这样:
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position,
COUNT(*) OVER () AS TotalRows
FROM Items
WHERE Row2 = @Row2)
SELECT Id, Row1, Row2
FROM SelectedItems
WHERE Position BETWEEN @From AND @To
This will return the count in your result set rather than in a output parameter, but that should fit your requirements. Otherwise, combine with a temp table:
这将返回结果集中的计数而不是输出参数,但这应该符合您的要求。否则,结合临时表:
DECLARE @tmp TABLE (Id int, RowNum int, TotalRows int);
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position,
COUNT(*) OVER () AS TotalRows
FROM Items
WHERE Row2 = @Row2)
INSERT @tmp
SELECT Id, Row1, Row2
FROM SelectedItems
WHERE Position BETWEEN @From AND @To
SELECT TOP 1 @TotalRows = TotalRows FROM @tmp
SELECT * FROM @tmp
You will find using a temp table for just your paged result will not use much memory (depending on your page size of course) and you're only keeping it live for a short period of time. Selecting the full result set from the temp table and selecting the TotalRows will only take a tiny bit longer.
您会发现仅对分页结果使用临时表不会占用太多内存(当然取决于您的页面大小),而且您只会将其保留一小段时间。从临时表中选择完整的结果集并选择 TotalRows 只需要一点点时间。
This will be much faster than running a totally separate query, which in my test (repeating the WITH) doubled the execution time.
这将比运行完全独立的查询快得多,在我的测试中(重复 WITH)使执行时间加倍。
回答by Mehrdad Afshari
I think you should do it in a separate query. While those two queries might look pretty much the same, but the way query optimizer deals with them would differ pretty significantly.
我认为您应该在单独的查询中执行此操作。虽然这两个查询可能看起来几乎相同,但查询优化器处理它们的方式会有很大不同。
Theoretically, SQL Server might not even go through all the rows in the subquery to be able to count it.
从理论上讲,SQL Server 甚至可能无法遍历子查询中的所有行才能对其进行计数。
回答by Tom H
I don't have access to my code base right now, but I believe that you can use COUNT() OVER (or a similar command) to return the total number of rows as part of the subquery. You can then return that as part of the final result set. It gets duplicated in every row, but that's a minor performance hit in my opinion for an application that is using paging and should have limited final results anyway.
我现在无法访问我的代码库,但我相信您可以使用 COUNT() OVER(或类似命令)返回总行数作为子查询的一部分。然后,您可以将其作为最终结果集的一部分返回。它会在每一行中重复,但在我看来,对于使用分页并且最终结果应该有限的应用程序来说,这是一个很小的性能损失。
In a couple hours I'll post the exact code.
几个小时后,我将发布确切的代码。
EDIT: Here's the line that I've used to generate the count. In the end our developers wanted a separate method to get the count by itself, so now I'm maintaining the search criteria in two places within the same stored procedure.
编辑:这是我用来生成计数的行。最后,我们的开发人员想要一个单独的方法来自行获取计数,所以现在我在同一个存储过程中的两个地方维护搜索条件。
COUNT(*) OVER (PARTITION BY '') AS TotalCount
Add that to your CTE and then you can select the TotalCount and it will be a column in each of your rows.
将其添加到您的 CTE,然后您可以选择 TotalCount,它将成为您每一行中的一列。
回答by Triynko
You MUST run the entire query, without limiting the range, at least once in order to get the full row count. Since you're going to do this anyway, you should select @@RowCount in order to output the total rows found, rather than overloading your data reader with a redundant count(*) column in every row.
您必须在不限制范围的情况下运行整个查询,至少一次才能获得完整的行数。由于您无论如何都要这样做,您应该选择 @@RowCount 以输出找到的总行数,而不是在每一行中使用冗余 count(*) 列使数据读取器过载。
1. When running NEW query for first time:
1.第一次运行NEW查询时:
select YOUR_COLUMNS
from YOUR_TABLE
where YOUR_SEARCH_CONDITION
order by YOUR_COLUMN_ORDERING_LIST;
select @@rowcount;
2. Only READ the first X rows
2. 只读取前 X 行
The above query avoids flooding your SqlDataReader with a redundant COUNT(*) column, that would otherwise be sent for every call to SqlDataReader.Read(). Since you are running the query for the first time... instead of selecting a range, just READ only the first X rows.This gives you exactly what you want... the full result count, the first X records, and efficient streaming of the result set without the redundant count column.
上面的查询避免了用冗余的 COUNT(*) 列淹没 SqlDataReader,否则每次调用 SqlDataReader.Read() 时都会发送该列。 由于您是第一次运行查询......而不是选择一个范围,只需读取前 X 行。这正是您想要的……完整的结果计数、第一个 X 记录以及没有冗余计数列的结果集的高效流。
3. For subsequent runs of the SAME query to get a subset of the results
3. 对于 SAME 查询的后续运行以获取结果的子集
select YOUR_COLUMNS
from (select YOUR_COLUMNS, ROW_NUMBER()
over(order by BY YOUR_COLUMN_ORDERING_LIST) as RowNum) Results
where Results.RowNum between @From and @To;
In any case, @@rowcount
is the most direct way to access the count on the first run of the query without limiting the result set (ur gonna want first X results anyway), without running a separate count() query, without using a temp table, and without including a redundant count() column.
在任何情况下,这@@rowcount
是在不限制结果集的情况下访问第一次查询计数的最直接方法(无论如何你都会想要第一个 X 结果),无需运行单独的 count() 查询,无需使用临时表,并且不包括多余的 count() 列。
回答by SqlRyan
Couldn't you just set the output variable to @@RowCount? This will get the rows affected by the last executed statement:
您不能将输出变量设置为@@RowCount 吗?这将获取受上次执行的语句影响的行:
SELECT stuff FROM mytable
SET @output = @@ROWCOUNT
This should give you what you need, and doesn't involve running the query again.
这应该给你你需要的东西,并且不涉及再次运行查询。