Original question: http://stackoverflow.com/questions/243782/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Need a row count after SELECT statement: what's the optimal SQL approach?
Asked by antik
I'm trying to select a column from a single table (no joins) and I need the count of the number of rows, ideally before I begin retrieving the rows. I have come up with two approaches that provide the information I need.
Approach 1:
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
Then
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
Or Approach 2:
SELECT my_table.my_col, ( SELECT COUNT ( my_table.my_col )
FROM my_table
WHERE my_table.foo = 'bar' ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
I am doing this because my SQL driver (SQL Native Client 9.0) does not allow me to use SQLRowCount on a SELECT statement, but I need to know the number of rows in my result in order to allocate an array before assigning information to it. The use of a dynamically allocated container is, unfortunately, not an option in this area of my program.
I am concerned that the following scenario might occur:
- SELECT for count occurs
- Another instruction occurs, adding or removing a row
- SELECT for data occurs and suddenly the array is the wrong size.
- In the worst case, this will attempt to write data beyond the array's limits and crash my program.
Does Approach 2 prevent this issue?
Also, will one of the two approaches be faster? If so, which?
Finally, is there a better approach that I should consider (perhaps a way to instruct the driver to return the number of rows in a SELECT result using SQLRowCount?)
For those that asked, I am using Native C++ with the aforementioned SQL driver (provided by Microsoft.)
Accepted answer by Bill Karwin
There are only two ways to be 100% certain that the COUNT(*) and the actual query will give consistent results:
- Combine the COUNT(*) with the query, as in your Approach 2. I recommend the form you show in your example, not the correlated subquery form shown in the comment from kogus.
- Use two queries, as in your Approach 1, after starting a transaction in SNAPSHOT or SERIALIZABLE isolation level.
Using one of those isolation levels is important because any other isolation level allows new rows created by other clients to become visible in your current transaction. Read the MSDN documentation on SET TRANSACTION ISOLATION LEVEL for more details.
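The two-query option can be tried out locally with a small sketch. It is an illustration under assumptions, not the SQL Server setup from the question: it uses Python's sqlite3 module and a SQLite database in WAL mode, where a read transaction keeps one snapshot, as a stand-in for the SNAPSHOT isolation level. The table contents, the file name demo.db and the helper names are all hypothetical.

```python
import os
import sqlite3
import tempfile


def count_then_fetch(path):
    """Run the count query and the data query inside one read transaction."""
    # isolation_level=None stops the sqlite3 module from opening transactions
    # implicitly, so the explicit BEGIN/COMMIT below are the only ones in play.
    reader = sqlite3.connect(path, isolation_level=None)
    writer = sqlite3.connect(path, isolation_level=None)
    try:
        reader.execute("BEGIN")
        (row_count,) = reader.execute(
            "SELECT COUNT(*) FROM my_table WHERE foo = 'bar'"
        ).fetchone()

        # Simulate another client committing a matching row between the queries.
        writer.execute("INSERT INTO my_table (my_col, foo) VALUES ('late', 'bar')")

        rows = reader.execute(
            "SELECT my_col FROM my_table WHERE foo = 'bar'"
        ).fetchall()
        reader.execute("COMMIT")

        # Outside the transaction, the committed insert becomes visible.
        (count_after,) = reader.execute(
            "SELECT COUNT(*) FROM my_table WHERE foo = 'bar'"
        ).fetchone()
        return row_count, len(rows), count_after
    finally:
        reader.close()
        writer.close()


def demo():
    """Build a throwaway WAL-mode database (hypothetical data) and run the check."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(path)
        setup.execute("PRAGMA journal_mode=WAL")
        setup.execute("CREATE TABLE my_table (my_col TEXT, foo TEXT)")
        setup.executemany(
            "INSERT INTO my_table VALUES (?, ?)",
            [("a", "bar"), ("b", "bar"), ("c", "baz")],
        )
        setup.commit()
        setup.close()
        return count_then_fetch(path)
```

Inside the transaction the count and the number of fetched rows agree even though a row was committed in between; only the count taken after COMMIT reflects the insert. On SQL Server the rough equivalent would be issuing SET TRANSACTION ISOLATION LEVEL SNAPSHOT (or SERIALIZABLE) before starting the transaction.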
Answer by Adam Porad
If you're using SQL Server, after your query you can select the @@ROWCOUNT function (or, if your result set might have more than 2 billion rows, use the ROWCOUNT_BIG() function). This will return the number of rows selected by the previous statement or the number of rows affected by an insert/update/delete statement.
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
SELECT @@Rowcount
Or, if you want the row count included in the result set, similar to Approach #2, you can use the OVER clause.
SELECT my_table.my_col,
count(*) OVER(PARTITION BY my_table.foo) AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Using the OVER clause will have much better performance than using a subquery to get the row count. Using @@ROWCOUNT will have the best performance, because there won't be any query cost for the SELECT @@ROWCOUNT statement.
Update in response to comment: The example I gave would give the # of rows in the partition, defined in this case by "PARTITION BY my_table.foo". The value of the column in each row is the # of rows with the same value of my_table.foo. Since your example query had the clause "WHERE my_table.foo = 'bar'", all rows in the resultset will have the same value of my_table.foo, and therefore the value in the column will be the same for all rows and equal (in this case) to the # of rows in the query.
Here is a better/simpler example of how to include a column in each row that is the total # of rows in the resultset. Simply remove the optional Partition By clause.
SELECT my_table.my_col, count(*) OVER() AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
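As a rough, runnable illustration of the COUNT(*) OVER() form, here is a sketch that runs the same query shape against an in-memory SQLite database from Python. This is an assumption-laden stand-in for SQL Server: it needs SQLite 3.25 or later for window functions, and the sample rows and function names are made up for the example.

```python
import sqlite3


def fetch_with_total(conn):
    """Return (total, values); total is read from the window-function column."""
    cursor = conn.execute(
        "SELECT my_col, COUNT(*) OVER () AS row_count "
        "FROM my_table WHERE foo = 'bar' ORDER BY my_col"
    )
    total = 0  # stays 0 when no row matches, because no row carries the count
    values = []
    for my_col, row_count in cursor:
        total = row_count  # the same value is repeated on every row
        values.append(my_col)
    return total, values


def demo():
    """Hypothetical sample data: two rows match foo = 'bar', one does not."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE my_table (my_col TEXT, foo TEXT)")
        conn.executemany(
            "INSERT INTO my_table VALUES (?, ?)",
            [("a", "bar"), ("b", "bar"), ("c", "baz")],
        )
        return fetch_with_total(conn)
    finally:
        conn.close()
```

One thing the sketch makes visible: when the WHERE clause matches nothing, there is no row to carry the count, so the caller has to treat an empty result as a count of zero.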
Answer by JosephStyons
Approach 2 will always return a count that matches your result set.
I suggest you link the sub-query to your outer query though, to guarantee that the condition on your count matches the condition on the dataset.
SELECT
mt.my_row,
(SELECT COUNT(mt2.my_row) FROM my_table mt2 WHERE mt2.foo = mt.foo) as cnt
FROM my_table mt
WHERE mt.foo = 'bar';
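To show how a count carried on each row could drive a one-time allocation, here is a sketch in Python with sqlite3. It uses the uncorrelated scalar-subquery form from the question's Approach 2 and fills a fixed-size list, with a guard so that a mismatch raises an error instead of writing past the end. SQLite, the sample rows and the function names are assumptions for illustration; the original program is C++ on SQL Server.

```python
import sqlite3

# Approach 2: every row carries the count computed by the same statement.
QUERY = """
SELECT my_col,
       (SELECT COUNT(*) FROM my_table WHERE foo = 'bar') AS row_count
FROM my_table
WHERE foo = 'bar'
ORDER BY my_col
"""


def fill_fixed_buffer(conn):
    """Allocate a list once from the first row's count, then fill it in place."""
    cursor = conn.execute(QUERY)
    first = cursor.fetchone()
    if first is None:
        return []  # no matching rows, so there is nothing to allocate
    buffer = [None] * first[1]
    buffer[0] = first[0]
    for index, (my_col, _row_count) in enumerate(cursor, start=1):
        if index >= len(buffer):
            # Fail loudly instead of writing past the end of the buffer.
            raise RuntimeError("result set is larger than the reported count")
        buffer[index] = my_col
    return buffer


def demo():
    """Hypothetical sample data: two rows match foo = 'bar', one does not."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE my_table (my_col TEXT, foo TEXT)")
        conn.executemany(
            "INSERT INTO my_table VALUES (?, ?)",
            [("a", "bar"), ("b", "bar"), ("c", "baz")],
        )
        return fill_fixed_buffer(conn)
    finally:
        conn.close()
```

The bounds check is cheap insurance: even if the count and the rows were ever to disagree, the failure mode is an exception rather than an out-of-range write.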
Answer by Joe Pineda
If you're concerned that the number of rows that meet the condition may change in the few milliseconds between execution of the query and retrieval of the results, you could/should execute the queries inside a transaction:
BEGIN TRAN bogus
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus
This would return the correct values, always.
Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by the last statement, and redirect the output of the real query to a temp table or table variable, so you can return everything together, with no need for a transaction:
DECLARE @dummy INT
SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'
SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table
Answer by BoltBait
If you are really concerned that your row count will change between the select count and the select statement, why not select your rows into a temp table first? That way, you know you will be in sync.
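A minimal sketch of the temp-table idea, assuming SQLite via Python's sqlite3 module rather than SQL Server: the matching rows are copied into a temporary table once, and both the count and the data are then read from that copy. The table name temp_rows, the sample rows and the simulated late insert are hypothetical.

```python
import sqlite3


def snapshot_into_temp_table(conn):
    """Copy the matching rows once, then count and read from the copy."""
    conn.execute(
        "CREATE TEMP TABLE temp_rows AS "
        "SELECT my_col FROM my_table WHERE foo = 'bar'"
    )
    (row_count,) = conn.execute("SELECT COUNT(*) FROM temp_rows").fetchone()

    # Simulate a change to the base table after the copy was taken.
    conn.execute("INSERT INTO my_table (my_col, foo) VALUES ('late', 'bar')")

    rows = [
        my_col
        for (my_col,) in conn.execute("SELECT my_col FROM temp_rows ORDER BY my_col")
    ]
    conn.execute("DROP TABLE temp_rows")
    return row_count, rows


def demo():
    """Hypothetical sample data: two rows match foo = 'bar', one does not."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE my_table (my_col TEXT, foo TEXT)")
        conn.executemany(
            "INSERT INTO my_table VALUES (?, ?)",
            [("a", "bar"), ("b", "bar"), ("c", "baz")],
        )
        return snapshot_into_temp_table(conn)
    finally:
        conn.close()
```

The count and the rows stay in sync because both come from the copy; the trade-off is the extra write of the whole result set into the temporary table.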
Answer by Robert Gamble
Here are some ideas:
- Go with Approach #1 and resize the array to hold additional results, or use a type that automatically resizes as necessary (you don't mention what language you are using so I can't be more specific).
- You could execute both statements in Approach #1 within a transaction to guarantee the counts are the same both times if your database supports this.
- I'm not sure what you are doing with the data but if it is possible to process the results without storing all of them first this might be the best method.
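The first and third ideas can be sketched briefly in Python with sqlite3. This only illustrates the pattern: the asker's program is C++ and rules out dynamic containers in that part of the code, and the sample data and function names here are made up.

```python
import sqlite3

SELECT_ROWS = "SELECT my_col FROM my_table WHERE foo = 'bar' ORDER BY my_col"


def collect_rows(conn):
    """Idea 1: a growable list takes however many rows actually arrive."""
    return [my_col for (my_col,) in conn.execute(SELECT_ROWS)]


def longest_value(conn):
    """Idea 3: handle each row as it streams in, without storing the result set."""
    longest = ""
    for (my_col,) in conn.execute(SELECT_ROWS):
        if len(my_col) > len(longest):
            longest = my_col
    return longest


def demo():
    """Hypothetical sample data: two rows match foo = 'bar', one does not."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE my_table (my_col TEXT, foo TEXT)")
        conn.executemany(
            "INSERT INTO my_table VALUES (?, ?)",
            [("alpha", "bar"), ("be", "bar"), ("c", "baz")],
        )
        return collect_rows(conn), longest_value(conn)
    finally:
        conn.close()
```

Neither function needs the row count up front, which sidesteps the race between the count query and the data query entirely.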
Answer by dkretz
You might want to think about a better pattern for dealing with data of this type.
No self-respecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a transaction, which creates problems of its own).
The number of rows won't change - google for ACID and SQL.
Answer by jonnii
Why don't you put your results into a vector? That way you don't have to know the size beforehand.
Answer by Deepfreezed
IF (@@ROWCOUNT > 0)
BEGIN
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
END
Answer by Tschallacka
Just adding this because this is the top result on Google for this question. In SQLite I used this to get the row count.
WITH temptable AS
(SELECT one,two
FROM
(SELECT one, two
FROM table3
WHERE dimension=0
UNION ALL SELECT one, two
FROM table2
WHERE dimension=0
UNION ALL SELECT one, two
FROM table1
WHERE dimension=0)
ORDER BY date DESC)
SELECT *
FROM temptable
LEFT JOIN
(SELECT count(*)/7 AS cnt,
0 AS bonus
FROM temptable) counter
WHERE 0 = counter.bonus