Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/580639/

Time: 2020-09-01 01:12:59  Source: igfitidea

How to randomly select rows in SQL?

Tags: sql, database, random

Asked by Prashant

I am using Microsoft SQL Server 2005. In my database, I have a table "customerNames" with two columns, "Id" and "Name", and approximately 1,000 rows.

I am building a feature that has to pick 5 customers at random every time. Can anyone tell me how to write a query that returns 5 random rows (Id and Name) each time it is executed?

Answered by Curtis Tasker

SELECT TOP 5 Id, Name FROM customerNames
ORDER BY NEWID()

That said, everybody seems to come to this page for the more general answer to your question:

Selecting a random row in SQL

Select a random row with MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

Select a random row with PostgreSQL:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

Select a random row with Microsoft SQL Server:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

Select a random row with IBM DB2:

SELECT column, RAND() as IDX 
FROM table 
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Select a random record with Oracle:

SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1

Select a random row with SQLite:

SELECT column FROM table 
ORDER BY RANDOM() LIMIT 1
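
As a quick way to try the SQLite variant end to end, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and its sample rows are made up for illustration, and the helper name pick_random_customers is hypothetical; only the ORDER BY RANDOM() LIMIT n pattern comes from the answer above.

```python
import sqlite3


def pick_random_customers(conn, n=5):
    """Return n rows chosen at random by sorting on RANDOM()."""
    cur = conn.execute(
        "SELECT Id, Name FROM customerNames ORDER BY RANDOM() LIMIT ?", (n,)
    )
    return cur.fetchall()


# Hypothetical sample data: 1,000 customers in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customerNames (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO customerNames (Id, Name) VALUES (?, ?)",
    [(i, f"Customer {i}") for i in range(1, 1001)],
)

rows = pick_random_customers(conn)
print(rows)  # five (Id, Name) tuples, different on each run
```

Because the whole table is sorted on a random key, this is fine for about 1,000 rows but gets expensive as the table grows.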

Answered by Cody Caughlan

SELECT TOP 5 Id, Name FROM customerNames ORDER BY NEWID()

Answered by Barry Brown

In case someone wants a PostgreSQL solution:

select id, name
from customer
order by random()
limit 5;

Answered by Barry Brown

Maybe this site will be of assistance.

For those who don't want to click through:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

Answered by JohnC

There is a nice Microsoft SQL Server 2005 specific solution here. It deals with the case where you are working with a large result set (not what the question asked, I know).

Selecting Rows Randomly from a Large Table: http://msdn.microsoft.com/en-us/library/cc441928.aspx

Answered by Tohid

If you have a table with millions of rows and care about performance, this could be a better answer:

SELECT * FROM Table1
WHERE (ABS(CAST(
  (BINARY_CHECKSUM
  (keycol1, NEWID())) as int))
  % 100) < 10

https://msdn.microsoft.com/en-us/library/cc441928.aspx
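
The idea behind this query is that each row is kept independently with a probability of roughly 10 percent, so the table is filtered in a single scan and never sorted; the number of rows returned therefore varies from run to run. The sketch below reproduces that idea in SQLite through Python's sqlite3 module. It swaps BINARY_CHECKSUM and NEWID() for SQLite's RANDOM(), and the table contents and the helper name sample_about_percent are hypothetical.

```python
import sqlite3


def sample_about_percent(conn, percent):
    """Keep each row independently with a probability of about percent/100."""
    # RANDOM() is evaluated once per row. Masking with the largest signed
    # 64-bit integer clears the sign bit, and % 100 maps the result to 0..99.
    cur = conn.execute(
        "SELECT id FROM Table1 "
        "WHERE (RANDOM() & 9223372036854775807) % 100 < ?",
        (percent,),
    )
    return [row[0] for row in cur]


# Hypothetical data: 10,000 rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO Table1 (id) VALUES (?)", [(i,) for i in range(1, 10001)]
)

sample = sample_about_percent(conn, 10)
print(len(sample))  # varies per run, typically close to 1,000
```

If an exact number of rows is needed, the filtered result can still be trimmed with a LIMIT (or TOP on SQL Server), at the cost of favoring rows that appear earlier in the scan.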

Answered by RIanGillis

This is an old question, but computing a new value for every row (whether NEWID() or rand()) and then sorting on it would be prohibitively expensive on a table with a large number of rows. If you have incremental, unique IDs with no holes, it is more efficient to generate the X random IDs to select directly, instead of assigning a GUID or similar to every single row and then taking the top X.

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];

DECLARE @randomId1 int, @randomId2 int, @randomId3 int, @randomId4 int, @randomId5 int
SET @randomId1 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId2 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId3 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId4 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId5 = ((@maxValue + 1) - @minValue) * Rand() + @minValue

--select @maxValue as MaxValue, @minValue as MinValue
--  , @randomId1 as SelectedId1
--  , @randomId2 as SelectedId2
--  , @randomId3 as SelectedId3
--  , @randomId4 as SelectedId4
--  , @randomId5 as SelectedId5

select * from [TABLE] el
where el.id in (@randomId1, @randomId2, @randomId3, @randomId4, @randomId5)
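
The same approach can be sketched outside T-SQL. The version below uses Python's sqlite3 and random modules against a hypothetical items table whose ids run from 1 to 1,000 with no gaps. Note one deliberate substitution: the T-SQL above can draw the same ID twice and then return fewer than 5 rows, whereas this sketch uses random.sample so the chosen IDs are always distinct.

```python
import random
import sqlite3


def pick_by_random_ids(conn, k=5):
    """Pick k rows by drawing k distinct IDs from the min..max range.

    Assumes ids are contiguous (no holes); a missing id would simply
    produce fewer rows than requested.
    """
    min_id, max_id = conn.execute(
        "SELECT MIN(id), MAX(id) FROM items"
    ).fetchone()
    ids = random.sample(range(min_id, max_id + 1), k)
    placeholders = ", ".join("?" for _ in ids)
    query = f"SELECT id, name FROM items WHERE id IN ({placeholders})"
    return conn.execute(query, ids).fetchall()


# Hypothetical data: ids 1..1000 with no gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item {i}") for i in range(1, 1001)],
)

print(pick_by_random_ids(conn))
```

Only the min/max lookup and k primary-key lookups touch the table, which is why this scales better than sorting every row on a random value.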

If you wanted to select many more rows I would look into populating a #tempTable with an ID and a bunch of rand() values then using each rand() value to scale to the min-max values. That way you do not have to define all of the @randomId1...n parameters. I've included an example below using a CTE to populate the initial table.

DECLARE @NumItems int = 100;

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];
DECLARE @range int = @maxValue+1 - @minValue;

with cte (n) as (
   select 1 union all
   select n+1 from cte
   where n < @NumItems
)
select cast( @range * rand(cast(newid() as varbinary(100))) + @minValue as int) tp
into #Nt
from cte;

select * from #Nt ntt
inner join [TABLE] i on i.id = ntt.tp;

drop table #Nt;
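
For comparison, here is a rough equivalent of the temp-table variant using Python's sqlite3 module. It is only a sketch under stated assumptions: the items table, its gap-free ids, and the helper name pick_many_with_temp_table are hypothetical, and the random IDs are generated in Python rather than by a recursive CTE. As in the T-SQL above, IDs are drawn with replacement, so the same row can appear more than once.

```python
import random
import sqlite3


def pick_many_with_temp_table(conn, num_items):
    """Join the table against a temp table of randomly generated IDs."""
    min_id, max_id = conn.execute(
        "SELECT MIN(id), MAX(id) FROM items"
    ).fetchone()
    # Drawn with replacement: duplicates are possible, as in the original.
    picks = [(random.randint(min_id, max_id),) for _ in range(num_items)]
    conn.execute("CREATE TEMP TABLE nt (tp INTEGER)")
    try:
        conn.executemany("INSERT INTO nt (tp) VALUES (?)", picks)
        return conn.execute(
            "SELECT i.id, i.name FROM nt JOIN items i ON i.id = nt.tp"
        ).fetchall()
    finally:
        # Always clean up the temp table, even if the query fails.
        conn.execute("DROP TABLE nt")


# Hypothetical data: ids 1..1000 with no gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item {i}") for i in range(1, 1001)],
)

rows = pick_many_with_temp_table(conn, 100)
print(len(rows))
```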

Answered by Narendra

SELECT * FROM TABLENAME ORDER BY random() LIMIT 5; 

Answered by Billy

I have found this to work best for big data.

SELECT TOP 1 Column_Name FROM dbo.Table TABLESAMPLE(1 PERCENT);

TABLESAMPLE(n ROWS) or TABLESAMPLE(n PERCENT) is random, but you need to add TOP n to get the correct sample size.

Using NEWID() is very slow on large tables.

Answered by Vlad Mihalcea

As I explained in this article, in order to shuffle the SQL result set, you need to use a database-specific function call.

Note that sorting a large result set using a RANDOM function might turn out to be very slow, so make sure you do that on small result sets.

If you have to shuffle a large result set and limit it afterward, then it's better to use something like Oracle SAMPLE(N) or TABLESAMPLE in SQL Server or PostgreSQL instead of a random function in the ORDER BY clause.

So, assuming we have a database table named song that contains the following rows:

| id | artist                          | title                              |
|----|---------------------------------|------------------------------------|
| 1  | Miyagi & Эндшпиль ft. Рем Дигга | I Got Love                         |
| 2  | HAIM                            | Don't Save Me (Cyril Hahn Remix)   |
| 3  | 2Pac ft. DMX                    | Rise Of A Champion (GalilHD Remix) |
| 4  | Ed Sheeran & Passenger          | No Diggity (Kygo Remix)            |
| 5  | JP Cooper ft. Mali-Koa          | All This Love                      |

Oracle

On Oracle, you need to use the DBMS_RANDOM.VALUE function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY DBMS_RANDOM.VALUE

When running the aforementioned SQL query on Oracle, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| JP Cooper ft. Mali-Koa - All This Love            |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are listed in random order, thanks to the DBMS_RANDOM.VALUE function call used by the ORDER BY clause.

SQL Server

On SQL Server, you need to use the NEWID function, as illustrated by the following example:

SELECT
    CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY NEWID()

When running the aforementioned SQL query on SQL Server, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| JP Cooper ft. Mali-Koa - All This Love            |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |

Notice that the songs are listed in random order, thanks to the NEWID function call used by the ORDER BY clause.

PostgreSQL

On PostgreSQL, you need to use the random function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY random()

When running the aforementioned SQL query on PostgreSQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are listed in random order, thanks to the random function call used by the ORDER BY clause.

MySQL

On MySQL, you need to use the RAND function, as illustrated by the following example:

SELECT
  CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY RAND()

When running the aforementioned SQL query on MySQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |

Notice that the songs are listed in random order, thanks to the RAND function call used by the ORDER BY clause.