postgresql 为每组选择随机行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/16044828/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Select random row for each group
提问by speendo
I have a table like this
我有一张这样的桌子
ID ATTRIBUTE
1 A
1 A
1 B
1 C
2 B
2 C
2 C
3 A
3 B
3 C
I'd like to select just onerandom attribute for each ID. The result therefore could look like this (although this is just one of many options
我只想为每个 ID选择一个随机属性。因此,结果可能如下所示(尽管这只是众多选择之一
ATTRIBUTE
B
C
C
This is my attempt on this problem
这是我对这个问题的尝试
SELECT
"ATTRIBUTE"
FROM
(
SELECT
"ID",
"ATTRIBUTE",
row_number() OVER (PARTITION BY "ID" ORDER BY random()) rownum
FROM
table
) shuffled
WHERE
rownum = 1
however, I don't know if this is a good solution, as I need to introduce row numbers, which is a bit cumbersome.
但是,我不知道这是否是一个好的解决方案,因为我需要引入行号,这有点麻烦。
Do you have a better one?
你有更好的吗?
回答by Clodoaldo Neto
select distinct on (id) id, attribute
from like_this
order by id, random()
If you only need the attribute column:
如果您只需要属性列:
select distinct on (id) attribute
from like_this
order by id, random()
Notice that you still need to order by id
first as it is a column of the distinct on
.
请注意,您仍然需要先订购,id
因为它是distinct on
.
If you only want the distinct attributes:
如果您只想要不同的属性:
select distinct attribute
from (
select distinct on (id) attribute
from like_this
order by id, random()
) s
回答by Turdus
Put a big random number in front of each record (id) and choose within each group the record with the lowest random number.
在每条记录(id)前面放一个大的随机数,并在每组中选择随机数最小的记录。
$ cat test.txt
\N 1 a
\N 2 b
\N 2 c
\N 2 d
\N 3 e
\N 4 f
$ mysql
USE test;
DROP TABLE test;
CREATE TABLE test (id0 INT NOT NULL AUTO_INCREMENT, id VARCHAR(1), attribute VARCHAR(1), PRIMARY KEY (id0));
LOAD DATA LOCAL INFILE '~/mysql/test.txt' INTO TABLE test FIELDS TERMINATED BY '\t';
DROP TABLE rtest;
CREATE TABLE rtest (random INT(8), id0 VARCHAR(1), id VARCHAR(1), attribute VARCHAR(1), PRIMARY KEY (id, random));
INSERT INTO rtest
SELECT CAST(1000000. * rand() AS INT) AS random, test.* FROM test;
SELECT rtest.* FROM rtest,
(SELECT id, min(random) AS random FROM rtest GROUP BY id) AS sample WHERE rtest.random=sample.random AND rtest.id=sample.id;