Original URL: http://stackoverflow.com/questions/1641718/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to select unique records by SQL
Asked by Yinan
When I run "SELECT * FROM table" I get results like the following:
1 item1 data1
2 item1 data2
3 item2 data3
4 item3 data4
As you can see, there are duplicate records in column2 (item1 is duplicated). How can I get a result like this instead:
1 item1 data1
2 item2 data3
3 item3 data4
Only one record is returned for each duplicate, along with the rest of the unique records.
Answered by mjallday
With the DISTINCT keyword, applied to one or more column names, you get distinct records:
SELECT DISTINCT column1, column2, ...
FROM table_name;
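As a rough illustration of how DISTINCT behaves on the data from the question, here is a minimal sketch that runs the query through SQLite using Python's built-in sqlite3 module. The table name records and the column names id, item and data are assumptions; the question does not name them.

```python
import sqlite3

# Hypothetical table and column names; the question does not name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# DISTINCT on a single column collapses the repeated item1.
one_col = conn.execute(
    "SELECT DISTINCT item FROM records ORDER BY item"
).fetchall()

# DISTINCT on two columns compares the whole (item, data) pair,
# so both item1 rows survive because their data values differ.
two_cols = conn.execute(
    "SELECT DISTINCT item, data FROM records ORDER BY item, data"
).fetchall()

print(one_col)        # [('item1',), ('item2',), ('item3',)]
print(len(two_cols))  # 4
```

Note that DISTINCT applies to the full list of selected columns, so on its own it cannot return one whole row per item when the other columns differ.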
Answered by rahul
If you only need to remove duplicates, use DISTINCT. GROUP BY should be used when you want to apply aggregate operators to each group.
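The difference can be sketched on the question's data as follows, again using SQLite through Python's sqlite3 module. The names records, id, item and data are assumed for illustration only.

```python
import sqlite3

# Hypothetical table and column names; the question does not name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# DISTINCT only removes duplicate values.
distinct_rows = conn.execute(
    "SELECT DISTINCT item FROM records ORDER BY item"
).fetchall()

# GROUP BY yields the same groups and can also aggregate within each one.
grouped_rows = conn.execute(
    "SELECT item, COUNT(*) FROM records GROUP BY item ORDER BY item"
).fetchall()

print(distinct_rows)  # [('item1',), ('item2',), ('item3',)]
print(grouped_rows)   # [('item1', 2), ('item2', 1), ('item3', 1)]
```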
Answered by Dave Barker
It depends on which row you want to return for each unique item. Your data seems to indicate that you want the minimum data value, so in this instance, for SQL Server:
SELECT item, min(data)
FROM table
GROUP BY item
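To check this against the question's sample rows, here is a small sketch that runs the same GROUP BY query in SQLite via Python's sqlite3 module rather than SQL Server. The table name records and its column names are assumptions.

```python
import sqlite3

# Hypothetical table and column names; the question does not name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# One row per item, keeping the smallest data value in each group.
rows = conn.execute(
    "SELECT item, MIN(data) FROM records GROUP BY item ORDER BY item"
).fetchall()

print(rows)  # [('item1', 'data1'), ('item2', 'data3'), ('item3', 'data4')]
```

Each aggregate is computed independently, so if you add MIN() over several columns, the values in one output row are not guaranteed to come from the same source row.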
Answered by isapir
There are four methods you can use:
- DISTINCT
- GROUP BY
- Subquery
- Common Table Expression (CTE) with ROW_NUMBER()
Consider the following sample table with test data:
/** Create test table */
CREATE TEMPORARY TABLE dupes(word text, num int, id int);
/** Add test data with duplicates */
INSERT INTO dupes(word, num, id)
VALUES ('aaa', 100, 1)
,('bbb', 200, 2)
,('ccc', 300, 3)
,('bbb', 400, 4)
,('bbb', 200, 5) -- duplicate
,('ccc', 300, 6) -- duplicate
,('ddd', 400, 7)
,('bbb', 400, 8) -- duplicate
,('aaa', 100, 9) -- duplicate
,('ccc', 300, 10); -- duplicate
Option 1: SELECT DISTINCT
This is the simplest and most straightforward way, but also the most limited:
SELECT DISTINCT word, num
FROM dupes
ORDER BY word, num;
/*
word|num|
----|---|
aaa |100|
bbb |200|
bbb |400|
ccc |300|
ddd |400|
*/
Option 2: GROUP BY
Grouping allows you to add aggregated data, such as min(id), max(id), count(*), etc.:
SELECT word, num, min(id), max(id), count(*)
FROM dupes
GROUP BY word, num
ORDER BY word, num;
/*
word|num|min|max|count|
----|---|---|---|-----|
aaa |100| 1| 9| 2|
bbb |200| 2| 5| 2|
bbb |400| 4| 8| 2|
ccc |300| 3| 10| 3|
ddd |400| 7| 7| 1|
*/
Option 3: Subquery
Using a subquery, you can first identify the duplicate rows to ignore, and then filter them out in the outer query with the WHERE NOT IN (subquery) construct:
/** Find the higher id values of duplicates, distinct only added for clarity */
SELECT distinct d2.id
FROM dupes d1
INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
WHERE d2.id > d1.id
/*
id|
--|
5|
6|
8|
9|
10|
*/
/** Use the previous query in a subquery to exclude the duplicates with higher id values */
SELECT *
FROM dupes
WHERE id NOT IN (
SELECT d2.id
FROM dupes d1
INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
WHERE d2.id > d1.id
)
ORDER BY word, num;
/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/
Option 4: Common Table Expression with ROW_NUMBER()
In the Common Table Expression (CTE), select the ROW_NUMBER(), partitioned by the grouping columns and ordered in the desired order. Then SELECT only the records that have ROW_NUMBER() = 1:
WITH CTE AS (
SELECT *
,row_number() OVER(PARTITION BY word, num ORDER BY id) AS row_num
FROM dupes
)
SELECT word, num, id
FROM cte
WHERE row_num = 1
ORDER BY word, num;
/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/
Answered by Ankit Kashyap
Just use an inner join, because GROUP BY alone won't work when you select multiple columns: you get an error saying a column is not contained in either an aggregate function or the GROUP BY clause.
SELECT a.*
FROM yourtable a
INNER JOIN
(SELECT yourcolumn,
MIN(id) as id
FROM yourtable
GROUP BY yourcolumn
) AS b
ON a.yourcolumn= b.yourcolumn
AND a.id = b.id;
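Applied to the question's data, the join can be sketched like this, using SQLite through Python's sqlite3 module. The names records, id, item and data stand in for yourtable and yourcolumn and are assumptions.

```python
import sqlite3

# Hypothetical table and column names; the question does not name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# The derived table b picks the lowest id per item; joining back on
# (item, id) returns the complete row for each of those ids.
rows = conn.execute("""
    SELECT a.*
    FROM records a
    INNER JOIN (
        SELECT item, MIN(id) AS id
        FROM records
        GROUP BY item
    ) AS b
      ON a.item = b.item
     AND a.id = b.id
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 'item1', 'data1'), (3, 'item2', 'data3'), (4, 'item3', 'data4')]
```

Unlike a plain GROUP BY, this keeps every column of the chosen row together, and the returned ids are the original ones (1, 3, 4).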
Answered by John Hamelink
I find that if I can't use DISTINCT for any reason, then GROUP BY will work.
Answered by htafoya
To get all the columns in your result, you need to write something like:
SELECT distinct a, Table.* FROM Table
It will place a as the first column, and the rest will be all of the columns in the same order as in your table definition. That is, column a will be repeated.
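A quick way to see what this query returns is to run it on the question's data. The sketch below does so in SQLite via Python's sqlite3 module, with item playing the role of a; the table and column names are assumptions.

```python
import sqlite3

# Hypothetical table and column names; the question does not name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# item is selected first, then every column of the table again.
rows = conn.execute(
    "SELECT DISTINCT item, records.* FROM records ORDER BY id"
).fetchall()

print(rows[0])    # ('item1', 1, 'item1', 'data1')
print(len(rows))  # 4
```

Because DISTINCT compares the entire selected row, both item1 rows are still returned here: they differ in id and data. The query only removes rows that are identical in every column.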
Answered by Shailendra Singhai
SELECT eff_st
FROM (SELECT eff_st, ROW_NUMBER() OVER (PARTITION BY eff_st) AS xyz FROM ABC.CODE_DIM)
WHERE xyz = 1
ORDER BY eff_st FETCH FIRST 5 ROWS ONLY;