SQL: How to select unique records using SQL

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/1641718/

Time: 2020-09-01 04:11:52  Source: igfitidea

How to select unique records by SQL

sql, database

Asked by Yinan

When I run "SELECT * FROM table", I get results like these:

1 item1 data1
2 item1 data2
3 item2 data3
4 item3 data4

As you can see, there are duplicate records in column2 (item1 is duplicated). How can I get a result like this instead:

1 item1 data1
2 item2 data3
3 item3 data4

Only one record is returned for each duplicated value, along with the rest of the unique records.

Answered by mjallday

Using the DISTINCT keyword with one or more column names, you get distinct records:

SELECT DISTINCT column1, column2, ...
FROM table_name;
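
As a quick way to see what this does with the question's data, here is a minimal sketch using Python's built-in sqlite3 module. The table name items and the column names id, item, and data are assumptions, since the question does not name them. It shows that DISTINCT compares the whole selected column list, so adding a second column can bring the repeated item1 back.

```python
import sqlite3

# Hypothetical names (items, id, item, data) modeled on the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# DISTINCT on a single column collapses the repeated item1.
single = conn.execute(
    "SELECT DISTINCT item FROM items ORDER BY item"
).fetchall()

# DISTINCT on two columns compares (item, data) pairs, so both item1 rows survive.
pair = conn.execute(
    "SELECT DISTINCT item, data FROM items ORDER BY item, data"
).fetchall()

print(single)
print(pair)
```

With this data, DISTINCT alone cannot return the other columns of one chosen row per item; the GROUP BY, join, and ROW_NUMBER() approaches in the other answers address that.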

Answered by rahul

If you only need to remove duplicates, use DISTINCT. GROUP BY should be used when you want to apply aggregate operators to each group.

GROUP BY v DISTINCT
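
The difference can be illustrated with a small sketch in Python's sqlite3 module. The table and column names (items, id, item, data) are assumptions based on the question's sample rows. Without aggregates, the two queries return the same rows here; GROUP BY becomes the right tool once a per-group aggregate such as COUNT(*) is needed.

```python
import sqlite3

# Hypothetical names (items, id, item, data) modeled on the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# Plain de-duplication: DISTINCT is enough.
distinct_rows = conn.execute(
    "SELECT DISTINCT item FROM items ORDER BY item"
).fetchall()

# GROUP BY without aggregates yields the same rows for this data.
grouped_rows = conn.execute(
    "SELECT item FROM items GROUP BY item ORDER BY item"
).fetchall()

# GROUP BY with an aggregate: one row per item plus how many rows it had.
counted_rows = conn.execute(
    "SELECT item, COUNT(*) FROM items GROUP BY item ORDER BY item"
).fetchall()

print(distinct_rows)
print(grouped_rows)
print(counted_rows)
```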

Answered by Dave Barker

It depends on which row you want to return for each unique item. Your data seems to indicate the minimum data value, so in this instance, for SQL Server:

SELECT item, min(data)
FROM  table
GROUP BY item
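
To make the "which row" choice concrete, the following sketch runs the same kind of query against the question's sample rows using Python's sqlite3 module rather than SQL Server. The table name items and its column names are assumptions. Swapping MIN for MAX picks the other item1 row.

```python
import sqlite3

# Hypothetical names (items, id, item, data) modeled on the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# Smallest data value per item (string comparison, so data1 < data2).
lowest = conn.execute(
    "SELECT item, MIN(data) FROM items GROUP BY item ORDER BY item"
).fetchall()

# Largest data value per item: a different row is chosen for item1.
highest = conn.execute(
    "SELECT item, MAX(data) FROM items GROUP BY item ORDER BY item"
).fetchall()

print(lowest)
print(highest)
```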

Answered by isapir

There are 4 methods you can use:

  1. DISTINCT
  2. GROUP BY
  3. Subquery
  4. Common Table Expression (CTE) with ROW_NUMBER()

Consider the following sample TABLE with test data:

/** Create test table */
CREATE TEMPORARY TABLE dupes(word text, num int, id int);

/** Add test data with duplicates */
INSERT INTO dupes(word, num, id)
VALUES ('aaa', 100, 1)
      ,('bbb', 200, 2)
      ,('ccc', 300, 3)
      ,('bbb', 400, 4)
      ,('bbb', 200, 5)     -- duplicate
      ,('ccc', 300, 6)     -- duplicate
      ,('ddd', 400, 7)
      ,('bbb', 400, 8)     -- duplicate
      ,('aaa', 100, 9)     -- duplicate
      ,('ccc', 300, 10);   -- duplicate

Option 1: SELECT DISTINCT

This is the simplest and most straightforward way, but also the most limited:

SELECT DISTINCT word, num 
FROM    dupes
ORDER BY word, num;

/*
word|num|
----|---|
aaa |100|
bbb |200|
bbb |400|
ccc |300|
ddd |400|
*/

Option 2: GROUP BY

Grouping allows you to add aggregated data, such as min(id), max(id), count(*), etc.:

SELECT  word, num, min(id), max(id), count(*)
FROM    dupes
GROUP BY word, num
ORDER BY word, num;

/*
word|num|min|max|count|
----|---|---|---|-----|
aaa |100|  1|  9|    2|
bbb |200|  2|  5|    2|
bbb |400|  4|  8|    2|
ccc |300|  3| 10|    3|
ddd |400|  7|  7|    1|
*/

Option 3: Subquery

Using a subquery, you can first identify the duplicate rows to ignore, and then filter them out in the outer query with the WHERE NOT IN (subquery) construct:

/** Find the higher id values of duplicates, distinct only added for clarity */
    SELECT  distinct d2.id
    FROM    dupes d1
        INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
    WHERE d2.id > d1.id

/*
id|
--|
 5|
 6|
 8|
 9|
10|
*/

/** Use the previous query in a subquery to exclude the duplicates with higher id values */
SELECT  *
FROM    dupes
WHERE   id NOT IN (
    SELECT  d2.id
    FROM    dupes d1
        INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
    WHERE d2.id > d1.id
)
ORDER BY word, num;

/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/

Option 4: Common Table Expression with ROW_NUMBER()

In the Common Table Expression (CTE), select the ROW_NUMBER(), partitioned by the group column and ordered in the desired order. Then SELECT only the records that have ROW_NUMBER() = 1:

WITH CTE AS (
    SELECT  *
           ,row_number() OVER(PARTITION BY word, num ORDER BY id) AS row_num
    FROM    dupes
)
SELECT  word, num, id 
FROM    cte
WHERE   row_num = 1
ORDER BY word, num;

/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/
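
The "desired order" in the OVER clause decides which row of each group is kept. As an illustration that is not part of the original answer, the sketch below reloads the same dupes data into an in-memory SQLite database via Python's sqlite3 module and orders by id DESC, which keeps the highest id per (word, num) group instead of the lowest. It assumes the bundled SQLite is version 3.25 or later, where window functions are available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dupes (word TEXT, num INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO dupes (word, num, id) VALUES (?, ?, ?)",
    [("aaa", 100, 1), ("bbb", 200, 2), ("ccc", 300, 3), ("bbb", 400, 4),
     ("bbb", 200, 5), ("ccc", 300, 6), ("ddd", 400, 7), ("bbb", 400, 8),
     ("aaa", 100, 9), ("ccc", 300, 10)],
)

# Same CTE as above, but ORDER BY id DESC makes row_num = 1 the newest row.
latest = conn.execute(
    """
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY word, num ORDER BY id DESC) AS row_num
        FROM dupes
    )
    SELECT word, num, id
    FROM cte
    WHERE row_num = 1
    ORDER BY word, num
    """
).fetchall()

for row in latest:
    print(row)
```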

Answered by Ankit Kashyap

Just use an inner join, because GROUP BY won't work when you select multiple columns: it fails with an error saying a column is not contained in either an aggregate function or the GROUP BY clause.

SELECT a.*
FROM yourtable a
INNER JOIN 
  (SELECT yourcolumn,
    MIN(id) as id
  FROM yourtable 
  GROUP BY yourcolumn
) AS b
  ON a.yourcolumn= b.yourcolumn
  AND a.id = b.id;
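
Applied to the question's sample rows, the join pattern can be sketched as follows with Python's sqlite3 module. The names items, id, item, and data are assumptions standing in for yourtable and yourcolumn. The derived table finds the lowest id per item, and the join pulls back every column of exactly those rows.

```python
import sqlite3

# Hypothetical names (items, id, item, data) modeled on the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, item TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "item1", "data1"), (2, "item1", "data2"),
     (3, "item2", "data3"), (4, "item3", "data4")],
)

# b holds one (item, lowest id) pair per item; joining on both keeps full rows.
first_rows = conn.execute(
    """
    SELECT a.*
    FROM items a
    INNER JOIN (
        SELECT item, MIN(id) AS id
        FROM items
        GROUP BY item
    ) AS b
      ON a.item = b.item
     AND a.id = b.id
    ORDER BY a.id
    """
).fetchall()

for row in first_rows:
    print(row)
```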

Answered by John Hamelink

I find that if I can't use DISTINCT for any reason, then GROUP BY will work.

Answered by htafoya

To get all the columns in your result, you need to write something like:

SELECT distinct a, Table.* FROM Table

It will place a as the first column, and the rest will be ALL of the columns in the same order as in your table definition. That is, column a will be repeated.
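
A small sketch of this behavior, using Python's sqlite3 module and a hypothetical two-column table t (columns a and b, not from the original answer), may help. Note that DISTINCT still applies to the whole output row, so rows that share a value of a but differ in another column are all returned.

```python
import sqlite3

# Hypothetical table t with columns a and b, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (1, "x"), (1, "y"), (2, "z")],
)

# Column a appears twice: once explicitly, once as part of t.*.
# ORDER BY uses output positions (1 = a, 3 = b) to avoid the duplicated name.
rows = conn.execute(
    "SELECT DISTINCT a, t.* FROM t ORDER BY 1, 3"
).fetchall()

for row in rows:
    print(row)
```

Only the exact duplicate (1, "x") is collapsed here; both rows with a = 1 remain because their b values differ.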

Answered by Shailendra Singhai

SELECT EFF_ST
FROM (
    SELECT EFF_ST, ROW_NUMBER() OVER (PARTITION BY EFF_ST) XYZ
    FROM ABC.CODE_DIM
)
WHERE XYZ = 1
ORDER BY EFF_ST
FETCH FIRST 5 ROWS ONLY