SQL/mysql - Select distinct/UNIQUE but return all columns?
Original source: http://stackoverflow.com/questions/6127338/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Asked by aryaxt
SELECT DISTINCT field1, field2, field3, ...... FROM table
I am trying to accomplish the SQL statement above, but I want it to return all columns. Is this possible? Something like:
SELECT DISTINCT field1, * from table
Answered by Denis de Bernardy
You're looking for a group by:
select *
from table
group by field1
Which can occasionally be written with a distinct on clause (PostgreSQL syntax shown):
select distinct on (field1) *
from table
On most platforms, however, neither of the above will work because the behavior on the other columns is unspecified. (The first works in MySQL, if that's what you're using, provided the ONLY_FULL_GROUP_BY SQL mode is disabled; it is enabled by default since MySQL 5.7.5.)
You could fetch the distinct fields and stick to picking a single arbitrary row each time.
On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:
select *
from (
    select *,
           row_number() over (partition by field1 order by field2) as rn
    from table
) as ranked
where rn = 1
On platforms without window functions (MySQL before 8.0, SQLite before 3.25), you'll need to write subqueries that join the entire table with itself, so this is not recommended.
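For illustration, here is a minimal sketch of the self-join approach, run against an in-memory SQLite database from Python. The table name items, its id column, and the sample data are assumptions made up for this example; the query keeps, for each field1, the row with the smallest field2.

```python
import sqlite3

# Hypothetical sample table; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, field1 TEXT, field2 INTEGER);
    INSERT INTO items (field1, field2) VALUES
        ('a', 30), ('a', 10), ('b', 20), ('b', 40), ('c', 50);
""")

# Find the smallest field2 per field1, then join back to fetch the full row.
rows = conn.execute("""
    SELECT t.*
    FROM items AS t
    JOIN (
        SELECT field1, MIN(field2) AS min_field2
        FROM items
        GROUP BY field1
    ) AS firsts
      ON firsts.field1 = t.field1
     AND firsts.min_field2 = t.field2
    ORDER BY t.field1
""").fetchall()

for row in rows:
    print(row)
```

Note that if two rows tie on the smallest field2 within a group, this query returns both of them, so a unique tie-breaker (such as the primary key) is needed when exactly one row per group is required.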
Answered by Costi Ciudatu
From the phrasing of your question, I understand that you want to select the distinct values for a given field and, for each such value, to have all the other column values in the same row listed. Most DBMSs will not allow this with either DISTINCT or GROUP BY, because the result is not determined.
Think of it like this: if your field1 occurs more than once, what value of field2 will be listed (given that you have the same value for field1 in two rows but two distinct values of field2 in those two rows)?
You can, however, use aggregate functions (explicitly, for every field that you want shown) together with GROUP BY instead of DISTINCT:
SELECT field1, MAX(field2), COUNT(field3), SUM(field4), .... FROM table GROUP BY field1
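As a rough sketch of what such an aggregate query returns, the following runs it on an in-memory SQLite database from Python. The table name sales and its contents are assumptions invented for the example.

```python
import sqlite3

# Hypothetical sample table; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (field1 TEXT, field2 INTEGER, field3 TEXT, field4 REAL);
    INSERT INTO sales VALUES
        ('a', 1, 'x',  10.0),
        ('a', 5, NULL,  2.5),
        ('b', 3, 'y',   4.0);
""")

# One output row per field1; every other column goes through an aggregate.
agg_rows = conn.execute("""
    SELECT field1, MAX(field2), COUNT(field3), SUM(field4)
    FROM sales
    GROUP BY field1
    ORDER BY field1
""").fetchall()

for row in agg_rows:
    print(row)
```

Keep in mind that each aggregate is computed independently, so the values in one output row may come from different source rows (here MAX(field2) for 'a' comes from the second row, while the only non-NULL field3 comes from the first).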
Answered by rocklandcitizen
If I understood your problem correctly, it's similar to one I just had. You want to be able to limit the scope of DISTINCT to a specified field, rather than applying it to all the data.
If you use GROUP BY without an aggregate function, whichever field you GROUP BY will be your DISTINCT field.
If you make your query:
SELECT * from table GROUP BY field1;
It will show all your results based on a single instance of field1.
For example, suppose you have a table with name, address and city. A single person has multiple addresses recorded, but you just want a single address per person. You can query as follows:
SELECT * FROM persons GROUP BY name;
The result will be that only one instance of that name will appear with its address, and the others will be omitted from the resulting table. Caution: if your fields hold atomic values such as firstName and lastName, you want to group by both.
SELECT * FROM persons GROUP BY lastName, firstName;
because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to take those things into consideration. Hope this helps.
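To make the caution about grouping columns concrete, here is a small sketch using an in-memory SQLite database from Python. SQLite, like MySQL with ONLY_FULL_GROUP_BY disabled, accepts SELECT * together with GROUP BY; the persons data below is made up for the example, and which address survives for each group is arbitrary.

```python
import sqlite3

# Hypothetical sample data; two different people share the last name Smith.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (firstName TEXT, lastName TEXT, address TEXT);
    INSERT INTO persons VALUES
        ('Ann', 'Smith', '1 Main St'),
        ('Ann', 'Smith', '9 Oak Ave'),
        ('Bob', 'Smith', '4 Elm Rd'),
        ('Cat', 'Jones', '7 Pine Ln');
""")

# Grouping by lastName alone collapses Ann Smith and Bob Smith into one row.
by_last = conn.execute(
    "SELECT * FROM persons GROUP BY lastName"
).fetchall()

# Grouping by both name columns keeps one row per person.
by_both = conn.execute(
    "SELECT * FROM persons GROUP BY lastName, firstName"
).fetchall()

print(len(by_last), "rows when grouping by lastName")
print(len(by_both), "rows when grouping by lastName, firstName")
```

Stricter engines (for example SQL Server, or MySQL in its default mode since 5.7.5) reject these queries outright, so this only demonstrates the permissive behavior the answer relies on.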
Answered by Stormy
SELECT c2.field1,
       field2
FROM (SELECT DISTINCT
             field1
      FROM dbo.TABLE AS C
     ) AS c1
JOIN dbo.TABLE AS c2 ON c1.field1 = c2.field1
Answered by Ulf Sanne
That's a really good question. I have read some useful answers here already, but probably I can add a more precise explanation.
Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you have the following table 'locations'.
--country--  --city--
France       Lyon
Poland       Krakow
France       Paris
France       Marseille
Italy        Milano
Now the query
SELECT country FROM locations
GROUP BY country
will result in:
--country--
France
Poland
Italy
However, the following query
SELECT country, city FROM locations
GROUP BY country
...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?
In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare string values by alphabetical order.
SELECT country, MAX(city) FROM locations
GROUP BY country
will result in:
--country--  --city--
France       Paris
Poland       Krakow
Italy        Milano
or:
SELECT country, MIN(city) FROM locations
GROUP BY country
will result in:
--country--  --city--
France       Lyon
Poland       Krakow
Italy        Milano
These functions are a good solution as long as you are fine with selecting your value from either end of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.
The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hand:
SELECT
    countrylist.*,
    (SELECT TOP 1 city
     FROM locations
     WHERE country = countrylist.country
       AND city LIKE 'M%'
    ) AS city
FROM
    (SELECT country FROM locations
     GROUP BY country) countrylist
will result in:
--country--  --city--
France       Marseille
Poland       NULL
Italy        Milano
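The three variants above can be tried out with a short sketch against an in-memory SQLite database from Python. This is an adaptation, not the original T-SQL: SQLite has no TOP, so the correlated subquery uses LIMIT 1 instead, and the ORDER BY clauses are added here only to make the output order predictable.

```python
import sqlite3

# Rebuild the 'locations' table from the answer in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (country TEXT, city TEXT);
    INSERT INTO locations VALUES
        ('France', 'Lyon'), ('Poland', 'Krakow'), ('France', 'Paris'),
        ('France', 'Marseille'), ('Italy', 'Milano');
""")

# Alphabetically last and first city per country.
max_rows = conn.execute("""
    SELECT country, MAX(city) FROM locations
    GROUP BY country ORDER BY country
""").fetchall()
min_rows = conn.execute("""
    SELECT country, MIN(city) FROM locations
    GROUP BY country ORDER BY country
""").fetchall()

# One city starting with 'M' per country, NULL (None) when there is none.
m_rows = conn.execute("""
    SELECT countrylist.country,
           (SELECT city
            FROM locations
            WHERE country = countrylist.country
              AND city LIKE 'M%'
            ORDER BY city
            LIMIT 1) AS city
    FROM (SELECT country FROM locations GROUP BY country) AS countrylist
    ORDER BY countrylist.country
""").fetchall()

print(max_rows)
print(min_rows)
print(m_rows)
```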
Answered by Pedro Ramos
Try
SELECT table.* FROM table
WHERE otherField = 'otherValue'
GROUP BY table.fieldWantedToBeDistinct
limit x
Answered by Garrett Simpson
Great question @aryaxt -- you can tell it was a great question because you asked it 5 years ago and I stumbled upon it today trying to find the answer!
I just tried to edit the accepted answer to include this, but in case my edit does not make it in:
If your table is not that large, and assuming your primary key is an auto-incrementing integer, you could do something like this:
SELECT
    table.*
FROM table
-- join against one id per field value so dupes can be taken out later
LEFT JOIN (
    SELECT field, MAX(id) AS id
    FROM table
    GROUP BY field
) AS noDupes ON noDupes.id = table.id
WHERE
    -- this will result in only the last instance being seen
    noDupes.id IS NOT NULL
Answered by user2225399
You can do it with a WITH clause.
For example:
-- rowid is the row-identifier pseudo-column (available in e.g. Oracle and SQLite)
WITH c AS (SELECT MIN(rowid) AS rid FROM tableName GROUP BY a, b, c)
SELECT r.* FROM tableName r, c WHERE c.rid = r.rowid
This also allows you to select only the rows selected in the WITH clause's query.
Answered by dotjoe
For SQL Server you can use dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...
with t as (
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
    select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
    select
        *,
        total_dr_rows = count(*) over(partition by dr)
    from (
        select
            *,
            dr = dense_rank() over(order by col1, col2, col3),
            dr_rn = row_number() over(partition by col1, col2, col3 order by other)
        from
            t
    ) x
)
select * from tdr where total_dr_rows > 1
This takes a row count for each distinct combination of col1, col2, and col3.
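A simplified version of the same idea can be sketched with an in-memory SQLite database from Python, assuming SQLite 3.25 or newer (the first release with window functions). This is not the original T-SQL: it skips dense_rank and counts rows directly per (col1, col2, col3) partition, which is enough to keep every row whose combination occurs more than once.

```python
import sqlite3

# Same sample rows as in the answer, loaded into an in-memory table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT, other TEXT);
    INSERT INTO t VALUES
        ('a', 'b', 'c', 'r1'), ('c', 'b', 'a', 'r2'), ('a', 'b', 'c', 'r3'),
        ('a', 'b', 'c', 'r4'), ('c', 'b', 'a', 'r5'), ('a', 'a', 'a', 'r6');
""")

# Attach the size of each (col1, col2, col3) group to every row,
# then keep only the rows whose group has more than one member.
dupes = conn.execute("""
    SELECT col1, col2, col3, other, total_rows
    FROM (
        SELECT *,
               COUNT(*) OVER (PARTITION BY col1, col2, col3) AS total_rows
        FROM t
    ) AS counted
    WHERE total_rows > 1
    ORDER BY other
""").fetchall()

for row in dupes:
    print(row)
```

Row r6 is the only one left out, since its combination ('a', 'a', 'a') appears just once.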
Answered by KadoJ
select min(table.id), table.column1
from table
group by table.column1