SQL Hive DISTINCT() 用于所有列?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/42794151/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Hive DISTINCT() for all columns?
提问by user2205916
I wanted clarification as to how DISTINCT works, using a toy example below. Suppose you had a table like so, with 2 columns and only 2 rows of data:
我想使用下面的玩具示例来说明 DISTINCT 的工作原理。假设你有一个像这样的表,有 2 列,只有 2 行数据:
SELECT *
FROM table1;
colA colB
A B
A C
Suppose I ran a SELECT DISTINCT query:
假设我运行了一个 SELECT DISTINCT 查询:
SELECT DISTINCT colA, colB
FROM table1;
Which of the following results would be returned for the query above?
上述查询将返回以下哪些结果?
Outcome possibility 1:
结果可能性1:
A B
A C
The thinking for this possibility is that while the values are not distinct on colA
, the entire returned row is unique, or distinct, when both columns are considered. I'm uncertain because of the effect of the comma between colA
and colB
in SELECT DISTINCT colA, colB
and whether it serves to limit the DISTINCT
to colA
.
这种可能性的想法是,虽然值在 上不colA
不同,但当考虑两列时,整个返回的行是唯一的或不同的。我不确定,因为colA
和colB
in之间的逗号的影响SELECT DISTINCT colA, colB
以及它是否用于限制DISTINCT
to colA
。
Outcome possibility 2:
结果可能性2:
A B
Outcome possibility 3:
结果可能性3:
A C
回答by Faraz
The DISTINCT apply on complete row.
DISTINCT 适用于完整的行。
A B
A C
The above two rows a different so the result will be like Outcome possibility 1 i.e.
以上两行不同,所以结果将类似于结果可能性 1,即
A B
A C
回答by Gordon Linoff
Perhaps this will help. This query:
也许这会有所帮助。这个查询:
SELECT DISTINCT colA, colB
FROM table1;
is functionally equivalent to:
在功能上等同于:
SELECT colA, colB
FROM table1
GROUP BY colA, colB;
It is going to return all pairs of colA
/colB
that appear in the data.
它将返回出现在数据中的所有colA
/对colB
。