Original source: http://stackoverflow.com/questions/38961921/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow
How to use an "in" clause in "having" in HIVE?
Asked by Hunle
I have my data in sometable like this:
col1 col2 col3
A B 3
A B 1
A B 2
C B 1
And I want to get all of the unique groups of col1 and col2 that contain certain rows of col3. Like, all groups of col1 and col2 that contain a "2".
I wanted to do something like this:
select col1, col2 from sometable
group by col1, col2
having col3=1 and col3=2
But I want it to only return groups that have an instance of both 1 and 2 in col3. So the query should return this:
col1 col2
A B
How do I express this in HIVE? THANK YOU.
Answered by Matt
I don't know why others deleted answers that were correct, or almost correct, but I will put theirs back up.
SELECT col1, col2, COUNT(DISTINCT col3)
FROM
sometable
WHERE
col3 IN (1,2)
GROUP BY col1, col2
HAVING
COUNT(DISTINCT col3) > 1
If you actually want to return all of the records that meet your criteria you need to do a sub select and join back to the main table to get them.
SELECT s.*
FROM
sometable s
INNER JOIN (
SELECT col1, col2, COUNT(DISTINCT col3)
FROM
sometable
WHERE
col3 IN (1,2)
GROUP BY col1, col2
HAVING
COUNT(DISTINCT col3) > 1
) t
ON s.Col1 = t.Col1
AND s.Col2 = t.Col2
AND s.col3 IN (1,2)
The gist of this is to narrow/filter your rowset to the rows that you want to test, col3 IN (1,2), then count the DISTINCT values of col3 to make sure both 1 and 2 exist and not just 1 & 1 or 2 & 2.
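For comparison, the same "group must contain both 1 and 2" check can probably also be written with conditional aggregation directly in the HAVING clause, without the WHERE filter. The following is only a sketch, not part of the original answer: the CTE name sample_rows is made up so the query is self-contained, and it assumes a Hive version that supports CTEs and SELECT without FROM.

```sql
-- Sample rows from the question, inlined as a hypothetical CTE (sample_rows)
-- so the query can run on its own. With real data, drop the CTE and
-- select from sometable instead.
WITH sample_rows AS (
  SELECT 'A' AS col1, 'B' AS col2, 3 AS col3
  UNION ALL SELECT 'A', 'B', 1
  UNION ALL SELECT 'A', 'B', 2
  UNION ALL SELECT 'C', 'B', 1
)
SELECT col1, col2
FROM sample_rows
GROUP BY col1, col2
-- Each SUM counts how many rows in the group carry the wanted value;
-- requiring both counts to be positive means the group contains a 1 and a 2.
HAVING SUM(CASE WHEN col3 = 1 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN col3 = 2 THEN 1 ELSE 0 END) > 0;
```

On the sample data this should return the single group A, B, since the C, B group only has a 1. The COUNT(DISTINCT col3) version above is shorter when the list of required values is long; this form may be easier to read when each value needs its own condition.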
Answered by user7751206
I think the query below will be useful for your question.
select col1,col2
from Abc
group by col1,col2
having count(col1) >1 AND COUNT(COL2)>2