Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/38961921/

Date: 2020-09-01 04:47:21  Source: igfitidea

How to use an "in" clause in "having" in Hive?

Tags: sql, sqlite, hadoop, hive

Asked by Hunle

I have my data in sometable like this:

col1    col2    col3   
A       B       3
A       B       1
A       B       2
C       B       1

I want to get all of the unique (col1, col2) groups that contain certain values in col3. For example, all groups of col1 and col2 that contain a 2.

I wanted to do something like this:

select col1, col2 from sometable 
group by col1, col2
having col3=1 and col3=2

But I want it to return only the groups that have both a 1 and a 2 in col3. So the query should return this:

   col1    col2
   A       B

How do I express this in Hive? Thank you.

Answered by Matt

I don't know why others deleted answers that were correct, or nearly correct, but I will put theirs back up.

SELECT col1, col2, COUNT(DISTINCT col3)
FROM
    sometable
WHERE
    col3 IN (1,2)
GROUP BY col1, col2
HAVING
    COUNT(DISTINCT col3) > 1

If you actually want to return all of the records that meet your criteria, you need a subselect that is joined back to the main table:

SELECT s.*
FROM
    sometable s
    INNER JOIN (
       SELECT col1, col2, COUNT(DISTINCT col3)
       FROM
          sometable
       WHERE
          col3 IN (1,2)
       GROUP BY col1, col2
       HAVING
          COUNT(DISTINCT col3) > 1
    ) t
    ON s.col1 = t.col1
    AND s.col2 = t.col2
    AND s.col3 IN (1,2)

The gist is to first filter the rowset down to the rows you want to test (col3 IN (1,2)), then count the DISTINCT values of col3 in each group to make sure that both 1 and 2 exist, and not just 1 and 1 or 2 and 2.
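
If you would rather keep the whole test inside HAVING, as the question tried to do, conditional aggregation is one possible alternative. The query below is only a sketch: it assumes the table and column names from the question (sometable, col1, col2, col3), and I have not run it against a specific Hive version. It uses only SUM and CASE, which are standard SQL.

```sql
-- Sketch: keep only the (col1, col2) groups whose col3 values include both 1 and 2.
-- Each SUM counts the rows in the group that carry one of the required values.
SELECT col1, col2
FROM sometable
GROUP BY col1, col2
HAVING SUM(CASE WHEN col3 = 1 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN col3 = 2 THEN 1 ELSE 0 END) > 0;
```

On the sample data in the question, only the group (A, B) has both a 1 and a 2 in col3, so it is the only group kept; (C, B) has only a 1 and is dropped. Unlike the WHERE col3 IN (1,2) version, this form does not pre-filter the rows, so every row of the table is aggregated.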

Answered by user7751206

I think the query below will be useful for your question.

select col1,col2
from Abc
group by col1,col2
having count(col1) > 1 and count(col2) > 2