Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/9923587/

Time: 2020-09-01 15:04:27  Source: igfitidea

Hive: Finding max value in a group by

Tags: sql, hive, max

Asked by TopCoder

I have a Hive table like this:

create external table test(
  test_id string,
  test_name string,
  description string,
  clicks int,
  last_referred_click_date string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE LOCATION  '{some_location}';

I need to find the total clicks for each test_id and the last click date (the max date within that test_id group).

I am doing something like this:

insert overwrite table test partition(weekending='{input_date}')
  select s.test_id,s.test_name,s.description,max(click_date),
    sum(t.click) as clicks
   group by s.test_id,s.test_name,s.description order by clicks desc; 

Does the max() function work for strings? My click_date is in the format 'yyyy-mm-dd' and is a string data type. If not, what can I do here? A UDF?

Answered by Shahbaz Chishty

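SELECT s.test_id,
       s.test_name,
       s.description,
       -- Hive has DATE and TIMESTAMP types, not DateTime
       MAX(CAST(s.last_referred_click_date AS DATE)) AS last_click_date,
       SUM(s.clicks) AS total_clicks  -- only alias s is in scope here
FROM test s
WHERE s.test_id = '1'  -- test_id is a string column
GROUP BY s.test_id, s.test_name, s.description
ORDER BY total_clicks DESC;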
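
On the question of whether max() works on strings: as far as I know, Hive's max() accepts string columns and compares them lexicographically, and zero-padded 'yyyy-MM-dd' strings sort in the same order as the dates they represent. So a cast may not be needed at all. A minimal sketch, assuming every value in last_referred_click_date is zero-padded (for example '2012-03-05', not '2012-3-5') and that the rows are read from the test table as in the answer above:

```sql
-- Total clicks and latest click date per test.
-- Assumption: last_referred_click_date is always a zero-padded
-- 'yyyy-MM-dd' string, so the lexicographic maximum is the latest date.
SELECT s.test_id,
       s.test_name,
       s.description,
       MAX(s.last_referred_click_date) AS last_click_date,
       SUM(s.clicks) AS total_clicks
FROM test s
GROUP BY s.test_id, s.test_name, s.description
ORDER BY total_clicks DESC;
```

If the strings are not consistently padded or mix formats, parsing them first (for example with CAST(... AS DATE) on Hive versions that have the DATE type) is likely the safer route.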