Original question: http://stackoverflow.com/questions/16181684/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Combine many tables in Hive using UNION ALL?
Asked by baha-kev
I'm trying to append one variable from several tables (i.e. row-bind or concatenate them) to make one longer table with a single column in Hive. Based on this question (HiveQL UNION ALL), I think this is possible with UNION ALL, but I'm not sure what an efficient way to do it would be.
The pseudocode would look something like this:
CREATE TABLE tmp_combined AS
SELECT b.var1 FROM tmp_table1 b
UNION ALL
SELECT c.var1 FROM tmp_table2 c
UNION ALL
SELECT d.var1 FROM tmp_table3 d
UNION ALL
SELECT e.var1 FROM tmp_table4 e
UNION ALL
SELECT f.var1 FROM tmp_table5 f
UNION ALL
SELECT g.var1 FROM tmp_table6 g
UNION ALL
SELECT h.var1 FROM tmp_table7 h;
Any help is appreciated!
Answered by Marimuthu Kandasamy
Try the following:
-- Hive has no SELECT ... INTO; CREATE TABLE ... AS SELECT is used here instead
CREATE TABLE tmp_combined AS
SELECT * FROM
(
SELECT b.var1 FROM tmp_table1 b
UNION ALL
SELECT c.var1 FROM tmp_table2 c
UNION ALL
SELECT d.var1 FROM tmp_table3 d
UNION ALL
SELECT e.var1 FROM tmp_table4 e
UNION ALL
SELECT f.var1 FROM tmp_table5 f
UNION ALL
SELECT g.var1 FROM tmp_table6 g
UNION ALL
SELECT h.var1 FROM tmp_table7 h
) CombinedTable;
Run it together with this setting: set hive.exec.parallel=true;
This lets Hive execute the independent stages of the query (here, the separate SELECTs) at the same time; otherwise they run one after another.
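As a rough sketch of how the setting and the statement fit together in one Hive session, the snippet below uses only the first three tables from the question to stay short. The second setting, hive.exec.parallel.thread.number, is not mentioned in the answer; it is a related Hive property that caps how many stages may run at once, and the value 8 is only an illustrative choice to tune for your cluster.

```sql
-- Allow independent stages of one query to run concurrently
SET hive.exec.parallel=true;
-- Assumption: cap on concurrently running stages (illustrative value)
SET hive.exec.parallel.thread.number=8;

-- Row-bind var1 from several tables into one single-column table
CREATE TABLE tmp_combined AS
SELECT u.var1
FROM (
  SELECT b.var1 FROM tmp_table1 b
  UNION ALL
  SELECT c.var1 FROM tmp_table2 c
  UNION ALL
  SELECT d.var1 FROM tmp_table3 d
) u;
```

Both SET statements apply only to the current session, so they need to be issued again (or placed in the script) each time the query is run.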
Answered by Haoyan
I would say that's a straightforward and efficient way to do the row-bind; at least, it's what I would use in my own code. By the way, running your pseudocode exactly as written might give you a syntax error. You could try wrapping the unions in a subquery:
create table join_table as
select * from
(select ...
union all
select ...
union all
select ...) tmp;
Answered by ChikuMiku
I applied the same idea to two different tables, employee and location, which I believe might help you:
Data:

Table employee (alias e)
empid  empname
13     Josan
8      Alex
3      Ram
17     Babu
25     John

Table location (alias l)
empid  emplocation
13     San Jose
8      Los Angeles
3      Pune,IN
17     Chennai,IN
39     Banglore,IN
hive> SELECT e.empid AS a ,e.empname AS b FROM employee e
UNION ALL
SELECT l.empid AS a,l.emplocation AS b FROM location l;
Output with aliases a and b:
13 San Jose
8 Los Angeles
3 Pune,IN
17 Chennai,IN
39 Banglore,IN
13 Josan
8 Alex
3 Ram
17 Babu
25 John
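In the output above, column b holds both names and locations, because UNION ALL matches columns by position and simply stacks the rows. If it matters which table a row came from, one option is to add a constant label column to each SELECT. The following is a minimal sketch under that assumption; the column name src and its literal values are hypothetical and not part of the original answer.

```sql
-- Same union as above, with a hypothetical src column tagging each row's origin
SELECT u.a, u.b, u.src
FROM (
  SELECT e.empid AS a, e.empname     AS b, 'employee' AS src FROM employee e
  UNION ALL
  SELECT l.empid AS a, l.emplocation AS b, 'location' AS src FROM location l
) u;
```

Each row then carries either 'employee' or 'location' in src, so rows can be filtered or grouped by origin afterwards. The order of rows in the result is not guaranteed unless an ORDER BY is added.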