Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/16980773/

Time: 2020-09-08 08:58:49  Source: igfitidea

Insert data into hive table

Tags: database, hadoop, hive

Asked by Apaachee

Using a Cygwin distribution, I've installed Hadoop 0.20.3 and Hive 0.11.0.

First of all, I don't understand how to use the Hive CLI:

hive> show tables;

Then I press Enter and nothing happens. I can, however, execute queries using hive -e or hive -f.

Then, I've created a table:

CREATE TABLE tweet_table (
  tweet STRING
)
COMMENT 'Table of string';

But how can I insert data into this table? I have seen some INSERT INTO examples, but when I try:

INSERT INTO TABLE tweet_table (tweet) VALUES ("data")

I get an error:

FAILED: ParseException line 1:30 cannot recognize input near '(' 'tweet' ')' in select clause

How can I append data in my table?

Answered by Balaswamy Vaddeman

You can insert new data into a table in two ways.

  1. Load the data of a file into the table using the LOAD DATA command.

    LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename;
    
  2. Insert new data into the table using a SELECT query.

    INSERT INTO TABLE tablename1 SELECT columnlist FROM secondtable;
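
As a minimal sketch of both approaches applied to the tweet_table from the question: the file path /tmp/tweets.txt and the table staging_tweets with its column text are hypothetical names chosen for illustration, not part of the original post. With the default text storage format, each line of the file becomes one row of the single-column table.

```sql
-- Approach 1: load a local file into the table.
-- Assumption: /tmp/tweets.txt is a hypothetical local file holding one tweet per line.
LOAD DATA LOCAL INPATH '/tmp/tweets.txt' INTO TABLE tweet_table;

-- Approach 2: copy rows from another table.
-- Assumption: staging_tweets is a hypothetical existing table with a STRING column named text.
INSERT INTO TABLE tweet_table SELECT text FROM staging_tweets;

-- Check what was loaded.
SELECT * FROM tweet_table LIMIT 10;
```

Without the LOCAL keyword, LOAD DATA reads the path from the Hadoop file system rather than the local disk, and adding OVERWRITE replaces the table's existing contents instead of appending to them.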
    
Answered by Animesh Raj Jha

If you already have a table pre_loaded_tbl with some data, you can use a trick to load data into your table with the following query:

INSERT INTO TABLE tweet_table 
  SELECT  "my_data" AS my_column 
    FROM   pre_loaded_tbl 
   LIMIT   5;

Note that "my_data" is independent of any data in pre_loaded_tbl. You can select any value and give it any column alias (here my_data and my_column); Hive does not require the alias to match the target column name. However, the structure of the SELECT statement (number and types of columns) must match that of your tweet_table. LIMIT caps how many rows are inserted into tweet_table, up to the number of rows in pre_loaded_tbl.
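
To make the role of LIMIT concrete, here is a small sketch that inserts the literal from the question as a single row. It assumes only that pre_loaded_tbl contains at least one row; if the table is empty, nothing is inserted.

```sql
-- Insert exactly one row holding the constant 'data'.
-- Assumption: pre_loaded_tbl is any existing table with at least one row;
-- its contents are never read, it only drives the row count.
INSERT INTO TABLE tweet_table
SELECT 'data' AS tweet
FROM pre_loaded_tbl
LIMIT 1;
```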

However, if you have not created any table yet, you will have to load the data using a file copy or the LOAD DATA command from the answers above.

Answered by Puneet Sharma

Try using single quotes around the data values:

insert into table test_hive values ('1','puneet');

Answered by Dhrub Thakur

If the table is not partitioned, the code is:

INSERT INTO TABLE table_name SELECT col_a, col_b, col_c FROM another_table; -- another_table is the source table

-- Any clause such as LIMIT, GROUP BY, or ORDER BY can be added to the SELECT.

If the table is partitioned, the code is:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

INSERT INTO TABLE table_name PARTITION (partition_col1, partition_col2) SELECT col_a, col_b, col_c, partition_col1, partition_col2 FROM another_table; -- another_table is the source table

-- Any clause such as LIMIT, GROUP BY, or ORDER BY can be added to the SELECT.
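
The partitioned case can be sketched end to end as follows. The tables tweets_by_lang and raw_tweets and all of their column names are hypothetical, introduced only for illustration. The key point is that the dynamic partition columns come last in the SELECT list, in the same order as in the PARTITION clause.

```sql
-- Hypothetical target table, partitioned by two columns.
CREATE TABLE tweets_by_lang (tweet STRING)
PARTITIONED BY (country STRING, lang STRING);

-- Allow every partition column to be resolved dynamically.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Assumption: raw_tweets is a hypothetical source table with columns
-- text, tweet_country and tweet_lang.
-- Partition values are taken from the trailing SELECT columns, by position.
INSERT INTO TABLE tweets_by_lang PARTITION (country, lang)
SELECT text, tweet_country, tweet_lang
FROM raw_tweets;
```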

Answered by Imdadul Choudhury

Although there is an accepted answer, I want to add that as of Hive 0.14, record-level operations are allowed. The correct syntax and query would be:

INSERT INTO TABLE tweet_table VALUES ('data');
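
For comparison with the failing statement in the question, here is a short sketch of the VALUES forms. The tweet strings are placeholder data, and the version notes reflect my understanding of the Hive DML documentation: INSERT ... VALUES arrived in Hive 0.14, while an explicit column list such as (tweet) is only accepted from Hive 1.2.0. That would explain the ParseException on Hive 0.11.0.

```sql
-- Hive 0.14 and later: one or more rows in a single statement.
INSERT INTO TABLE tweet_table VALUES ('first tweet'), ('second tweet');

-- Hive 1.2.0 and later (assumption based on the Hive DML documentation):
-- an explicit column list is accepted.
INSERT INTO TABLE tweet_table (tweet) VALUES ('third tweet');
```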

Answered by kuldeep singh

I faced a similar problem, so I used Impala to insert the single row into my table, and the same query worked there. This should work.