Insert data from 2 Hive external tables into a new external table with an additional column

zmeyuzjn asked on 2021-05-29 in Hadoop
Answers (3) | Views (373)

I have two external Hive tables as shown below. I populated them with data from Oracle using Sqoop.

create external table transaction_usa
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_usa';

create external table transaction_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_canada';

Now I want to merge the data of the two tables above into a single external Hive table that has all the same columns, plus one additional column, source_table, identifying which table each row came from. The new external table looks like this:

create external table transaction_usa_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int,
source_table string
)
row format delimited
stored as textfile
location '/user/gds/bank_ds/tran_usa_canada';

How can I do that?

00jrzges1#

You can SELECT from each table, apply UNION ALL to the two results, and finally insert the combined result into the third table.
The final Hive query would be:

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_usa' AS source_table FROM transaction_usa
UNION ALL
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_canada' AS source_table FROM transaction_canada;

Hope this helps!
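As a quick sanity check after the insert, you can count rows per source (a sketch; table and column names are taken from the question's DDL):

```sql
-- Sketch: verify the merge by counting rows per source table
SELECT source_table, COUNT(*) AS row_cnt
FROM transaction_usa_canada
GROUP BY source_table;
```

The two counts should match the row counts of transaction_usa and transaction_canada respectively.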

vmjh9lq92#

You can use Hive's INSERT INTO clause:

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, ...'transaction_usa' FROM transaction_usa;

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, ...'transaction_canada' FROM transaction_canada;
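For completeness, here is a sketch of the first statement with every column from the question's DDL spelled out (column order must match the target table, with the literal supplying the extra source_table column):

```sql
-- Sketch: first INSERT with the full column list from the question's DDL
INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
       tran_state, tran_city, speendby, tran_zip,
       'transaction_usa' AS source_table
FROM transaction_usa;
```

The second statement is identical except for the source table and the literal 'transaction_canada'.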
sc4hvdpw3#

You can definitely do it with manual partitioning as well.

CREATE TABLE transaction_new_table (
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
PARTITIONED BY (sourcetablename String)

Then run the command below:

load data inpath 'hdfspath' into table transaction_new_table   partition(sourcetablename='1')
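Since the data is already in the two Hive tables, an INSERT per static partition also works instead of LOAD DATA, and using the source table name as the partition value keeps the data self-describing. A sketch, assuming the transaction_new_table DDL above:

```sql
-- Sketch: populate one static partition per source table
INSERT INTO TABLE transaction_new_table PARTITION (sourcetablename = 'transaction_usa')
SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
       tran_state, tran_city, speendby, tran_zip
FROM transaction_usa;

INSERT INTO TABLE transaction_new_table PARTITION (sourcetablename = 'transaction_canada')
SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
       tran_state, tran_city, speendby, tran_zip
FROM transaction_canada;
```

Queries can then filter on sourcetablename, and partition pruning will scan only the matching data directory.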
