我有两个外部Hive表如下。我使用sqoop从oracle填充了其中的数据。
create external table transaction_usa
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_usa';
create external table transaction_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_canada';
现在我想合并上面两个表数据,就像它在一个外部配置单元表中一样,所有字段都与上面两个表相同,但有一个额外的列来标识哪个数据来自哪个表。具有附加列的新外部表 source_table
. 新的外部表如下所示。
create external table transaction_usa_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int,
source_table string
)
row format delimited
stored as textfile
location '/user/gds/bank_ds/tran_usa_canada';
我怎么做。?
3条答案
按热度按时间00jrzges1#
你知道吗
SELECT
从每张table上UNION ALL
对这些结果执行操作,最后将结果插入到第三个表中。下面是最终的配置单元查询:
希望这对你有帮助!!!
vmjh9lq92#
您可以使用hive的insert into子句
sc4hvdpw3#
你完全可以这样做
manual partitioning
也。然后在命令下面运行,