我现在有一个类似这样的表:
表1
ID|NAME|INFO_CODE_1|INFO_CODE_2|INFO_CODE_3|INFO_CODE_4|INFO_TEXT|DESCRIPTION
1 |Test|123 |254 |556 |867 |Test Text|Test Description
我试图将表1中的数据插入表2中,这样第一个表中的每条记录都会变成第二个表中的四条独立记录,如下所示:
表2
ID|NAME|INFO_CODE|INFO_TEXT|DESCRIPTION
1 |Test|123 |Test Text|Test Description
1 |Test|254 |Test Text|Test Description
1 |Test|556 |Test Text|Test Description
1 |Test|867 |Test Text|Test Description
表\u 1中有成千上万条记录,还有其他列的数据我们不关心,而info\u code \u 1到info\u code \u 4中的值都因记录而异(甚至可能为空)。我在考虑写这样的东西:
INSERT OVERWRITE TABLE example.TABLE_2
SELECT
A.ID AS ID,
A.NAME AS NAME,
A.INFO_CODE_1 AS INFO_CODE,
A.INFO_TEXT AS INFO_TEXT,
A.DESCRIPTION AS DESCRIPTION
FROM
example.TABLE_1 A
UNION ALL
SELECT
A.ID AS ID,
A.NAME AS NAME,
A.INFO_CODE_2 AS INFO_CODE,
A.INFO_TEXT AS INFO_TEXT,
A.DESCRIPTION AS DESCRIPTION
FROM
example.TABLE_1 A
UNION ALL
SELECT
A.ID AS ID,
A.NAME AS NAME,
A.INFO_CODE_3 AS INFO_CODE,
A.INFO_TEXT AS INFO_TEXT,
A.DESCRIPTION AS DESCRIPTION
FROM
example.TABLE_1 A
UNION ALL
SELECT
A.ID AS ID,
A.NAME AS NAME,
A.INFO_CODE_4 AS INFO_CODE,
A.INFO_TEXT AS INFO_TEXT,
A.DESCRIPTION AS DESCRIPTION
FROM
example.TABLE_1 A;
这样行吗?
如果是的话,有没有更好的方式写声明?
有没有办法确保具有相同id的所有记录都被依次插入?
提前感谢你们能提供的任何帮助。
1条答案
按热度按时间vsdwdz231#
另一个没有太多工会的办法是这样。它将需要1个mapreduce作业,而不是几个节省资源的作业。您可以将代码合并、拆分以生成数组并分解数组。