如何在配置单元中将多个列值转换为行?

hgb9j2n6  于 2021-05-27  发布在  Hadoop
关注(0)|答案(2)|浏览(348)
Input:

ID    COLUMN1     COLUMN2             COLUMN3
1     M,S,E,T     1,2,3,4             5,6,7
2     A,B,C       6,5,8,7,9,1         2,4,3,0,1

Output:

ID      COLUMN1      COLUMN2      COLUMN3
1        M           10           50
1        S           20           60
1        E           30           70
1        T           40           NULL
2        A           6            2
2        B           5            4
2        C           8            3 
2        NULL        7            0
2        NULL        9            1
2        NULL        1            NULL

代码:

select ID,
array_index( COLUMN1_arr, n ) as COLUMN1,
array_index( COLUMN2_arr, n ) as COLUMN2
from sample
lateral view numeric_range(size(COLUMN1_arr)) n1 as n;

错误:
失败:语义异常[错误10011]:无效的函数数组\u索引
在这里,我在一列中有多个值,我需要将它转换为前面提到的行输出。

6kkfgxo0

6kkfgxo01#

explode是hive中提供的一个udtf,您可以使用它将数据从列拆分为行。

SELECT ID1, col1,col2,col3
FROM tableName 
lateral view explode(split(COLUMN1,',')) cols1 AS col1
lateral view explode(split(COLUMN2,',')) cols2 AS col2
lateral view explode(split(COLUMN3,',')) cols3 AS col3
kqlmhetl

kqlmhetl2#

纯香草Hive解决方案,不含砖瓦自定义项。
演示:

with 

input as ( ---------------Input dataset
select stack(2,
1, array('M','S','E','T'), array(1,2,3,4),     array(5,6,7),
2, array('A','B','C'),     array(6,5,8,7,9,1), array( 2,4,3,0,1) 
) as (ID,COLUMN1,COLUMN2,COLUMN3)
),

--explode each array and FULL join them
c1 as (
select i.id, v.column1, p
  from input i
       lateral view posexplode(i.COLUMN1) v as p,column1
),

c2 as (
select i.id, v.column2, p
  from input i
       lateral view posexplode(i.COLUMN2) v as p,column2
),

c3 as (
select i.id, v.column3, p
  from input i
       lateral view posexplode(i.COLUMN3) v as p,column3
)

--FULL JOIN
select coalesce(c1.id,c2.id,c3.id) id, c1.column1, c2.column2, c3.column3
  from c1 
       full join c2 on c1.id=c2.id and c1.p=c2.p
       full join c3 on nvl(c1.id,c2.id)=c3.id and nvl(c1.p,c2.p)=c3.p --note NVL usage
;

结果:

OK
id      column1 column2 column3
1       M       1       5
1       S       2       6
1       E       3       7
1       T       4       NULL
2       A       6       2
2       B       5       4
2       C       8       3
2       NULL    7       0
2       NULL    9       1
2       NULL    1       NULL

相关问题