hive将多个Map值合并到一列中

ego6inou  于 2021-06-25  发布在  Hive
关注(0)|答案(2)|浏览(906)

表(输入数据)

+--------+---------+
| col_id | col_val |
+--------+---------+
| axc    | col_x   |
| bdf    | col_f   |
| cde    | col_x   |
| yhc    | col_f   |
| idx    | col_a   |
| dft    | col_b   |
+--------+---------+

转换逻辑。基本上是想把列a或列b的值组合成一列作为列ab。尝试如下。但是遇到了一个错误。请帮忙

SELECT a.col_id,IF(array_contains(collect_list(a.map_values['col_x']),'1'),'Y','N') AS col_x,
IF(array_contains(collect_list(a.map_values['col_y']),'1'),'Y','N') AS col_y,
IF(array_contains(collect_list(a.map_values['col_a']),'1'),'Y','N') OR IF(array_contains(collect_list(a.map_values['col_b']),'1'),'Y','N') AS col_ab,
IF(array_contains(collect_list(a.map_values['col_f']),'1'),'Y','N') AS col_f 
 FROM (
       SELECT col_id,
              col_val,
              map(col_val, '1') map_values
       FROM   transpose) a GROUP BY a.col_id;

输出数据

+--------+-------+-------+-----------+-------+
| col_id | col_x | col_y | col_ab    | col_f |
+--------+-------+-------+-----------+-------+
| axc    | Y     | N     | N         | N     |
| bdf    | N     | N     | N         | Y     |
| cde    | Y     | N     | N         | N     |
| dft    | N     | Y     | Y         | N     |
| idx    | N     | N     | Y         | N     |
| yhc    | N     | N     | N         | Y     |
+--------+-------+-------+-----------+-------+
o2gm4chl

o2gm4chl1#

SELECT col_id, 
       IF(map_values['col_x'],'Y','N') as col_x,
       IF(map_values['col_y'],'Y','N') as col_y,
       IF((map_values['col_a'] || map_values['col_b']),'Y','N') as col_ab, 
       IF(map_values['col_f'],'Y','N') as col_f
FROM
(SELECT col_id,
       to_map(col_val, true) map_values
FROM   transpose 
GROUP BY col_id)a
nlejzf6q

nlejzf6q2#

带有聚合的用例语句:

select  col_id, 
        max( case col_val = 'col_x' then 'Y' alse 'N' end )               as col_x, 
        max( case col_val = 'col_y' then 'Y' alse 'N' end )               as col_y,
        max( case col_val in ( 'col_a', 'col_b')  then 'Y' alse 'N' end ) as col_ab,
        max( case col_val = 'col_f' then 'Y' alse 'N' end )               as col_f
from your_table
 group by col_id;

相关问题