hive:在查询中将array< string>转换为array< int>

pdsfdshx  于 2021-05-29  发布在  Hadoop
关注(0)|答案(3)|浏览(4824)

我有两张table:

create table a (
`1` array<string>);

create table b (
`1` array<int>);

我想把表a放在表b中(表b是空的):

insert into table b
select * from a;

执行此操作时,出现以下错误:

FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into
target table because column number/types are different 'b': Cannot
convert column 0 from array<string> to array<int>.

然而,如果字段仅为类型,则不会出现此错误 string 以及 int .
有没有一种方法可以对数组进行强制转换?

yrdbyhpb

yrdbyhpb1#

有没有一种方法可以对数组进行强制转换?
不容易。如果知道数组的大小,可以手动强制转换数组,但如果不知道,则可能需要使用结构。看看我对这个问题的回答。
另外:我不能对另一个答案进行向下投票,但对于包含多个数组的嵌套选择则失败。
它不强制转换数组元素并重建原始数组,而是强制转换所有元素,然后将其合并到单个数组中。例子:

hive> select id, my_array from array_table limit 3;
OK
10023307    ["0.20296966","0.17753501","-0.03543373"]
100308007   ["0.16155224","0.1945944","0.09167781"]
100384207   ["0.025892768","0.023214806","-0.003712816"]

hive> select
    >     collect_list(cast(array_element as double)) int_array
    > from (
    >     select
    >         explode(my_array) array_element
    >     from (
    >         select
    >             my_array
    >         from array_table limit 3
    >     ) X
    > ) s;
OK
[0.20296966,0.17753501,-0.03543373,0.16155224,0.1945944,0.09167781,0.025892768,0.023214806,-0.003712816]
t1rydlwq

t1rydlwq2#

使用重新组装阵列 explode() 以及 collect_list() .
初始字符串数组示例:

hive> select array('1','2','3') string_array;
OK
string_array
["1","2","3"]
Time taken: 1.109 seconds, Fetched: 1 row(s)

转换数组:

hive> select collect_list(cast(array_element as int)) int_array --cast and collect array
       from( select explode(string_array) array_element         --explode array
               from (select array('1','2','3') string_array     --initial array
                    )s 
           )s;

结果:

OK
int_array
[1,2,3]
Time taken: 44.668 seconds, Fetched: 1 row(s)

如果您想在insert+select查询中添加更多的列,那么使用 lateral view [outer] :

select col1, col2, collect_list(cast(array_element as int)) int_array
 from
(
select col1, col2 , array_element         
  from table
       lateral view outer explode(string_array) a as array_element         
)s
group by col1, col2
;
muk1a3rh

muk1a3rh3#

brickhouse震击器比铸造它们并收集它们作为一个列表要快得多。将此jar添加到hdfs位置。
使用下面的链接下载brick house jar

add jar hdfs://hadoop-/pathtojar/brickhouse-0.7.1.jar;   
create temporary function cast_array as 'brickhouse.udf.collect.CastArrayUDF';   
select CAST(columns, 'int') AS columname from table;  
select CAST(columns, 'string') AS columname from table

相关问题