How to convert a string into a complex array of structs and explode it in Hive

Posted by 5m1hhzi4 on 2021-05-31 in Hadoop

I have the following table in Hive:

id     string 
code   string
config string

Values:

dummyID|codeA|[{"pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"},{"pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"},{"pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[],"mapb":"c88"},{"pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"},{"pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}]

I need to explode the array into the output below (any element within the struct may be optional):

dummyID|codeA|{"pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"}
dummyID|codeA|{"pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"}
dummyID|codeA|{"pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[{"qtgm":"tt1","swrt":"rr2"}],"mapb":"c88"}
dummyID|codeA|{"pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"}
dummyID|codeA|{"pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}

I tried:

select 
id,
code,
exp_val   
FROM   temp 
LATERAL VIEW explode(array(config)) temp AS exp_val ;

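The query above raises no error, but it does not explode anything and returns a single row. LATERAL VIEW inline does not work either.
I also tried creating a table with the schema below and inserting records from the string config field above, but that failed with a data type mismatch error: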

id    string,
code  string,
config  array<struct<pmc:String,scc:String,pgtp:string,pgn:string,pgrc:string,pscc:Array<String>,sen:Array<String>,mapb:Array<String>>>

When I ran a select query on config, I got the result below:

|dummyID|codeA|{"pmc":"[{\"pmc\":\"111\",\"scc\":\"aa1\",\"pgtp\":\"a22\",\"pgn\":\"a33\",\"pgrc\":\"a44\"},{\"pmc\":\"222\",\"scc\":\"bb1\",\"pgtp\":\"b22\",\"pgn\":\"b33\",\"pgrc\":\"b44\",\"sen\":\"b77\"},{\"pmc\":\"333\",\"scc\":\"cc1\",\"pgtp\":\"c22\",\"pgn\":\"c33\",\"pgrc\":\"c44\",\"pscc\":[],\"mapb\":\"c88\"},{\"pmc\":\"444\",\"scc\":\"dd1\",\"pgtp\":\"d22\",\"pgn\":\"d33\",\"pgrc\":\"d44\",\"pscc\":[\"ghgh\"],\"mapb\":\"d88\"},{\"pmc\":\"555\",\"scc\":\"ee1\",\"pgtp\":\"e22\",\"pgn\":\"e33\",\"pgrc\":\"e44\",\"mapb\":\"e88\"}]","scc":null,"pgtp":null,"pgn":null,"pgrc":null,"pscc":null,"sen":null,"mapb":null}

explode does not work on this dataset either.
Am I missing something?

Answer #1 (lxkprmvk)

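Remove array from the explode call and try the following (this requires config to be an array column, since explode accepts only arrays and maps):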

select 
 id,
 code,
 exp_val   
FROM temp 
LATERAL VIEW explode(config) temp AS exp_val ;

Second option:

select 
 t.id,
 t.code,
 e.*   
FROM temp t
LATERAL VIEW outer inline(t.config) e ;
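
Both queries above need config to be an array (inline specifically needs an array of structs). For the STRING column described in the question, a rough sketch of getting similar per-field columns is to split the string into one JSON object per row and then read the keys with json_tuple. The aliases objs, e, obj and j are made up for illustration, and the split pattern assumes that the sequence "},{" only ever occurs between top-level objects. pscc is left out here because it is a nested array.

```sql
-- Sketch only: assumes temp.config is a STRING holding a JSON array of flat objects.
WITH objs AS (
  SELECT t.id, t.code, e.obj
  FROM temp t
  LATERAL VIEW explode(
    split(
      regexp_replace(t.config, '^\\[|\\]$', ''),  -- strip the outer [ and ]
      '(?<=\\}),(?=\\{)'                          -- split on commas between } and {
    )
  ) e AS obj
)
SELECT o.id, o.code, j.pmc, j.scc, j.pgtp, j.pgn, j.pgrc, j.sen, j.mapb
FROM objs o
LATERAL VIEW json_tuple(o.obj, 'pmc', 'scc', 'pgtp', 'pgn', 'pgrc', 'sen', 'mapb') j
  AS pmc, scc, pgtp, pgn, pgrc, sen, mapb;
```

Keys that are absent from an object, such as sen or mapb in the first element, come back as NULL, which fits the requirement that every struct field is optional.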

Answer #2 (ryhaxcpt)

You can try this approach:
First, SPLIT the string to turn it into an array of strings.
Second, explode the array.
For example:

WITH table AS(
select 'dummyID' AS id,'codeA' AS code, SPLIT('[{"pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"},{"pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"},{"pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[],"mapb":"c88"},{"pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"},{"pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}]','\\{') AS array)
SELECT id,code, exp_val
FROM   table
LATERAL VIEW explode(array) table AS exp_val;

Output (note that the split consumes each opening brace and leaves a leading "[" row):

+----------+--------+----------------------------------------------------+--+
|    id    |  code  |                      exp_val                       |
+----------+--------+----------------------------------------------------+--+
| dummyID  | codeA  | [                                                  |
| dummyID  | codeA  | "pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"}, |
| dummyID  | codeA  | "pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"}, |
| dummyID  | codeA  | "pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[],"mapb":"c88"}, |
| dummyID  | codeA  | "pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"}, |
| dummyID  | codeA  | "pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}] |
+----------+--------+----------------------------------------------------+--+
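
The rows above are not complete JSON objects, because splitting on '\\{' throws away the delimiter. One possible variation, offered as a sketch and not as the answerer's method, is to strip the outer brackets first and then split only on the commas that sit between a closing and an opening brace, using regex lookarounds so that the braces are kept. It assumes the asker's table temp with config stored as a STRING; the alias e is illustrative.

```sql
-- Sketch only: assumes temp.config is a STRING holding a JSON array of objects.
SELECT t.id,
       t.code,
       e.exp_val
FROM temp t
LATERAL VIEW explode(
  split(
    regexp_replace(t.config, '^\\s*\\[|\\]\\s*$', ''),  -- drop the outer [ and ]
    '(?<=\\})\\s*,\\s*(?=\\{)'                          -- split on "," only between } and {
  )
) e AS exp_val;
```

With the sample row this should give one row per object, each still wrapped in { and }. The pattern is a heuristic: if a nested array such as pscc ever holds several objects side by side, their "},{" boundaries would be split too, so a JSON SerDe or a UDF that parses JSON properly would be the safer choice for such data.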
