Hive JSON data parsing

Posted by 7eumitmz on 2021-05-29 in Hadoop

My JSON data is stored in the table json_table, in the column json_col, and looks like this:

{
    "href": "example.com",
    "Hosts": {
        "cluster_name": "test",
        "host_name": "test.iabc.com"
    },
    "metrics": {
        "cpu": {
            "cpu_user": [
                [
                    0.7,
                    1499795941
                ],
                [
                    0.3,
                    1499795951
                ]
            ]
        }
    }
}

I want to load this data into a table in the following format:

+-------------+-------+------------+
| metric_type | value | timestamp  |
+-------------+-------+------------+
| cpu_user    | 0.7   | 1499795941 |
+-------------+-------+------------+
| cpu_user    | 0.3   | 1499795951 |
+-------------+-------+------------+

I tried using get_json_object to fetch the values:

select get_json_object(json_col,'$.metrics.cpu.cpu_user[1]') from json_table

which gives me:

[0.3,1499795951]

How can I use the explode function from here to get the desired output?

hadoop Hive JSON

Source: https://stackoverflow.com/questions/45067080/hive-json-data-parsing

2 answers

nwo49xxi #1

Instead of using a UDF, you can also implement the SerDe and InputFormat interfaces for your JSON data.
Here are some references:
http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
https://github.com/xjtuzxh/inceptor-inputformat
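
As a rough illustration of the SerDe route, the sketch below swaps in the JSON SerDe that ships with Hive's HCatalog module (org.apache.hive.hcatalog.data.JsonSerDe) rather than a hand-written SerDe or InputFormat, so it is not what the references above implement. The table name json_serde_table and the location path are hypothetical. It assumes each JSON document sits on a single line of a text file, that keys not declared in the schema are skipped, and that the hive-hcatalog-core jar is on the classpath (older installations may need ADD JAR first).

```sql
-- Hypothetical table: only the metrics branch of the JSON is declared.
-- Assumption: one JSON document per line in the files under LOCATION.
create external table json_serde_table (
    metrics struct<cpu:struct<cpu_user:array<array<double>>>>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile
location '/tmp/json_serde_table';   -- hypothetical path

-- cpu_user is now a real nested array, so explode() works on it directly.
-- Each exploded row is one [value, timestamp] pair.
select  'cpu_user'                 as metric_type
       ,p.pair[0]                  as val
       ,cast(p.pair[1] as bigint)  as ts   -- pairs are read as doubles, so cast back
from    json_serde_table t
        lateral view explode(t.metrics.cpu.cpu_user) p as pair
;
```

With this layout no string manipulation is needed, at the cost of declaring the nested schema up front.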

2021-05-29

bhmjp9jg #2

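-- explode() without a column alias exposes its output column as col
select  'cpu_user'      as metric_type
       ,val_ts[0]       as val    -- first element of each pair: the metric value
       ,val_ts[1]       as ts     -- second element: the epoch timestamp
from   (select  split(m.col,',') as val_ts    -- split each pair on the comma
        from    json_table j
                -- [*] returns the whole array as one string; strip the outer [[ and ]],
                -- then split on ],[ so that each pair becomes one row
                lateral view explode(split(regexp_replace(get_json_object(json_col,'$.metrics.cpu.cpu_user[*]'),'^\\[\\[|\\]\\]$',''),'\\],\\[')) m
        ) m
;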

+-------------+-----+------------+
| metric_type | val |     ts     |
+-------------+-----+------------+
| cpu_user    | 0.7 | 1499795941 |
| cpu_user    | 0.3 | 1499795951 |
+-------------+-----+------------+
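
To make the string handling in this answer easier to follow, here is a sketch that runs the two intermediate steps on a literal and then repeats the full query with explicit casts. The literal stands in for what get_json_object is assumed to return for '$.metrics.cpu.cpu_user[*]'; the aliases p and pair and the choice of double and bigint are illustrative and not part of the original answer. The first two statements assume a Hive version that accepts SELECT without a FROM clause.

```sql
-- Step 1: strip the leading [[ and the trailing ]] from the array string.
select regexp_replace('[[0.7,1499795941],[0.3,1499795951]]', '^\\[\\[|\\]\\]$', '');
-- 0.7,1499795941],[0.3,1499795951

-- Step 2: split on ],[ so that each pair becomes one array element.
select split('0.7,1499795941],[0.3,1499795951', '\\],\\[');
-- ["0.7,1499795941","0.3,1499795951"]

-- Step 3: explode the pairs into rows and cast the pieces to numeric types,
-- since split() yields strings.
select  'cpu_user'                             as metric_type
       ,cast(split(p.pair, ',')[0] as double)  as val
       ,cast(split(p.pair, ',')[1] as bigint)  as ts
from    json_table j
        lateral view explode(
            split(
                regexp_replace(
                    get_json_object(j.json_col, '$.metrics.cpu.cpu_user[*]'),
                    '^\\[\\[|\\]\\]$', ''),
                '\\],\\[')
        ) p as pair
;
```

Naming the exploded column explicitly (p as pair) avoids relying on the default column name, and the casts let downstream queries compare or aggregate the values numerically.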

2021-05-29
