为嵌套的json数据创建配置单元表

iezvtpos  于 2021-05-29  发布在  Hadoop
关注(0)|答案(2)|浏览(348)

我无法将嵌套的json数据加载到配置单元表中。有人能帮我吗?下面是我尝试过的:
样本输入:

  1. {"DocId":"ABC","User1":{"Id":1234,"Username":"sam1234","Name":"Sam","ShippingAddress":{"Address1":"123 Main St.","Address2":null,"City":"Durham","State":"NC"},"Orders":[{"ItemId":6789,"OrderDate":"11/11/2012"},{"ItemId":4352,"OrderDate":"12/12/2012"}]}}

在Hive(cdh3)上:

  1. ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;
  2. CREATE TABLE json_tab(
  3. DocId string,
  4. user1 struct<Id: int, Username: string, Name:string,ShippingAddress:struct<address1:string,address2:string,city:string,state:string>,orders:array<struct<ItemId:int,orderdate:string>>>
  5. )
  6. ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
  7. STORED AS TEXTFILE;
  8. hive> select * from json_tab;
  9. OK
  10. NULL null

我要走了 NULL 他在这儿。
还尝试了hcatalog jar:

  1. ADD JAR /home/training/Desktop/hcatalog-core-0.11.0.jar;
  2. CREATE TABLE json_tab(
  3. DocId string,
  4. user1 struct<Id: int, Username: string, Name:string,ShippingAddress:struct<address1:string,address2:string,city:string,state:string>,orders:array<struct<ItemId:int,orderdate:string>>>
  5. )
  6. ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';

但面对我的错误 create table 声明:
失败:元数据错误:无法验证serde:org.apache.hive.hcatalog.data.jsonserde失败:执行错误,从org.apache.hadoop.hive.ql.exec.ddltask返回代码1
有人能帮我吗?提前谢谢你的帮助。

irtuqstp

irtuqstp1#

您可以使用org.openx.data.jsonserde.jsonserde类来创建json数据
您可以从下载jar文件http://www.congiu.net/hive-json-serde/1.3.6-snapshot/cdh4/
并执行以下步骤

  1. add jar /path/to/jar/json-serde-1.3.6-jar-with-dependencies.jar;
  2. CREATE TABLE json_tab(
  3. DocId string,
  4. user1 struct<Id: int, Username: string, Name:string,ShippingAddress:struct<address1:string,address2:string,city:string,state:string>,orders:array<struct<ItemId:int,orderdate:string>>>
  5. )
  6. ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
  7. LOAD DATA LOCAL INPATH '/path/to/data/nested.json' INTO TABLE json_tab;
  8. SELECT DocId, User1.Id, User1.ShippingAddress.City as city,
  9. User1.Orders[0].ItemId as order0id,
  10. User1.Orders[1].ItemId as order1id from json_tab;
  11. result
  12. ABC 1234 Durham 6789 4352
展开查看全部
yrdbyhpb

yrdbyhpb2#

  1. I was getting same exception.

我加了下面的jar,它对我有用。

  1. ADD JAR /home/cloudera/Data/json-serde-1.3.7.3.jar;
  2. ADD JAR /home/cloudera/Data/hive-hcatalog-core-0.13.0.jar;

相关问题