The JSON data looks like this:
{"id":"U101", "name":"Rakesh", "place":{"city":"MUMBAI","state":"MAHARASHTRA"}, "age":20, "occupation":"STUDENT"}
{"id":"","name":"Rakesh", "place":{"city":"MUMBAI","state":"MAHARASHTRA"}, "age":20, "occupation":"STUDENT"}
{"id":"U103", "name":"Rakesh", "place":{"city":"","state":""}, "age":20, "occupation":"STUDENT"}
I am trying to select the data from the table:
hive (ecom)> select * from users_info_raw;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException:
org.codehaus.jackson.JsonParseException: Unexpected character ('2'
(code 50)): was expecting comma to separate OBJECT entries at
[Source: java.io.StringReader@15b0734; line: 1, column: 222]
Time taken: 0.144 seconds
The DDL used to create the table:
CREATE TABLE users_info_raw(
> id string,
> name string,
> place struct<city:string,state:string>,
> age INT,
> occupation string
> )
> ROW FORMAT SERDE
> 'com.cloudera.hive.serde.JSONSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
1 Answer
2exbekwf #1
I used the Hive HCatalog SerDe (org.apache.hive.hcatalog.data.JsonSerDe), and it handles your input data without problems.
![](https://i.stack.imgur.com/hmAsA.png)
CREATE TABLE info_raw(
  id string,
  name string,
  place struct<city:string,state:string>,
  age INT,
  occupation string
)
ROW FORMAT SERDE
  'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
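If the same JsonParseException still appears with this SerDe, the data file itself may be worth checking. The error points at column 222, while each sample record shown in the question is well under 222 characters, so the file Hive reads may not match the sample exactly (for example, two records sharing one line). With TextInputFormat, each line is expected to hold one complete JSON object. Below is a minimal sketch of a line-by-line check. The helper name find_bad_lines and the shortened sample records are hypothetical, and Python's json module is not the parser Hive uses, so treat the result as a hint rather than a verdict.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, message) for every line that is not one JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines are skipped here; Hive may treat them differently
        try:
            value = json.loads(text)
        except json.JSONDecodeError as err:
            # err.colno is the 1-based column where parsing stopped
            problems.append((number, f"{err.msg} at column {err.colno}"))
            continue
        if not isinstance(value, dict):
            problems.append((number, "line is valid JSON but not an object"))
    return problems


# Hypothetical sample: the second record is missing a comma before "age".
sample = [
    '{"id":"U101", "name":"Rakesh", "age":20}',
    '{"id":"U102", "name":"Rakesh" "age":20}',
]

for number, message in find_bad_lines(sample):
    print(f"line {number}: {message}")
```

To check a real file, pass an open file handle instead of the list, for example find_bad_lines(open(path, encoding="utf-8")), where path is a local copy of the table's data file.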