Hive support for Avro logicalType

xkftehaa posted on 2021-06-25 in Hive

I created an Avro file with TDCH. The schema extracted from the Avro file is shown below.
JAR files used to generate the Avro file: paranamer-2.3.jar, avro-1.9.2.jar, avro-mapred-1.9.2.jar

$ java -jar avro-tools-1.9.2.jar getschema part-m-00000.avro
20/03/06 11:36:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{
  "type" : "record",
  "name" : "AVrotdch",
  "namespace" : "AvroTDCH",
  "fields" : [ {
    "name" : "varchar_field",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "int_field",
    "type" : [ "null", "int" ],
    "default" : null
  }, {
    "name" : "bigint_field",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "decimal_field",
    "type" : [ "null", "double" ],
    "default" : null
  }, {
    "name" : "date_field",
    "type" : [ "null", {
      "type" : "int",
      "logicalType" : "date"
    } ],
    "default" : null
  } ]
}
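
To see at a glance which fields in the schema above carry a logical type, the schema JSON can be walked with a few lines of Python. This is only a minimal sketch: the helper name `logical_types` is made up for illustration, and the schema text is copied from the `getschema` output rather than read from the file.

```python
import json

# Schema copied from the getschema output above.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "AVrotdch",
  "namespace": "AvroTDCH",
  "fields": [
    {"name": "varchar_field", "type": ["null", "string"], "default": null},
    {"name": "int_field", "type": ["null", "int"], "default": null},
    {"name": "bigint_field", "type": ["null", "long"], "default": null},
    {"name": "decimal_field", "type": ["null", "double"], "default": null},
    {"name": "date_field",
     "type": ["null", {"type": "int", "logicalType": "date"}],
     "default": null}
  ]
}
"""


def logical_types(schema: dict) -> dict:
    """Map field name -> logicalType for fields that declare one (hypothetical helper)."""
    found = {}
    for field in schema["fields"]:
        field_type = field["type"]
        # A union is a JSON list of branches; normalise single types to a list.
        branches = field_type if isinstance(field_type, list) else [field_type]
        for branch in branches:
            if isinstance(branch, dict) and "logicalType" in branch:
                found[field["name"]] = branch["logicalType"]
    return found


print(logical_types(json.loads(SCHEMA_JSON)))
```

Only `date_field` declares a logical type; `decimal_field` is a plain `double`, not an Avro `decimal`.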

$ java -jar avro-tools-1.9.2.jar tojson part-m-00000.avro
20/03/06 12:00:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{"varchar_field":{"string":"A"},"int_field":{"int":1},"bigint_field":{"long":10},"decimal_field":{"double":11.12},"date_field":{"int":1328263200}}
{"varchar_field":{"string":"B"},"int_field":{"int":2},"bigint_field":{"long":11},"decimal_field":{"double":34.09},"date_field":{"int":1328090400}}
{"varchar_field":{"string":"C"},"int_field":{"int":3},"bigint_field":{"long":12},"decimal_field":{"double":33.9},"date_field":{"int":1328176800}}
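
One detail in the output above may be worth checking alongside the JAR versions. The Avro specification defines the `date` logical type as the number of days since 1970-01-01, while the `date_field` values here have the magnitude of Unix timestamps in seconds. The Python sketch below assumes the values really are epoch seconds (the post does not confirm this) and shows both readings; all helper names are illustrative.

```python
from datetime import date, datetime, timezone

# date_field values copied from the tojson output above.
RAW_VALUES = [1328263200, 1328090400, 1328176800]

EPOCH = date(1970, 1, 1)
SECONDS_PER_DAY = 86_400
# Day counts that still map to a year between 0001 and 9999.
MIN_DAYS = (date.min - EPOCH).days
MAX_DAYS = (date.max - EPOCH).days


def as_epoch_seconds(value: int) -> datetime:
    """Interpret the integer as seconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def to_avro_date(value: int) -> int:
    """Convert epoch seconds to the days-since-epoch count used by Avro `date`."""
    return value // SECONDS_PER_DAY


def fits_four_digit_year(days: int) -> bool:
    """True if a days-since-epoch value falls between 0001-01-01 and 9999-12-31."""
    return MIN_DAYS <= days <= MAX_DAYS


for raw in RAW_VALUES:
    print(
        raw,
        "as seconds:", as_epoch_seconds(raw).isoformat(),
        "| as Avro date days:", to_avro_date(raw),
        "| raw value usable as a day count:", fits_four_digit_year(raw),
    )
```

Under that assumption the first record decodes to 2012-02-03T10:00:00Z, which is day 15373, whereas the same integer read as a day count lies far beyond year 9999. Whether this is what triggers the exception reported in the post is not established here.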

I uploaded the Avro schema and the Avro file to HDFS and created a Hive external table on top of them:

CREATE EXTERNAL TABLE AvroTDCH 
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' 
LOCATION '/data/sample/avro/tdch_test' 
TBLPROPERTIES ('avro.schema.url'='hdfs://NN01/data/sample/avroSchema/AVrotdchcomplex.avsc');

But when I try to read data from the Hive table, it throws the following exception.

> select * from AvroTDCH ;
    java.lang.NumberFormatException: For input string: " 631"
            at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
            at java.lang.Integer.parseInt(Integer.java:569)
            at java.lang.Integer.parseInt(Integer.java:615)
            at java.sql.Date.valueOf(Date.java:133)
            at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:447)
            at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:423)
            at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:536)
            at org.apache.hive.beeline.Rows$Row.<init>(Rows.java:166)
            at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:53)
            at org.apache.hive.beeline.IncrementalRowsWithNormalization.<init>(IncrementalRowsWithNormalization.java:50)
            at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1820)
            at org.apache.hive.beeline.Commands.execute(Commands.java:878)
            at org.apache.hive.beeline.Commands.sql(Commands.java:730)
            at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1000)
            at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:835)
            at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:793)
            at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:493)
            at org.apache.hive.beeline.BeeLine.main(BeeLine.java:476)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
    Error: Unrecognized column type:DATE_TYPE (state=,code=0)

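I suspect that a JAR version mismatch may be the problem here. I also tried adding the 1.9.2 JARs in Beeline with the ADD JAR command, but got the same error.
Avro JAR in the HDP Hive client:
$ ls -lrt /usr/hdp/current/hive-client/lib/avro
-rw-r--r-- 1 root 400680 Apr 25 2019 /usr/hdp/current/hive-client/lib/avro-1.7.5.jar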
