I am trying to read a CSV file as input data and write the output in Avro format.
Note: the Pig version is Apache Pig version 0.12.1.2.1.5.0-695.
REGISTER /usr/lib/pig/lib/avro-1.7.4.jar;
REGISTER /usr/lib/pig/lib/piggybank.jar;
REGISTER /usr/lib/pig/lib/jackson-mapper-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/jackson-core-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/json-simple-1.1.1.jar;
A = LOAD '/data/raw/event';
store A into '/data/dev/raw/pig'
using org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check',
'schema', ' {
"name" : "EVENT",
"type" : "record",
"fields" : [ {
"name" : "evt",
"type" : [ "long", "null" ]
}, {
"name" : "mac",
"type" : [ "int", "null" ]
}, {
"name" : "sec",
"type" : [ "int", "null" ]
} ]
}');
I get the following exception:
ERROR 2997: Unable to recreate exception from backed error: Error: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException:
Unsupported type in record:class org.apache.pig.data.DataByteArray
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:749)
Caused by: java.lang.RuntimeException: Unsupported type in record:class org.apache.pig.data.DataByteArray
at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.getField(PigAvroDatumWriter.java:385)
at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.writeRecord(PigAvroDatumWriter.java:363)
Please let me know if I have missed anything, or if there is a workaround.
1 Answer
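By default, Pig loads every field as a bytearray (DataByteArray) when the LOAD statement declares no schema. AvroStorage cannot write a DataByteArray into the long and int fields of your Avro schema, which is what the "Unsupported type in record" error is reporting. So you have to load the data with a schema, declaring a type for each field in an AS clause on the LOAD statement.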
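A minimal sketch of what that could look like, reusing the paths, JARs and Avro schema from the question. The comma delimiter passed to PigStorage and the column order (evt, mac, sec) are assumptions based on the input being described as CSV; adjust them to match the actual file.

```pig
REGISTER /usr/lib/pig/lib/avro-1.7.4.jar;
REGISTER /usr/lib/pig/lib/piggybank.jar;
REGISTER /usr/lib/pig/lib/jackson-mapper-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/jackson-core-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/json-simple-1.1.1.jar;

-- Assumption: comma-delimited input with three columns in this order.
-- The AS clause gives each field a type, so the fields are no longer bytearray.
A = LOAD '/data/raw/event' USING PigStorage(',')
    AS (evt:long, mac:int, sec:int);

-- Same Avro schema as in the question, written on one line.
STORE A INTO '/data/dev/raw/pig'
USING org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check',
'schema', '{"name":"EVENT","type":"record","fields":[{"name":"evt","type":["long","null"]},{"name":"mac","type":["int","null"]},{"name":"sec","type":["int","null"]}]}');
```

The Pig types are chosen to line up with the Avro schema (long for evt, int for mac and sec). Values that cannot be cast to the declared type are loaded as null, which the ["long", "null"] and ["int", "null"] unions accept.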