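I am trying to test a hadoop.mapreduce Avro job with MRUnit. I am getting a NullPointerException, shown below. I have attached part of the pom and the source code. Any help would be appreciated.
Thanks
The error I get is: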
java.lang.NullPointerException
at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:73)
at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
at org.apache.hadoop.mrunit.MapDriverBase.setInputKey(MapDriverBase.java:64)
at org.apache.hadoop.mrunit.MapDriverBase.setInput(MapDriverBase.java:104)
at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:218)
at org.lab41.project.mapreduce.ParseMetadataAsTextIntoAvroTest.testMap(ParseMetadataAsTextIntoAvroTest.java:115)
.....
POM snippet:
<dependency>
<groupId>org.apache.mrunit</groupId>
<artifactId>mrunit</artifactId>
<version>0.9.0-incubating</version>
<classifier>hadoop2</classifier>
<scope>test</scope>
</dependency>
<avro.version>1.7.4</avro.version>
<hadoop.version>2.0.0-mr1-cdh4.1.3</hadoop.version>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>${avro.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>${avro.version}</version>
<classifier>hadoop2</classifier>
</dependency>
Here is an excerpt from the test:
import static org.junit.Assert.*;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.hadoop.io.AvroSerialization;
import org.apache.avro.mapred.AvroValue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import org.lab41.project.domain.DataRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class ParseMetadataAsTextIntoAvroTest {
    Logger logger = LoggerFactory
            .getLogger(ParseMetadataAsTextIntoAvroTest.class);
    private MapDriver<LongWritable, Text, AvroKey<Long>, AvroValue<DataRecord>> mapDriver;

    @BeforeClass
    public static void setUpClass() {
    }

    @AfterClass
    public static void tearDownClass() {
    }

    @Before
    public void setUp() throws IOException {
        ParseMetadataAsTextIntoAvroMapper mapper = new ParseMetadataAsTextIntoAvroMapper();
        mapDriver = new MapDriver<LongWritable, Text, AvroKey<Long>, AvroValue<DataRecord>>();
        mapDriver.setMapper(mapper);
        mapDriver.getConfiguration().setStrings("io.serializations", new String[]{
            AvroSerialization.class.getName()
        });
    }

    @Test
    public void testMap() throws ParseException, IOException {
        Text testInputText = new Text(test0);
        DataRecord record = new DataRecord();
        ….
        AvroKey<Long> expectedPivot = new AvroKey<Long>(1L);
        AvroValue<DataRecord> expectedRecord = new AvroValue<DataRecord>(record);
        mapDriver.withInput(new Pair<LongWritable, Text>(new LongWritable(1), testInputText));
        mapDriver.withOutput(new Pair<AvroKey<Long>, AvroValue<DataRecord>>(expectedPivot, expectedRecord));
        mapDriver.runTest();
    }
}
4 answers
4si2a6ki1#
You have to add AvroSerialization to the default serializations and also configure AvroSerialization.
af7jpaap2#
This is answered at https://issues.apache.org/jira/browse/mrunit-181 and, specifically, at https://cwiki.apache.org/confluence/display/mrunit/mrunit+with+avro
kq4fsx7k3#
This also solves the problem and has the advantage of shorter, clearer code.
It may be a bug in MRUnit that io.serializations is not set correctly; it should be set by job.setInputFormatClass(AvroKeyInputFormat.class).
yh2wf1be4#
To get this working, you have to add AvroSerialization to the default serializations. You must also configure AvroSerialization.
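The setUp method in the question calls setStrings("io.serializations", ...) with only AvroSerialization, which overwrites whatever entries were configured before. One reading of the answers above is that AvroSerialization should be appended to the existing list instead, so that Writable types such as LongWritable and Text can still be copied. Below is a minimal, JDK-only sketch of that "append, don't replace" step. The class and method names (SerializationList, withSerialization) are hypothetical and not part of MRUnit, Hadoop, or Avro; the two class-name strings are assumed to match the libraries in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of MRUnit, Hadoop, or Avro.
final class SerializationList {
    // Assumed fully qualified names of the serialization classes involved.
    static final String WRITABLE = "org.apache.hadoop.io.serializer.WritableSerialization";
    static final String AVRO = "org.apache.avro.hadoop.io.AvroSerialization";

    private SerializationList() {
    }

    /** Returns the current entries with className appended, unless it is already present. */
    static String[] withSerialization(String[] current, String className) {
        List<String> result = new ArrayList<>();
        if (current != null) {
            // Keep every serialization that was already configured.
            result.addAll(Arrays.asList(current));
        }
        if (!result.contains(className)) {
            result.add(className);
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in for the value read from the driver's configuration.
        String[] current = {WRITABLE};
        String[] merged = withSerialization(current, AVRO);
        System.out.println(String.join(",", merged));
    }
}
```

In the test, the current list would come from mapDriver.getConfiguration().getStrings("io.serializations"), and the merged array would be written back with setStrings in place of the call that sets only AvroSerialization. Depending on the avro-mapred version, AvroSerialization.addToConfiguration(conf) may already perform a similar append. Configuring AvroSerialization itself (for example, the key and value writer schemas) is a separate step that this sketch does not cover.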