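I wrote some code that saves files into a Hadoop sequence file. The key is the file name and the value is the file's contents as a byte array. The output is the sequence file plus a .crc file.
Afterwards I tried to read the sequence file back, but got a checksum exception: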
Exception in thread "main" org.apache.hadoop.fs.ChecksumException: Checksum error: file:/home/mosab/Desktop/output/ProcessWS/sequence.seq at 18873344
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:259)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:276)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:228)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2436)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2335)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2381)
at sequence.extractor.Extractor.main(Extractor.java:36)
I tried deleting the .crc file and reading the sequence file again, but then I got an EOFException.
Is there any way to fix this?
1 Answer
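Solution: the ChecksumException occurred because I forgot to close the sequence file writer after I finished writing/appending. As a result, the sequence file and its .crc file no longer matched.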
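In Hadoop itself the fix is to call close() on the SequenceFile.Writer once writing is done, ideally in a finally block. The sketch below is only an analogy using plain java.io, not Hadoop code: it assumes the underlying cause is that a buffered writer which is never closed leaves its last bytes unflushed. The class and method names (UnclosedWriterDemo, writeRecords, isIntact) are made up for illustration. It probably also explains the EOFException seen after deleting the .crc file: with the checksum out of the way, the reader runs into a record that was cut off on disk.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UnclosedWriterDemo {

    // Writes `count` length-prefixed records of `size` bytes each.
    // With close == false the stream is deliberately abandoned (and leaked)
    // to mimic a writer that was never closed.
    public static void writeRecords(Path file, int count, int size, boolean close)
            throws IOException {
        DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file), 8192));
        byte[] payload = new byte[size];
        for (int i = 0; i < count; i++) {
            out.writeInt(size);   // record length
            out.write(payload);   // record body
        }
        if (close) {
            out.close();          // flushes whatever is still in the buffer
        }
    }

    // Returns true only if exactly `expected` complete records can be read back.
    public static boolean isIntact(Path file, int expected) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int n = 0;
            while (true) {
                int size;
                try {
                    size = in.readInt();
                } catch (EOFException e) {
                    return n == expected;   // ran out of data at a length prefix
                }
                try {
                    in.readFully(new byte[size]);
                } catch (EOFException e) {
                    return false;           // record body was truncated
                }
                n++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path closed = Files.createTempFile("closed", ".bin");
        Path unclosed = Files.createTempFile("unclosed", ".bin");
        writeRecords(closed, 10, 1000, true);
        writeRecords(unclosed, 10, 1000, false);
        System.out.println("closed writer intact:   " + isIntact(closed, 10));
        System.out.println("unclosed writer intact: " + isIntact(unclosed, 10));
    }
}
```

Run as is, the file written with close() should read back completely, while the one written without it should come up short, because the tail of the data never left the in-memory buffer.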