A field of a Protocol Buffers message is always missing when reading from a SequenceFile

Asked by 1sbrub3j on 2021-05-29 in Hadoop

Something mysterious is happening to me.
Here is what I am trying to do:

1. Save a Protocol Buffers object as SequenceFile format.
2. Read this SequenceFile text and extract the field that I need.

The mystery: one of the fields I want to retrieve is always empty. `Product_Perf` is the field that is always missing when I extract it from the SequenceFile.
Here is my Protocol Buffers schema:

message ProductJoin {
    Signals signals = 1;
    int64 id = 2;
}

message Signals {
    ProductPerf product_perf = 1;
}

message ProductPerf {
    int64 impressions = 1;
}

Here is how I save the Protocol Buffers objects as a SequenceFile:

JavaPairRDD<BytesWritable, BytesWritable> bytesWritableJavaPairRdd =
        flattenedPjPairRdd.mapToPair(
            new PairFunction<Tuple2<Long, ProductJoin>, BytesWritable, BytesWritable>() {

                @Override
                public Tuple2<BytesWritable, BytesWritable> call(Tuple2<Long, ProductJoin> longProductJoinTuple2) throws Exception {
                    return new Tuple2<>(
                        // id is an int64, so getId() returns a long; convert it to text bytes for the key
                        new BytesWritable(String.valueOf(longProductJoinTuple2._2().getId()).getBytes()),
                        new BytesWritable(longProductJoinTuple2._2().toByteArray()));
                }
            });

// dump SequenceFiles
bytesWritableJavaPairRdd.saveAsHadoopFile(
    "/tmp/path/",
    BytesWritable.class,
    BytesWritable.class,
    SequenceFileOutputFormat.class
);
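Note that because `id` is declared `int64`, the generated `getId()` returns a Java `long`, which has no `getBytes()` method, so the key has to be converted explicitly. A minimal, self-contained sketch of one such conversion (encoding the id as UTF-8 text so the read side can recover it with `Text.decode(...)`; the class and method names here are illustrative, not from the original code):

```java
import java.nio.charset.StandardCharsets;

public class KeyCodec {
    // Hypothetical helper: encode a protobuf int64 id (a Java long)
    // as UTF-8 text bytes, suitable for wrapping in a BytesWritable key.
    static byte[] idToKeyBytes(long id) {
        return String.valueOf(id).getBytes(StandardCharsets.UTF_8);
    }

    // Inverse: decode the key bytes back into the numeric id.
    static long keyBytesToId(byte[] key) {
        return Long.parseLong(new String(key, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        long id = 9876543210L;
        byte[] key = idToKeyBytes(id);
        System.out.println(keyBytesToId(key)); // round-trips the id
    }
}
```

A fixed-width binary encoding (e.g. 8 big-endian bytes via `java.nio.ByteBuffer`) would also work, but text bytes match the `Text.decode` call used in the read path below.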

And here is the code I use to read the SequenceFile back:

sparkSession.sparkContext()
    .sequenceFile("/tmp/path/", BytesWritable.class, BytesWritable.class)
    .toJavaRDD()
    .mapToPair(
        bytesWritableBytesWritableTuple2 -> {
            // look up the generated parser() method reflectively for the target proto class
            Method parserMethod = clazz.getDeclaredMethod("parser");
            Parser<T> parser = (Parser<T>) parserMethod.invoke(null);
            return new Tuple2<>(
                Text.decode(bytesWritableBytesWritableTuple2._1().getBytes()),
                parser.parseFrom(bytesWritableBytesWritableTuple2._2().getBytes()));
        }
    );
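One thing worth checking (an observation about the Hadoop API, not something stated in the question): `BytesWritable.getBytes()` returns the backing buffer, which can be longer than the valid record, since Hadoop reuses and over-allocates these buffers during reads. Only the first `getLength()` bytes belong to the record, and feeding the padded tail to the protobuf parser can confuse it. A self-contained sketch of trimming to the valid length (a plain `byte[]` stands in for the writable's buffer):

```java
import java.util.Arrays;

public class TrimWritable {
    // Hypothetical helper mimicking BytesWritable.copyBytes():
    // keep only the first `length` bytes of the backing buffer
    // before handing the array to parser.parseFrom(...).
    static byte[] trim(byte[] backingBuffer, int length) {
        return Arrays.copyOf(backingBuffer, length);
    }

    public static void main(String[] args) {
        // 3 valid bytes followed by 5 bytes of padding, as a reused buffer might hold
        byte[] buffer = {1, 2, 3, 0, 0, 0, 0, 0};
        byte[] valid = trim(buffer, 3);
        System.out.println(Arrays.toString(valid)); // [1, 2, 3]
    }
}
```

In the read path above this would correspond to `parser.parseFrom(value.getBytes(), 0, value.getLength())` (protobuf's `Parser` has an offset/length overload), or `value.copyBytes()` if the Hadoop version in use provides it.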

No answers yet — be the first to answer!
