Flume: is there a way to store Avro events (header and body) in HDFS?

Posted by 6bc51xsx on 2021-05-29 in Hadoop


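New to Flume...
I am receiving Avro events and storing them in HDFS.
I understand that by default only the event body is stored in HDFS. I also know there is an avro_event serializer, but I don't understand what this serializer actually does. How does it affect Flume's final output?
I also can't figure out how to dump events into HDFS so that their header information is preserved. Do I need to write my own serializer?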

hadoop hdfs flume flume-ng

Source: https://stackoverflow.com/questions/33379438/flume-is-there-a-way-to-store-avro-event-header-body-into-hdfs

1 Answer


bxgwgixi1#

It turns out that the avro_event serializer stores both the headers and the body in the file.
Here is how I configured the sink:

a1.sinks.i1.type=hdfs
a1.sinks.i1.hdfs.path=hdfs://localhost:8020/user/my-name
a1.sinks.i1.hdfs.rollInterval=0
a1.sinks.i1.hdfs.rollSize=1024
a1.sinks.i1.hdfs.rollCount=0
a1.sinks.i1.serializer=avro_event
a1.sinks.i1.hdfs.fileType=DataStream

I sent the events with the avro-client agent that ships with Flume, injecting headers through the -R headerFile option.
Header file contents:

machine=localhost
user=myName
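To make the header file format concrete, here is a minimal sketch that reads the same key=value lines into a header map. It assumes the file can be treated as a standard Java properties file, which is an assumption about how avro-client reads it rather than something stated above. The class name HeaderFileSketch and the method parseHeaders are hypothetical and are not part of Flume.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not a Flume class.
final class HeaderFileSketch {

    /** Reads key=value lines and returns them as a header map sorted by key. */
    static Map<String, String> parseHeaders(Reader in) throws IOException {
        // Assumption: the header file follows java.util.Properties syntax.
        Properties props = new Properties();
        props.load(in);

        Map<String, String> headers = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            headers.put(name, props.getProperty(name));
        }
        return headers;
    }

    public static void main(String[] args) throws IOException {
        String sample = "machine=localhost\nuser=myName\n";
        // prints {machine=localhost, user=myName}
        System.out.println(parseHeaders(new StringReader(sample)));
    }
}
```

Every event sent in that avro-client run would then carry these two entries in its headers, which is what shows up later in the stored records.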

Finally, I checked the result with a simple Java application:

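import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.FileReader;
import org.apache.avro.file.SeekableInput;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.mapred.FsInput;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Fragment: runs inside a class that provides getConf() (e.g. one extending Configured) and a printWriter.
final FileSystem fs = FileSystem.get(getConf());
final Path path = new Path(fs.getHomeDirectory(), "FlumeData.1446072877536");

printWriter.write(path + "-exists: " + fs.exists(path));

// Open the file written by Flume as an Avro container file; the schema is read from the file itself.
final SeekableInput input = new FsInput(path, getConf());
final DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
final FileReader<GenericRecord> fileReader = DataFileReader.openReader(input, reader);

// Each record holds both the event headers and the event body.
for (final GenericRecord datum : fileReader) {
    printWriter.write("value = " + datum);
}

fileReader.close();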

Sure enough, I saw the headers on every record. Here is one line:

value = {"headers": {"machine": "localhost", "user": "myName"}, "body": {"bytes": "set -x"}}

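There is another serializer that also emits the headers: the header_and_text serializer. The resulting file is a human-readable text file. Here is a sample line: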

{machine=localhost, user=userName} set -x
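If such a text file has to be read back programmatically, a line like the one above can be split into its headers and body. The sketch below infers the layout purely from that single sample line (headers in braces, pairs separated by a comma and a space, then one space, then the body). It is an assumption rather than a documented format, and it will break if a header value contains ", " or "}". The class HeaderAndTextLineSketch and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Flume class; the line format is inferred from one sample line.
final class HeaderAndTextLineSketch {

    /** Returns the headers found between the leading '{' and the first '}'. */
    static Map<String, String> headersOf(String line) {
        int close = line.indexOf('}');
        if (!line.startsWith("{") || close < 0) {
            throw new IllegalArgumentException("no header block in: " + line);
        }
        Map<String, String> headers = new LinkedHashMap<>();
        String inside = line.substring(1, close);
        if (!inside.isEmpty()) {
            for (String pair : inside.split(", ")) {
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("malformed header: " + pair);
                }
                headers.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return headers;
    }

    /** Returns everything after the header block, minus the single separating space. */
    static String bodyOf(String line) {
        int close = line.indexOf('}');
        if (close < 0) {
            throw new IllegalArgumentException("no header block in: " + line);
        }
        String rest = line.substring(close + 1);
        return rest.startsWith(" ") ? rest.substring(1) : rest;
    }

    public static void main(String[] args) {
        String line = "{machine=localhost, user=userName} set -x";
        System.out.println(headersOf(line)); // prints {machine=localhost, user=userName}
        System.out.println(bodyOf(line));    // prints set -x
    }
}
```

For anything beyond quick inspection, the avro_event output is the safer choice, because headers and body are stored as separate typed fields instead of being packed into one text line.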

Finally, the book Apache Flume: Distributed Log Collection for Hadoop mentions the header_and_text serializer, but I could not get that to work.
