How to improve Flume performance

mbjcgjjk  posted on 2021-06-02 in Hadoop

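I am sending events from Flume to a Kafka topic through a Flume sink. I use a file channel for this, but performance when writing to Kafka is very poor: the Flume sink writes at about 190 msg per second, while the Flume source dumps events into the channel at about 3000 msg per second. I want to increase the performance of my sink. Please help me with how to achieve this. I have tried various configurations. This is my configuration file: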

agent1.sources = AspectJ
agent1.channels = fileTailChannel
agent1.sinks = APMNullSink

# AspectJ source

agent1.sources.AspectJ.type=com.flume.test.DumbSource
agent1.sources.AspectJ.path=/media/sf_New_Log/calltracedata.txt
agent1.sources.AspectJ.eventtype=CallTrace
agent1.sources.AspectJ.channels=fileTailChannel
agent1.sources.AspectJ.batchSize=1000
agent1.sources.AspectJ.batchDurationMillis=1000
agent1.sources.AspectJ.application = AspectJ
agent1.sources.AspectJ.multi.line.mode=true
agent1.sources.AspectJ.new.event.marker={
agent1.sources.AspectJ.event.terminator=}
agent1.sources.AspectJ.tailer.start.reading.from.end = false

# File Tail Channel

agent1.channels.fileTailChannel.type = file
agent1.channels.fileTailChannel.checkpointDir = /tmp/flume/filechannel/checkpoint
agent1.channels.fileTailChannel.dataDirs = /tmp/flume/filechannel/data
agent1.channels.fileTailChannel.transactionCapacity=100000

# APM Null Sink

agent1.sinks.APMNullSink.type = com.flume.test.TestJSON2KafkaSink
agent1.sinks.APMNullSink.channel = fileTailChannel
agent1.sinks.APMNullSink.testSize = 1000
agent1.sinks.APMNullSink.zk.connect=sandbox.hortonworks.com:2181
agent1.sinks.APMNullSink.metadata.broker.list=sandbox.hortonworks.com:6667
agent1.sinks.APMNullSink.topic=Test
agent1.sinks.APMNullSink.producer.type=async
agent1.sinks.APMNullSink.serializer.class=kafka.serializer.StringEncoder
agent1.sinks.APMNullSink.batch.num.messages=1000
agent1.sinks.APMNullSink.batchSize=50000
agent1.sinks.APMNullSink.batchDurationMillis=1000
agent1.sinks.APMNullSink.queue.buffering.max.ms=5000
agent1.sinks.APMNullSink.queue.buffering.max.messages=100000
agent1.sinks.APMNullSink.send.buffer.bytes=2097152
agent1.sinks.APMNullSink.compression.codec=snappy

cqoc49vn  #1

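I think your problem is that everything is running on one box, so the fast ingestion into the file channel is competing with Kafka, which has to do more work to write the messages.
I would suggest two options:
Set the capacity of the file channel to limit the queue size, so that the source cannot fill it faster than Kafka can receive.
Or use the Kafka channel: https://flume.apache.org/FlumeUserGuide.html#kafka-channel. With this option, however, the messages in the topic are wrapped in AvroFlumeEvent, so subscribers need to deserialize them using that class.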
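A rough sketch of what those two options could look like on top of the configuration in the question. The numbers are illustrative only and need tuning. The channel name kafkaChannel, the topic flume-channel-test and the group id flume-agent1 are made up for the example, and the Kafka channel property names assume Flume 1.7 or later (Flume 1.6 used different names such as brokerList and zookeeperConnect).

```properties
# Option 1: bound the file channel so a backlog cannot grow without limit.
# capacity must be >= transactionCapacity (100000 in the question).
# When the channel is full, a put waits up to keep-alive seconds and then
# fails, so the source has to retry instead of outrunning the sink.
agent1.channels.fileTailChannel.capacity = 200000
agent1.channels.fileTailChannel.keep-alive = 3

# Option 2 (an alternative, not in addition to option 1): replace the file
# channel with a Kafka channel. All names below are hypothetical.
agent1.channels = kafkaChannel
agent1.channels.kafkaChannel.type = org.apache.flume.channel.kafka.KafkaChannel
agent1.channels.kafkaChannel.kafka.bootstrap.servers = sandbox.hortonworks.com:6667
agent1.channels.kafkaChannel.kafka.topic = flume-channel-test
agent1.channels.kafkaChannel.kafka.consumer.group.id = flume-agent1
# true (the default) stores each event as an AvroFlumeEvent in the topic
agent1.channels.kafkaChannel.parseAsFlumeEvent = true
agent1.sources.AspectJ.channels = kafkaChannel
```

If downstream consumers cannot deal with the Avro wrapping, the parseAsFlumeEvent setting is the one to look at in the user guide, but check what it does to event headers in your Flume version before changing it.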
I also don't understand why you need your own com.flume.test.TestJSON2KafkaSink instead of the one that ships with Flume. Maybe there are some performance issues in your code.
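For comparison, a minimal sketch of the bundled Kafka sink reading from the same channel. The sink name kafkaSink is hypothetical, the property names assume Flume 1.7 or later (Flume 1.6 used brokerList, topic, batchSize and requiredAcks), and the batch and producer values are starting points rather than recommendations.

```properties
agent1.sinks = kafkaSink
agent1.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafkaSink.channel = fileTailChannel
agent1.sinks.kafkaSink.kafka.bootstrap.servers = sandbox.hortonworks.com:6667
agent1.sinks.kafkaSink.kafka.topic = Test
# Events taken from the channel per transaction; must not exceed the
# channel's transactionCapacity.
agent1.sinks.kafkaSink.flumeBatchSize = 1000
# Anything after "kafka.producer." is passed through to the Kafka producer.
agent1.sinks.kafkaSink.kafka.producer.acks = 1
agent1.sinks.kafkaSink.kafka.producer.linger.ms = 5
agent1.sinks.kafkaSink.kafka.producer.compression.type = snappy
```

Running the same workload through this sink and through the custom one should show fairly quickly whether the 190 msg per second limit comes from the custom code or from the box itself.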
