Flume adds a newline after every 2048 characters in a line

Posted by sulc1iza on 2021-06-02 in Hadoop

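I run a Flume 1.5 agent on an Ubuntu workstation. It collects logs from various devices and reformats them into a comma-delimited file with very long lines. Once the logs are collected and reformatted, they are placed in a spool directory, and the Flume agent sends the log files to a Hadoop server, where a second Flume agent receives them and writes them to an HDFS directory.
Everything works, except that when Flume writes the files to the HDFS directory, a newline appears after every 2048 characters in each line.
My Flume configuration files are below. Is there a setting that tells Flume not to insert these newlines?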


# On Ubuntu Workstation

# list sources, sinks and channels in the agent

agent.sources = axon_source
agent.channels = memorychannel
agent.sinks = AvroOut

# define flow

agent.sources.axon_source.channels = memorychannel
agent.sinks.AvroOut.channel = memorychannel
agent.channels.memorychannel.type = memory
agent.channels.memorychannel.capacity = 100000

# source

agent.sources.axon_source.type = spooldir
agent.sources.axon_source.spoolDir = /home/ubuntu/workspace/logdump
agent.sources.axon_source.decodeErrorPolicy = ignore

# avro out

agent.sinks.AvroOut.type = avro
agent.sinks.AvroOut.hostname = 172.31.12.221
agent.sinks.AvroOut.port = 41415
agent.sinks.AvroOut.maxIoWorkers = 2

------------------------------------------------------------

# On Hadoop Server

agent.sources = AvroIn
agent.sources.AvroIn.type = avro
agent.sources.AvroIn.bind = 172.31.131.1
agent.sources.AvroIn.port = 41415
agent.sources.AvroIn.channels = MemChan1

agent.channels = MemChan1
agent.channels.MemChan1.type = memory
agent.channels.MemChan1.capacity = 100000

agent.sinks = HDFSSink
agent.sinks.HDFSSink.type = hdfs
agent.sinks.HDFSSink.channel = MemChan1
agent.sinks.HDFSSink.hdfs.path = /Logs/%Y%m/
agent.sinks.HDFSSink.hdfs.filePrefix = axoncapture
agent.sinks.HDFSSink.hdfs.fileSuffix = .log
agent.sinks.HDFSSink.hdfs.minBlockReplicas = 1
agent.sinks.HDFSSink.hdfs.rollCount = 0
agent.sinks.HDFSSink.hdfs.rollSize = 314572800
agent.sinks.HDFSSink.hdfs.writeFormat = Text
agent.sinks.HDFSSink.hdfs.fileType = DataStream
agent.sinks.HDFSSink.hdfs.useLocalTimeStamp = True
bq9c1y66  1#

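I found the answer to my question:
The line deserializer's default maxLineLength is 2048: http://archive.cloudera.com/cdh5/cdh/5/flume-ng/flumeuserguide.html#line
A line longer than that limit is truncated, and the remaining characters arrive as a subsequent event, which the HDFS sink then writes as a separate line.
I added this line to my flume.conf file on the workstation agent, and it fixed the problem: agent.sources.axon_source.deserializer.maxLineLength = 60000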
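The value 60000 above was picked by hand. A rough way to pick a value for your own data is to measure the longest line currently in the spool directory. The following is only a sketch: the helper name `longest_line` and the UTF-8 encoding are assumptions for illustration, not part of Flume.

```python
from pathlib import Path

# Default maxLineLength of Flume's line deserializer, per the user guide.
DEFAULT_MAX_LINE_LENGTH = 2048


def longest_line(spool_dir, encoding="utf-8"):
    """Return the length, in characters, of the longest line in spool_dir.

    Hypothetical helper; the encoding is an assumption about the log files.
    """
    longest = 0
    for path in Path(spool_dir).iterdir():
        if not path.is_file():
            continue
        # Undecodable bytes are skipped rather than raising an error.
        with path.open(encoding=encoding, errors="ignore") as handle:
            for line in handle:
                # Exclude the line terminator from the count.
                longest = max(longest, len(line.rstrip("\r\n")))
    return longest


if __name__ == "__main__":
    # Spool directory taken from the workstation config in the question.
    length = longest_line("/home/ubuntu/workspace/logdump")
    if length > DEFAULT_MAX_LINE_LENGTH:
        print(f"Longest line is {length} characters; raise deserializer.maxLineLength above that.")
    else:
        print(f"Longest line is {length} characters; the default limit is sufficient.")
```

If the reported length exceeds 2048, setting deserializer.maxLineLength comfortably above it should keep each line in a single event.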
