NetcatSource: Client sent event exceeding the maximum length

iyfjxgzm  posted on 2021-06-02 in Hadoop

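Hi everyone, and thanks in advance for taking the time to read this :) I am trying to send a JSON object into a Hadoop cluster so that I can process it with Spark. The JSON is about 15 KB. I configured the Flume agent as follows: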

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 41400
a1.sources.r1.max-line-length = 512000
a1.sources.r1.eventSize = 512000

# a1.sources.deserializer.maxLineLength = 512000

# Describe the sink

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /hadoop/hdfs/data
a1.sinks.k1.hdfs.filePrefix = CDR
a1.sinks.k1.hdfs.callTimeout = 15000
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 226
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.batchSize = 226

# Use a channel which buffers events in memory

a1.channels.c1.type = file
a1.channels.c1.capacity = 512000
a1.channels.c1.transactionCapacity = 512000

# Bind the source and sink to the channel

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

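In addition, I have a Perl script that sends the JSON object through a socket to the specified port, but when I start the Flume agent I get the following message: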

WARN source.NetcatSource: Client sent event exceeding the maximum length

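What I don't understand is that I set the maximum line length for my events to 512000 bytes, which is well above 15 KB. Can anyone help? Thanks, and sorry for my poor English.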

l0oc07j2  1#

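Check whether the JSON sent by your Perl script ends with a newline character (EOL). The NetCat source treats each newline-terminated line as one event, so a payload with no terminating newline can trigger this warning even when it is smaller than max-line-length.
Reference: https://flume.apache.org/FlumeUserGuide.html#netcat-source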
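
As a rough illustration of the newline point, here is a minimal sketch in Python rather than Perl, since the original script is not shown. The helper names `frame_event` and `send_event` are made up for this example; the host and port are taken from the configuration in the question. It assumes the source is reachable over plain TCP and that each event should be exactly one line.

```python
import json
import socket


def frame_event(payload: dict) -> bytes:
    """Serialize payload as a single line of JSON terminated by a newline.

    Hypothetical helper, not part of Flume or any library.
    """
    # Without an indent argument json.dumps emits no raw line breaks, and
    # newlines inside string values are escaped, so the only newline byte
    # in the result is the terminator appended here.
    line = json.dumps(payload, separators=(",", ":"))
    return line.encode("utf-8") + b"\n"


def send_event(payload: dict, host: str = "localhost", port: int = 41400) -> None:
    """Send one newline-terminated event over TCP (hypothetical helper)."""
    data = frame_event(payload)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(data)
```

With the agent running, something like `send_event({"id": 1, "status": "ok"})` would send one event. If the JSON is pretty-printed across several lines, each line would presumably be treated as a separate event, so it is safer to keep it compact as above.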
