Flume: loading CSV files into an HDFS sink

Posted by yrefmtwq on 2021-05-29 in Hadoop

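I have configured a Flume source of type spooldir. I have many CSV, .xl3 and .xls files, and I want my Flume agent to load all of them from the spooling directory into an HDFS sink. However, the Flume agent throws an exception.
Here is my Flume source configuration: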

agent.sources.s1.type = spooldir
agent.sources.s1.spoolDir = /my-directory
agent.sources.s1.basenameHeader = true
agent.sources.batchSize = 10000

My HDFS sink configuration:

agent.sinks.sk1.type = hdfs 
agent.sinks.sk1.hdfs.path = hdfs://...:8020/user/importflume/%Y/%m/%d/%H 
agent.sinks.sk1.hdfs.filePrefix = %{basename}
agent.sinks.sk1.hdfs.rollSize = 0
agent.sinks.sk1.hdfs.rollCount = 0
agent.sinks.sk1.hdfs.useLocalTimeStamp = true
agent.sinks.sk1.hdfs.batchsize =    10000
agent.sinks.sk1.hdfs.fileType = DataStream
agent.sinks.sk1.serializer = avro_event
agent.sinks.sk1.serializer.compressionCodec = snappy

Answer #1 (z9smfwbn)

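You can use the following configuration for the spooling directory source. Just fill in the local file system path and the HDFS location in the configuration below.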


# Flume configuration starts

# Define a file channel called fileChannel1_1 on agent1 (on the Linux FS)
agent1.channels.fileChannel1_1.type = file
agent1.channels.fileChannel1_1.capacity = 200000
agent1.channels.fileChannel1_1.transactionCapacity = 1000

# Define a spooldir source for agent1 (on the Linux FS).
# 'path' is a placeholder: replace it with the real directory, without quotes.
# The spool dir in my case is /home/hadoop/Desktop/flume_sink
agent1.sources.source1_1.type = spooldir
agent1.sources.source1_1.spoolDir = 'path'
agent1.sources.source1_1.fileHeader = false
agent1.sources.source1_1.fileSuffix = .COMPLETED

# Define an HDFS sink for agent1.
# 'path' is a placeholder for the target location; in my case it is /flume_import under HDFS
agent1.sinks.hdfs-sink1_1.type = hdfs
agent1.sinks.hdfs-sink1_1.hdfs.path = hdfs://'path'
agent1.sinks.hdfs-sink1_1.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink1_1.hdfs.rollSize = 268435456
agent1.sinks.hdfs-sink1_1.hdfs.rollInterval = 0
agent1.sinks.hdfs-sink1_1.hdfs.rollCount = 50000000
agent1.sinks.hdfs-sink1_1.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink1_1.hdfs.fileType = DataStream

# Bind the source and the sink to the channel
agent1.sources.source1_1.channels = fileChannel1_1
agent1.sinks.hdfs-sink1_1.channel = fileChannel1_1

# Declare the components of agent1
agent1.sinks = hdfs-sink1_1
agent1.sources = source1_1
agent1.channels = fileChannel1_1

You can also refer to this blog post about the Flume spooling directory source for more information.
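To connect this to the question, here is a rough sketch of how the asker's own settings might look once merged with the channel wiring shown in the answer. It is an illustration under assumptions, not a confirmed fix: the question does not show the actual exception, the channel name c1 is a hypothetical name introduced here, and the elided HDFS host is left as written in the question. Two keys are adjusted to the documented property names: the batch size key now includes the source name (s1), and the sink key is spelled hdfs.batchSize, since Flume property names are case-sensitive.

```properties
# Sketch only: the question's settings plus a channel and the bindings it did not show.
# c1 is a hypothetical channel name chosen for this example.
agent.sources = s1
agent.channels = c1
agent.sinks = sk1

# File channel; transactionCapacity should be at least as large as the
# batch sizes used by the source and the sink below.
agent.channels.c1.type = file
agent.channels.c1.capacity = 200000
agent.channels.c1.transactionCapacity = 10000

# Spooling directory source (keys must include the source name s1)
agent.sources.s1.type = spooldir
agent.sources.s1.spoolDir = /my-directory
agent.sources.s1.basenameHeader = true
agent.sources.s1.batchSize = 10000
agent.sources.s1.channels = c1

# HDFS sink; %{basename} is filled from the header set by basenameHeader
agent.sinks.sk1.type = hdfs
agent.sinks.sk1.hdfs.path = hdfs://...:8020/user/importflume/%Y/%m/%d/%H
agent.sinks.sk1.hdfs.filePrefix = %{basename}
agent.sinks.sk1.hdfs.rollSize = 0
agent.sinks.sk1.hdfs.rollCount = 0
agent.sinks.sk1.hdfs.useLocalTimeStamp = true
agent.sinks.sk1.hdfs.batchSize = 10000
agent.sinks.sk1.hdfs.fileType = DataStream
agent.sinks.sk1.serializer = avro_event
agent.sinks.sk1.serializer.compressionCodec = snappy
agent.sinks.sk1.channel = c1
```

If this were saved as, say, spool-to-hdfs.conf (a hypothetical file name), the agent could be started with flume-ng agent --conf conf --conf-file spool-to-hdfs.conf --name agent, where the value of --name has to match the prefix used in the property keys.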
