Tailing a file with Flume

0x6upsns, posted on 2021-06-04 in Hadoop

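I'm new to Flume NG and need help tailing a file. I have a cluster running Hadoop, with Flume running remotely on it, and I use PuTTY to communicate with the cluster. I want to tail a file on my PC and put it into HDFS on the cluster. I'm using the following configuration to do this.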


# flume.conf: exec source, hdfs sink

# Name the components on this agent

tier1.sources = r1
tier1.sinks = k1
tier1.channels = c1

# Describe/configure the source

tier1.sources.r1.type = exec
tier1.sources.r1.command = tail -F /(Path to file on my PC)

# Describe the sink

tier1.sinks.k1.type = hdfs
tier1.sinks.k1.hdfs.path = /user/ntimbadi/flume/
tier1.sinks.k1.hdfs.filePrefix = events-
tier1.sinks.k1.hdfs.round = true
tier1.sinks.k1.hdfs.roundValue = 10
tier1.sinks.k1.hdfs.roundUnit = minute

# Use a channel which buffers events in memory
tier1.channels.c1.type = memory
tier1.channels.c1.capacity = 1000
tier1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
tier1.sources.r1.channels = c1
tier1.sinks.k1.channel = c1

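I believe the error is in the source. This kind of source doesn't take a hostname or IP to look at (which in this case should be my PC). Can someone give me a hint on how to tail a file on my PC and upload it to the remote HDFS using Flume?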

Answer #1, by txu3uszq

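The exec source in your configuration runs on the machine where you start Flume's tier1 agent. If you want to collect data from another machine, you need to start a Flume agent on that machine too. To sum up, you need:
An agent (remote1) running on the remote machine with an avro source, which listens for events from collector agents and acts as an aggregator.
An agent (local1) running on your machine (acting as a collector) with an exec source, which sends data to the remote agent through an avro sink.
Alternatively, you can run just one Flume agent on your local machine (with the same configuration you posted) and set the HDFS path to "hdfs://remote_ip/hdfs/path" (though I'm not entirely sure this will work).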
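As a rough sketch of that single-agent alternative, only the sink path in the question's tier1 configuration would change. REMOTE_IP is a placeholder, and port 8020 is an assumption (a commonly used NameNode RPC port; check fs.defaultFS on your cluster). It also assumes the Hadoop client libraries are available to Flume on the local machine and that the cluster's NameNode and DataNodes are reachable from it, which may not hold in your setup.

```properties
# Single local agent writing straight to the remote HDFS (untested sketch).
# REMOTE_IP and port 8020 are placeholders; take the real host and port
# from the cluster's fs.defaultFS setting.
tier1.sinks.k1.type = hdfs
tier1.sinks.k1.hdfs.path = hdfs://REMOTE_IP:8020/user/ntimbadi/flume/
```

The rest of the tier1 configuration (source, channel, bindings) would stay as posted in the question.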
Edit: below are sample configurations for the two-agent scenario (they may not work without some modification).

# Agent remote1: runs on the cluster; avro source -> memory channel -> hdfs sink
remote1.channels.mem-ch-1.type = memory

remote1.sources.avro-src-1.channels = mem-ch-1
remote1.sources.avro-src-1.type = avro
remote1.sources.avro-src-1.port = 10060
# Replace with the remote machine's external IP
remote1.sources.avro-src-1.bind = 10.88.66.4

remote1.sinks.k1.channel = mem-ch-1
remote1.sinks.k1.type = hdfs
remote1.sinks.k1.hdfs.path = /user/ntimbadi/flume/
remote1.sinks.k1.hdfs.filePrefix = events-
remote1.sinks.k1.hdfs.round = true
remote1.sinks.k1.hdfs.roundValue = 10
remote1.sinks.k1.hdfs.roundUnit = minute

remote1.sources = avro-src-1
remote1.sinks = k1
remote1.channels = mem-ch-1

# Agent local1: runs on your PC; exec source -> memory channel -> avro sink
local1.channels.mem-ch-1.type = memory

local1.sources.exc-src-1.channels = mem-ch-1
local1.sources.exc-src-1.type = exec
local1.sources.exc-src-1.command = tail -F /(Path to file on my PC)

local1.sinks.avro-snk-1.channel = mem-ch-1
local1.sinks.avro-snk-1.type = avro
# Replace with the remote IP (the address remote1's avro source binds to)
local1.sinks.avro-snk-1.hostname = 10.88.66.4
local1.sinks.avro-snk-1.port = 10060

local1.sources = exc-src-1
local1.sinks = avro-snk-1
local1.channels = mem-ch-1
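
Neither sample agent above sizes its memory channel. One possible addition reuses the capacity values from the question's own configuration; whether these sizes suit your event volume is an assumption to verify.

```properties
# Optional: explicit memory channel sizes, copied from the question's tier1 settings.
# Put the remote1 lines in the remote agent's file and the local1 lines in the local one.
remote1.channels.mem-ch-1.capacity = 1000
remote1.channels.mem-ch-1.transactionCapacity = 100

local1.channels.mem-ch-1.capacity = 1000
local1.channels.mem-ch-1.transactionCapacity = 100
```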
