I am using flume-ng-1.2.0 with cdh3u5. I am simply trying to pull data from a text file and put it into HDFS. Here is the configuration I am using:
agent1.sources = tail1
agent1.channels = Channel-2
agent1.sinks = HDFS
agent1.sources.tail1.type = exec
agent1.sources.tail1.command = tail -F /usr/games/sample1.txt
agent1.sources.tail1.channels = Channel-2
agent1.sinks.HDFS.channel = Channel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://10.12.1.2:8020/user/hdfs/flume
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.channels.Channel-2.type = memory
agent1.channels.Channel-2.capacity = 1000
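A quick way to sanity-check this setup before starting the agent: the exec source only forwards whatever its command prints, so first confirm from a plain shell that the command produces output for the file in question. This helper is a sketch, not part of the original post:

```shell
#!/bin/sh
# Sanity check for the exec source: Flume only forwards whatever the
# configured command prints, so verify from a plain shell that the
# command actually produces output for the file you expect.
check_source() {
  # Show the last few lines of the file; no output here means the agent
  # has nothing to deliver even when it is running fine.
  tail -n 5 "$1"
}
```

For example, `check_source /usr/games/sample1.txt` (the path from the config above) should print recent lines of the file.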
I am running the agent with: bin/flume-ng agent -n agent1 -c ./conf/ -f conf/flume.conf
The log I get is:
2012-10-11 12:10:36,626 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
2012-10-11 12:10:36,631 INFO node.FlumeNode: Flume node starting - agent1
2012-10-11 12:10:36,639 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
2012-10-11 12:10:36,639 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 12
2012-10-11 12:10:36,641 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
2012-10-11 12:10:36,646 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:conf/flume.conf
2012-10-11 12:10:36,657 INFO conf.FlumeConfiguration: Processing:HDFS
2012-10-11 12:10:36,670 INFO conf.FlumeConfiguration: Processing:HDFS
2012-10-11 12:10:36,670 INFO conf.FlumeConfiguration: Processing:HDFS
2012-10-11 12:10:36,670 INFO conf.FlumeConfiguration: Processing:HDFS
2012-10-11 12:10:36,671 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: agent1
2012-10-11 12:10:36,758 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
2012-10-11 12:10:36,758 INFO properties.PropertiesFileConfigurationProvider: Creating channels
2012-10-11 12:10:36,800 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: CHANNEL, name: Channel-2, registered successfully.
2012-10-11 12:10:36,800 INFO properties.PropertiesFileConfigurationProvider: created channel Channel-2
2012-10-11 12:10:36,835 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
2012-10-11 12:10:37,753 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
2012-10-11 12:10:37,896 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: SINK, name: HDFS, registered successfully.
2012-10-11 12:10:37,899 INFO nodemanager.DefaultLogicalNodeManager: Starting new configuration:{ sourceRunners:{tail1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@362f0d54 }} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@4b142196 counterGroup:{ name:null counters:{} } }} channels:{Channel-2=org.apache.flume.channel.MemoryChannel@16a9255c} }
2012-10-11 12:10:37,900 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel Channel-2
2012-10-11 12:10:37,901 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: Channel-2 started
2012-10-11 12:10:37,901 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink HDFS
2012-10-11 12:10:37,905 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started
2012-10-11 12:10:37,910 INFO nodemanager.DefaultLogicalNodeManager: Starting Source tail1
2012-10-11 12:10:37,912 INFO source.ExecSource: Exec source starting with command:tail -F /usr/games/sample1.txt
I don't know where I am going wrong. Since I am a beginner, I can't figure it out: nothing shows up in HDFS, and the Flume agent just keeps running. Any suggestions and corrections would be very helpful, thank you.
1 Answer
One problem is that you have set

agent1.sinks.HDFS.hdfs.file.Type = DataStream

but the property is hdfs.fileType -- see https://flume.apache.org/FlumeUserGuide.html#hdfs-sink for more information. I would try it with a logger sink --

sink.type = logger

-- just to see whether anything is getting through at all. Also make sure you get some output when you run the tail -F command from your shell. One more thing, and this may be a red herring: there is a backtick (`) at the end of your log message. Maybe that is a paste error, but if not, and it is also in your configuration file, I would not be surprised if it caused trouble. The message I am referring to is the last line of the log above.
从你的外壳发出命令。还有一件事,这可能是一个红鲱鱼:有一个倒勾(`)在您的日志消息的结尾。也许这是一个粘贴错误,但如果不是,那么如果这是在您的配置文件,我不会感到惊讶,如果它造成了麻烦。我所指的消息来自您日志中的最后一行: