Flume script warning: No configuration directory set! Use --conf <dir> to override

agyaoht7 posted on 2021-06-02 in Hadoop

Below is my configuration file. It was working before, but then it suddenly started throwing errors. What I'm actually trying to do is move all the logs from local disk to HDFS, and each log should land in HDFS as a single file, not as fragments:


# create source, channels, and sink

agent1.sources=S1
agent1.sinks=H1
agent1.channels=C1

# bind the source and sink to the channel

agent1.sources.S1.channels=C1
agent1.sinks.H1.channel=C1

# Specify the source type and directory

agent1.sources.S1.type=spooldir
agent1.sources.S1.spoolDir=/tmp/spooldir

# Specify the Sink type, directory, and parameters

agent1.sinks.H1.type=HDFS
agent1.sinks.H1.hdfs.path=/user/hive/warehouse
agent1.sinks.H1.hdfs.filePrefix=events
agent1.sinks.H1.hdfs.fileSuffix=.log
agent1.sinks.H1.hdfs.inUsePrefix=processing
A1.sinks.H1.hdfs.fileType=DataStream

# Specify the channel type (Memory vs File)

agent1.channels.C1.type=file

I run the agent with the following command:

flume-ng agent --conf-file /usr/local/flume/conf/spoolingToHDFS.conf --name agent1

Then I get this warning:

Warning: No configuration directory set! Use --conf <dir> to override.

16/10/14 16:22:37 WARN conf.FlumeConfiguration: Agent configuration for 'A1' does not contain any channels. Marking it as invalid.
16/10/14 16:22:37 WARN conf.FlumeConfiguration: Agent configuration invalid for agent 'A1'. It will be removed.

and then it keeps creating, closing, and renaming the same log in HDFS over and over, like this:

16/10/14 16:22:41 INFO node.Application: Starting Sink H1
16/10/14 16:22:41 INFO node.Application: Starting Source S1
16/10/14 16:22:41 INFO source.SpoolDirectorySource: SpoolDirectorySource source starting with directory: /tmp/spooldir
16/10/14 16:22:41 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: H1: Successfully registered new MBean.
16/10/14 16:22:41 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: H1 started
16/10/14 16:22:41 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: S1: Successfully registered new MBean.
16/10/14 16:22:41 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: S1 started
16/10/14 16:22:41 INFO hdfs.HDFSSequenceFile: writeFormat = Writable, UseRawLocalFileSystem = false
16/10/14 16:22:42 INFO hdfs.BucketWriter: Creating /user/hive/warehouse/processingevents.1476476561961.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Closing /user/hive/warehouse/processingevents.1476476561961.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Renaming /user/hive/warehouse/processingevents.1476476561961.log.tmp to /user/hive/warehouse/events.1476476561961.log
16/10/14 16:22:44 INFO hdfs.BucketWriter: Creating /user/hive/warehouse/processingevents.1476476561962.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Closing /user/hive/warehouse/processingevents.1476476561962.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Renaming /user/hive/warehouse/processingevents.1476476561962.log.tmp to /user/hive/warehouse/events.1476476561962.log
16/10/14 16:22:44 INFO hdfs.BucketWriter: Creating /user/hive/warehouse/processingevents.1476476561963.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Closing /user/hive/warehouse/processingevents.1476476561963.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Renaming /user/hive/warehouse/processingevents.1476476561963.log.tmp to /user/hive/warehouse/events.1476476561963.log
16/10/14 16:22:44 INFO hdfs.BucketWriter: Creating /user/hive/warehouse/processingevents.1476476561964.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Closing /user/hive/warehouse/processingevents.1476476561964.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Renaming /user/hive/warehouse/processingevents.1476476561964.log.tmp to /user/hive/warehouse/events.1476476561964.log
16/10/14 16:22:44 INFO hdfs.BucketWriter: Creating /user/hive/warehouse/processingevents.1476476561965.log.tmp
16/10/14 16:22:44 INFO hdfs.BucketWriter: Closing /user/hive/warehouse/processingevents.1476476561965.log.tmp
:
:
:

Why does Flume keep writing the same file to HDFS again and again, and how can I move each log from local to HDFS without it being split into parts? My logs are usually between 50 KB and 300 KB in size.
Update, with new warnings:

16/10/18 10:10:05 INFO tools.DirectMemoryUtils: Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)

16/10/18 10:10:05 WARN file.ReplayHandler: Ignoring /home/USER/.flume/file-channel/data/log-18 due to EOF
java.io.EOFException
    at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
    at org.apache.flume.channel.file.LogFileFactory.getSequentialReader(LogFileFactory.java:169)
    at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:264)
    at org.apache.flume.channel.file.Log.doReplay(Log.java:529)
    at org.apache.flume.channel.file.Log.replay(Log.java:455)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:295)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

a5g8bdjr1#

Flume uses the conf directory to pick up its JRE settings and logging properties; you can point it there with --conf, as the warning suggests:

flume-ng agent --conf /usr/local/flume/conf --conf-file /usr/local/flume/conf/spoolingToHDFS.conf --name agent1

The warning about A1 is because there is most likely a typo at the end of your agent configuration file: A1.sinks.H1.hdfs.fileType=DataStream should be agent1.sinks.H1.hdfs.fileType=DataStream.

As for the files: you have not configured a deserializer for the spooldir source, and the default is LINE, so you get one HDFS file per line of each file in the spool directory. If you want Flume to treat an entire file as a single event, you need to use the BlobDeserializer (https://flume.apache.org/flumeuserguide.html#blobdeserializer):

agent1.sources.S1.deserializer=org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
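
For reference, here is a minimal sketch of what the corrected source and sink sections might look like with both changes applied (the A1 typo fixed and the BlobDeserializer added); the values are simply the ones from the question above and are untested here:

# Source: hand each spooled file to the channel as one event
agent1.sources.S1.type=spooldir
agent1.sources.S1.spoolDir=/tmp/spooldir
agent1.sources.S1.deserializer=org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder

# Sink: every key is prefixed with agent1, not A1
agent1.sinks.H1.type=HDFS
agent1.sinks.H1.hdfs.path=/user/hive/warehouse
agent1.sinks.H1.hdfs.filePrefix=events
agent1.sinks.H1.hdfs.fileSuffix=.log
agent1.sinks.H1.hdfs.inUsePrefix=processing
agent1.sinks.H1.hdfs.fileType=DataStream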
