Flume and HDFS integration: HDFS IO error

Asked by c8ib6hqw on 2021-06-03, in Hadoop

I am trying to integrate Flume with HDFS. My Flume configuration file is:

    hdfs-agent.sources= netcat-collect
    hdfs-agent.sinks = hdfs-write
    hdfs-agent.channels= memoryChannel
    hdfs-agent.sources.netcat-collect.type = netcat
    hdfs-agent.sources.netcat-collect.bind = localhost
    hdfs-agent.sources.netcat-collect.port = 11111
    hdfs-agent.sinks.hdfs-write.type = FILE_ROLL
    hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://127.0.0.1:50020/user/oracle/flume
    hdfs-agent.sinks.hdfs-write.rollInterval = 30
    hdfs-agent.sinks.hdfs-write.hdfs.writeFormat=Text
    hdfs-agent.sinks.hdfs-write.hdfs.fileType=DataStream
    hdfs-agent.channels.memoryChannel.type = memory
    hdfs-agent.channels.memoryChannel.capacity=10000
    hdfs-agent.sources.netcat-collect.channels=memoryChannel
    hdfs-agent.sinks.hdfs-write.channel=memoryChannel.

My core-site.xml is:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost</value>
      </property>
    </configuration>

When I run the Flume agent, it starts up and is able to read events sent from the nc command, but when it writes to HDFS I get the exception below. I also tried leaving safe mode with hadoop dfsadmin -safemode leave, and I still get the same exception.

    2014-02-14 10:31:53,785 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:219)] Creating hdfs://127.0.0.1:50020/user/oracle/flume/FlumeData.1392354113707.tmp
    2014-02-14 10:31:54,011 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:418)] HDFS IO error
    java.io.IOException: Call to /127.0.0.1:50020 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1089)
        at org.apache.hadoop.ipc.Client.call(Client.java:1057)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at $Proxy5.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:369)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1489)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1523)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1505)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:227)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:226)
        at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:220)
        at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
        at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
        at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
        at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
    Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:781)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:689)

If I have misconfigured anything in either properties file, please point it out so I can get this working.
Please also let me know whether I am using the correct port.
My goal is to integrate Flume with Hadoop; I have set up a single-node Hadoop server.
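For anyone diagnosing the same trace: the EOFException thrown inside getProtocolVersion usually means the client reached a port where something other than the NameNode RPC service answered; in a default Hadoop 1.x setup, 50020 is the DataNode IPC port, not the NameNode port. A minimal sanity check, assuming a standard Hadoop 1.x single-node install with HADOOP_HOME set:

    # Show the address the cluster is configured to use as its default filesystem
    # (path assumes a Hadoop 1.x layout; adjust for your install)
    grep -A 1 fs.default.name $HADOOP_HOME/conf/core-site.xml

    # Confirm the NameNode actually answers on that address
    hadoop fs -ls /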

Answered by m0rkklqb:

You must include the port number in fs.default.name. For example:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9001</value>
      </property>
    </configuration>
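After changing core-site.xml, HDFS must be restarted so the NameNode binds to the new address. A quick verification, assuming the Hadoop 1.x control scripts are on the PATH:

    # Restart HDFS so the NameNode picks up the new fs.default.name
    stop-dfs.sh && start-dfs.sh

    # This listing should succeed against the explicit port before Flume is started
    hadoop fs -ls hdfs://localhost:9001/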

After that, edit the Flume configuration file as follows:

    hdfs-agent.sources= netcat-collect
    hdfs-agent.sinks = hdfs-write
    hdfs-agent.channels= memoryChannel
    hdfs-agent.sources.netcat-collect.type = netcat
    hdfs-agent.sources.netcat-collect.bind = localhost
    hdfs-agent.sources.netcat-collect.port = 11111
    hdfs-agent.sinks.hdfs-write.type = hdfs
    hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://127.0.0.1:9001/user/oracle/flume
    hdfs-agent.sinks.hdfs-write.rollInterval = 30
    hdfs-agent.sinks.hdfs-write.hdfs.writeFormat=Text
    hdfs-agent.sinks.hdfs-write.hdfs.fileType=DataStream
    hdfs-agent.channels.memoryChannel.type = memory
    hdfs-agent.channels.memoryChannel.capacity=10000
    hdfs-agent.sources.netcat-collect.channels=memoryChannel
    hdfs-agent.sinks.hdfs-write.channel=memoryChannel

Changes:

  1. hdfs-agent.sinks.hdfs-write.type = hdfs (sink type changed to hdfs)
  2. hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://127.0.0.1:9001/user/oracle/flume (added the NameNode port number)
  3. hdfs-agent.sinks.hdfs-write.channel=memoryChannel (removed the trailing dot after memoryChannel)
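With both files in place, an end-to-end test might look like the following (the config file name hdfs-agent.conf is assumed; the agent name passed to flume-ng must match the hdfs-agent prefix used in the config above):

    # Start the agent in the foreground with console logging
    flume-ng agent --conf conf --conf-file hdfs-agent.conf --name hdfs-agent -Dflume.root.logger=INFO,console

    # In a second terminal, send a test event to the netcat source
    echo "hello flume" | nc localhost 11111

    # After the roll interval (30 seconds), the event should appear in HDFS
    hadoop fs -ls /user/oracle/flume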