Simple Talend job with tHDFSPut fails with "connection refused"

Asked by 6ovsh4lw on 2021-06-03 in Hadoop

I created a Talend job that uploads a simple text file (webapp.log, 50 KB) to HDFS with tHDFSPut. The target directory and the file are created successfully in HDFS, but while the file content is being uploaded I get a Connection refused: no further information (see the log at the end), so the upload fails and the created file remains empty.
HDFS runs in the Cloudera QuickStart VM. The username in the tHDFSPut component is set to "cloudera", which should work since it is part of the default configuration.
Any suggestions as to why the connection is refused?
What I tried
Cloudera Manager reports that hdfs1 is in good health.
The NameNode (on the default port 50070) is up and running.
The NameNode reports the single DataNode (preconfigured in the Cloudera VM) as alive, and DFS has more than 23 GB of free space remaining.
Uploading the same file inside the VM with hadoop fs -copyFromLocal succeeds (a standalone reproduction from the host is sketched below).
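
To rule out Talend itself, the same upload can be reproduced from the host with a few lines against the Hadoop Java API. This is only a sketch under assumptions: the NameNode URI (hdfs://quickstart.cloudera:8020), the paths and the HADOOP_USER_NAME property are placeholders that have to match the actual setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutRepro {
    public static void main(String[] args) throws Exception {
        // Act as the "cloudera" user, like the tHDFSPut component is configured to (assumption).
        System.setProperty("HADOOP_USER_NAME", "cloudera");

        Configuration conf = new Configuration();
        // NameNode URI of the QuickStart VM; host and port are placeholders.
        conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020");

        try (FileSystem fs = FileSystem.get(conf)) {
            // Same source file and target path as in the failing job.
            fs.copyFromLocalFile(new Path("webapp.log"),
                    new Path("/user/cloudera/weblog/webapp.log"));
            System.out.println("Upload finished");
        }
    }
}

If this reproduction fails with the same "could only be replicated to 0 nodes" error, the problem lies in the network path between the host and the DataNode rather than in the Talend job.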
Setup
Cloudera QuickStart VM for VMware, running HDFS
Local machine running Talend Studio, used to execute the job
Job log
The first IOException appears to be printed to the error console (recognizable by the red font in the Run view). I have edited the log so that the lines of that particular stack trace can be read as one block.
I kept the original log on Pastebin for reference: http://pastebin.com/n3ztkdlw

Starting job Test1 at 10:57 14/10/2013.

[statistics] connecting to socket on port 3762
[statistics] connected
[main           ] Configuration                  WARN  fs.default.name is deprecated. Instead, use fs.defaultFS
[Thread-4       ] DFSClient                      INFO  Exception in createBlockOutputStream
java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:524)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1181)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1026)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:989)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
[Thread-4       ] DFSClient                      INFO  Abandoning block BP-780931682-127.0.0.1-1381159027878:blk_-5532547656644348116_1280
[Thread-4       ] DFSClient                      INFO  Excluding datanode 127.0.0.1:50010
Exception in component tHDFSPut_1
java.io.IOException: File /user/cloudera/weblog/webapp.log could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1161)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:285)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1104)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:979)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
[Thread-4       ] DFSClient                      WARN  DataStreamer Exception
java.io.IOException: File /user/cloudera/weblog/webapp.log could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)

    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1161)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:285)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1104)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:979)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
[statistics] disconnected
[Thread-2       ] DFSClient                      ERROR Failed to close file /user/cloudera/weblog/webapp.log
java.io.IOException: File /user/cloudera/weblog/webapp.log could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1161)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:285)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1104)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:979)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
Job Test1 ended at 10:58 14/10/2013. [exit code=1]
Answer 1 (l7mqbcuq)

I had a similar problem with the Cloudera QuickStart image. I was trying to put a file from the host onto the guest Cloudera QuickStart machine via the Java API. The problem was that the VirtualBox machine forwarded the NameNode port 8020 from the host to the guest, but did not forward the DataNode port 50010. Try adding a forwarding rule for 50010 and it will work (a connectivity check is sketched below).
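
Before re-running the job, it is easy to verify from the host whether the DataNode port is reachable at all. A minimal check, assuming the host-side endpoint is localhost via a NAT port forward (with a bridged or host-only network it would be the VM's IP instead):

import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortCheck {
    public static void main(String[] args) {
        String host = "127.0.0.1";   // assumption: NAT forwarding maps the port onto localhost
        int port = 50010;            // default DataNode data-transfer port

        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 3000);
            System.out.println("DataNode port " + port + " is reachable");
        } catch (Exception e) {
            System.out.println("DataNode port " + port + " is NOT reachable: " + e);
        }
    }
}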

Answer 2 (ffdz8vbo)

Add an entry to the hosts file for the IP address of the Hadoop NameNode you are trying to connect to, and use the hostname instead of the IP address in the HDFS connection parameters.
On Windows, open the hosts file via Win+R, type %systemroot%\system32\drivers\etc\hosts, and add the IP and hostname.
For example: 192.168.1.1 hostname
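
To confirm that the new hosts entry is actually picked up before changing the HDFS connection parameters, the name can be resolved from a short Java snippet; the hostname quickstart.cloudera is only an assumption for the QuickStart VM, use whatever name was mapped to the NameNode's IP:

import java.net.InetAddress;

public class ResolveNameNodeHost {
    public static void main(String[] args) throws Exception {
        // Hypothetical hostname taken from the hosts-file entry added above.
        String nameNodeHost = "quickstart.cloudera";
        InetAddress addr = InetAddress.getByName(nameNodeHost);
        System.out.println(nameNodeHost + " resolves to " + addr.getHostAddress());
    }
}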
