Uploading data from a local non-Hadoop machine to HDFS running on Amazon EC2

2mbi3lxu  posted on 2021-06-03 in Hadoop

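I set up a two-node Hadoop cluster on Amazon EC2, and it works well. Using the Hadoop API (Java program attached), I can upload data to HDFS from the master node or from other instances in the same Amazon region as the Hadoop cluster.
However, when I try to do the same from my local non-Hadoop machine, I get the exceptions shown below.
I then logged in to the Hadoop NameNode and checked from the command line: the folder "testdir" had been created, but the uploaded file "myfile" has a size of 0.
=================== separator ===============================
These are the exceptions: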

Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream 10.196.153.215:50010 java.net.ConnectException: Connection timed out
Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_560654195674249927_1002
Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode 10.196.153.215:50010
Apr 18, 2013 10:41:09 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream 10.195.171.154:50010 java.net.ConnectException: Connection timed out
Apr 18, 2013 10:41:09 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_1747509888999401559_1002
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode 10.195.171.154:50010
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/testdir/myfile could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3510)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3373)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2589)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2829)

Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream processDatanodeError
WARNING: Error Recovery for block blk_1747509888999401559_1002 bad datanode[0] nodes == null
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream processDatanodeError
WARNING: Could not get block locations. Source file "/user/ubuntu/testdir/myfile" - Aborting...
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/testdir/myfile could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3510)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3373)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2589)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2829)

=================== separator ===============================
This is my Java code:

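import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

Path output = new Path("testdir");
Configuration conf = new Configuration();
// Point the client at the NameNode's public EC2 hostname
conf.set("fs.default.name", "hdfs://ec2-23-22-12-173.compute-1.amazonaws.com:9000");
conf.set("hadoop.job.user", "ubuntu");

// Create the target directory, then copy the local file into it
FileSystem fs = FileSystem.get(conf);
FileSystem.mkdirs(fs, output, FsPermission.valueOf("drwxr-xr-x"));
fs.copyFromLocalFile(new Path("./myfile"), output);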

=================== separator ===============================
P.S. I have opened ports 9000 and 50010 in the security group and turned off the Linux firewall.
Does anyone have any ideas?
Thanks.

vlju58qv  #1

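There can be several reasons behind this error:
1- The DataNodes are not up and running. Make sure that is not the case; if nothing obvious turns up, dig into the DN logs on each server.
2- The machines running the DNs have less free space than the amount you reserved via the "dfs.datanode.du.reserved" property.
3- There is actually no space left on the DN machines.
4- The path specified by "dfs.data.dir" in hdfs-site.xml has no space left (the disk backing dfs.data.dir may be full).
5- The DNs are unable to send heartbeats/block reports to the NN. Make sure there are no network-related issues.
HTH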
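
For the network-related point, one quick check is whether a given host and port accept TCP connections at all from the machine where the upload fails. The following is only a minimal sketch using plain Java sockets; the class name PortProbe, the method canConnect, and the 5-second timeout are made up for illustration and are not part of Hadoop. Port 50010 is the DataNode port that appears in the log above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hadoop): checks whether a TCP port accepts connections.
class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host could not be resolved
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java PortProbe <host> <port>
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        System.out.println(host + ":" + port + " reachable=" + canConnect(host, port, 5000));
    }
}
```

Running it from the local client against each DataNode address reported in the client log, and again from the NameNode, should show whether the problem is on the path between the client and the DataNodes or inside the cluster.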

zwghvu4y  #2

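Did you find an answer to this question? If not, here is the potential cause: your client is trying to reach the DataNodes on their private EC2 IP addresses (visible only inside the cluster) instead of their public IPs. You can verify this in the error log: the excluded DataNode addresses are private IPs, not public ones. However, I don't know how to overcome this; I have the same problem. For details, see the following link: http://www.hadoopinrealworld.com/could-only-be-replicated-to-0-nodes/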
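
The check described above can be done mechanically. Here is a rough sketch that pulls the "Excluding datanode" addresses out of the client log and reports whether each one is in a private range. The class and method names are hypothetical; the only library call it relies on is the standard InetAddress.isSiteLocalAddress(), which covers 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): finds the DataNode addresses the
// client gave up on and reports whether each one is a private IP.
class ExcludedDataNodeCheck {
    // Matches client log lines such as "INFO: Excluding datanode 10.196.153.215:50010"
    private static final Pattern EXCLUDED =
        Pattern.compile("Excluding datanode (\\d{1,3}(?:\\.\\d{1,3}){3}):(\\d+)");

    // Returns the IP addresses of all excluded DataNodes found in the log text.
    static List<String> excludedAddresses(String log) {
        List<String> result = new ArrayList<>();
        Matcher m = EXCLUDED.matcher(log);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    // True if the literal IPv4 address lies in a private (site-local) range.
    static boolean isPrivate(String ip) {
        try {
            // A literal IP address is parsed directly, without a DNS lookup
            return InetAddress.getByName(ip).isSiteLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP address: " + ip, e);
        }
    }

    public static void main(String[] args) {
        // Two lines taken from the log in the question
        String log = "INFO: Excluding datanode 10.196.153.215:50010\n"
                   + "INFO: Excluding datanode 10.195.171.154:50010\n";
        for (String ip : excludedAddresses(log)) {
            System.out.println(ip + " private=" + isPrivate(ip));
        }
    }
}
```

Both addresses from the question's log fall in 10.0.0.0/8, so they are reported as private, which is consistent with a client outside EC2 timing out when it tries to connect to them.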
