Only one of three datanodes starts in Hadoop?

3gtaxfhh  posted on 2021-06-03 in Hadoop

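I have one master node and three slave nodes. There is no virtual environment; all machines are on the same network. After running start-dfs.sh on the namenode, I see the namenode and one datanode start correctly, but the other two datanodes throw the following exception (identical apart from the IP addresses in the log). At any given time, only one of the three datanodes is working.
After stopping and restarting the cluster, still only one slave (datanode) is running at any given time, and which one it is seems random: sometimes slave 1, sometimes slave 2, sometimes slave 3.
I'm not sure what is going wrong here, but I see an UnregisteredDatanodeException in the log files of the datanodes that fail to start.
Here is the error message: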

2014-06-26 13:00:54,866 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2014-06-26 13:00:54,870 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2014-06-26 13:00:54,871 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2014-06-26 13:00:54,893 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2014-06-26 13:00:54,894 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2014-06-26 13:00:54,894 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2014-06-26 13:00:54,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(dig14.baker:50010, storageID=DS-1161815082-172.16.30.208-50010-1403746077641, infoPort=50075, ipcPort=50020)
2014-06-26 13:00:54,906 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished generating blocks being written report for 1 volumes in 0 seconds
2014-06-26 13:00:54,910 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 172.16.130.214:50010 is attempting to report storage ID DS-1161815082-172.16.30.208-50010-1403746077641. Node 172.16.130.213:50010 is expected to serve this storage.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:5049)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processBlocksBeingWrittenReport(FSNamesystem.java:3869)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.blocksBeingWrittenReport(NameNode.java:1109)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

This is my core-site.xml:

<configuration>
<property>
<name>fs.default.name</name>
 <value>hdfs://dig11:10001</value>
<!-- <value>hdfs://localhost:10001</value> -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/scratch/USERS/sridhar/hadoop</value>
</property>
</configuration>

wtzytmuj1#

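The namenode is rejecting a datanode because it presents storage that is already registered to a different datanode. In the log above, 172.16.130.214:50010 reports storage ID DS-1161815082-172.16.30.208-50010-1403746077641, but the namenode expects 172.16.130.213:50010 to serve that storage. This typically happens when several datanodes use the same data directory (for example, a path on a shared filesystem) or when a data directory was copied from one node to another.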
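To make the conflict in the exception explicit, here is a minimal Python sketch that pulls the reporting node, the storage ID, and the expected node out of the message. The message text is copied from the log in the question; the regular expression and the function name parse_conflict are illustrative assumptions, not part of Hadoop.

```python
import re

# Message text copied from the UnregisteredDatanodeException in the question's log.
MESSAGE = (
    "Data node 172.16.130.214:50010 is attempting to report storage ID "
    "DS-1161815082-172.16.30.208-50010-1403746077641. "
    "Node 172.16.130.213:50010 is expected to serve this storage."
)

# Hypothetical pattern written for this message format only.
PATTERN = re.compile(
    r"Data node (?P<reporting>\S+) is attempting to report storage ID "
    r"(?P<storage_id>\S+)\. "
    r"Node (?P<expected>\S+) is expected to serve this storage"
)


def parse_conflict(message):
    """Return the reporting node, storage ID and expected node, or None."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


conflict = parse_conflict(MESSAGE)
if conflict is not None:
    for key, value in conflict.items():
        print(f"{key}: {value}")
    # Two different nodes claiming one storage ID is what the namenode rejects.
    print("conflict:", conflict["reporting"] != conflict["expected"])
```

Running the same check against the logs of each failing datanode would show whether they all report the same storage ID, which would point to a shared or copied data directory.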
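If this is a test environment and you are willing to lose any existing data in HDFS, you can clean the datanode and namenode directories and reformat the DFS.
Delete all existing data in the directories configured by dfs.name.dir and dfs.data.dir (named dfs.namenode.name.dir and dfs.datanode.data.dir in Hadoop 2 and later).
Finally, reformat with hdfs namenode -format (hadoop namenode -format on Hadoop 1.x).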
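Before deleting anything, it can help to confirm which directories are actually in use. The following is a rough sketch, assuming the Hadoop 1.x defaults in which dfs.name.dir and dfs.data.dir fall back to ${hadoop.tmp.dir}/dfs/name and ${hadoop.tmp.dir}/dfs/data, and assuming hdfs-site.xml (not shown in the question) does not override them. The helper names are hypothetical, and the script only prints paths; it deletes nothing.

```python
import xml.etree.ElementTree as ET

# core-site.xml from the question (the commented-out value is omitted).
CORE_SITE = """<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://dig11:10001</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/scratch/USERS/sridhar/hadoop</value>
</property>
</configuration>"""


def read_properties(xml_text):
    """Map each <name> to its <value> in a Hadoop configuration file."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.findall("property")
    }


def default_dfs_dirs(props):
    """Assumed defaults: both directories live under hadoop.tmp.dir."""
    tmp_dir = props["hadoop.tmp.dir"].rstrip("/")
    return {"name": tmp_dir + "/dfs/name", "data": tmp_dir + "/dfs/data"}


dirs = default_dfs_dirs(read_properties(CORE_SITE))
print("namenode directory (on the master):", dirs["name"])
print("datanode directory (on each slave):", dirs["data"])
```

If the data path resolves to the same physical directory on all three slaves, for instance because /scratch is a network mount, every datanode would read the same storageID from current/VERSION in that directory, which would match the exception in the question. In that case, pointing each node at its own local directory matters more than reformatting.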
