Hadoop DataNode fails to start

f45qwnt8 · asked 2021-05-30 · Hadoop
Follow (0) | Answers (4) | Views (480)

I have a Hadoop cluster consisting of 11 nodes. One node acts as the master, and 10 slave nodes run the DataNode and TaskTracker daemons.
The TaskTracker started on every slave node, but the DataNode came up on only 6 of the 10. The following is from the /hadoop/logs/...Datanode....log file of one of the failing slaves.

2014-12-03 17:55:05,057 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = trans9/192.168.0.16
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_65
************************************************************/
2014-12-03 17:55:05,371 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-12-03 17:55:05,384 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2014-12-03 17:55:05,385 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-12-03 17:55:05,385 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2014-12-03 17:55:05,776 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2014-12-03 17:55:05,789 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-12-03 17:55:08,850 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2014-12-03 17:55:08,865 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened data transfer server at 50010
2014-12-03 17:55:08,867 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2014-12-03 17:55:08,876 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-12-03 17:55:08,962 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-12-03 17:55:09,055 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2014-12-03 17:55:09,068 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2014-12-03 17:55:09,068 INFO org.mortbay.log: jetty-6.1.26
2014-12-03 17:55:09,804 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2014-12-03 17:55:09,812 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2014-12-03 17:55:09,813 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2014-12-03 17:55:09,893 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2014-12-03 17:55:09,894 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2014-12-03 17:55:09,895 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2014-12-03 17:55:09,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(slave9:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020)
2014-12-03 17:55:09,914 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished generating blocks being written report for 1 volumes in 0 seconds
2014-12-03 17:55:09,933 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished asynchronous block report scan in 5ms
2014-12-03 17:55:09,933 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/home/ubuntu/hadoop/hadoop-data/dfs/data/current'}
2014-12-03 17:55:09,945 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-12-03 17:55:09,946 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2014-12-03 17:55:09,946 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2014-12-03 17:55:09,955 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2014-12-03 17:55:09,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2014-12-03 17:55:09,959 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2014-12-03 17:55:10,140 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 192.168.0.16:50010 is attempting to report storage ID DS-551911532-192.168.0.31-50010-1417617118848. Node 192.168.0.14:50010 is expected to serve this storage.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:5049)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:3939)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:1095)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
        at org.apache.hadoop.ipc.Client.call(Client.java:1113)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy3.blockReport(Unknown Source)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1084)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1588)
        at java.lang.Thread.run(Thread.java:745)
2014-12-03 17:55:10,144 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50075
2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: exiting
2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: exiting
2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: exiting
2014-12-03 17:55:10,148 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50020
2014-12-03 17:55:10,148 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2014-12-03 17:55:10,149 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2014-12-03 17:55:10,149 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:248)
        at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:100)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:132)
        at java.lang.Thread.run(Thread.java:745)
2014-12-03 17:55:10,149 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting DataXceiveServer
2014-12-03 17:55:10,149 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2014-12-03 17:55:10,150 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
2014-12-03 17:55:10,151 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down
2014-12-03 17:55:10,151 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/home/ubuntu/hadoop/hadoop-data/dfs/data/current'}
2014-12-03 17:55:10,152 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=DataNodeInfo
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=DataNodeInfo
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
        at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.unRegisterMXBean(DataNode.java:586)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:855)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1601)
        at java.lang.Thread.run(Thread.java:745)
2014-12-03 17:55:10,152 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2014-12-03 17:55:10,152 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2014-12-03 17:55:10,153 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2014-12-03 17:55:10,153 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=FSDatasetState-DS-551911532-192.168.0.31-50010-1417617118848
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-DS-551911532-192.168.0.31-50010-1417617118848
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
        at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:2093)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:917)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1601)
        at java.lang.Thread.run(Thread.java:745)
2014-12-03 17:55:10,159 WARN org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: AsyncDiskService has already shut down.
2014-12-03 17:55:10,159 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-12-03 17:55:10,166 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at trans9/192.168.0.16
************************************************************/
wsxa1bj1 #1

Since your cluster has only a dozen nodes or so, it is fine to run the Namenode and the JobTracker on a single node. The Secondary Namenode, however, should run on a different node, because it needs more memory to perform its periodic checkpoints.
Coming to your problem:
It could be a conflict caused by copied configuration files. The same question has been answered here.
You can try copying a working datanode's configuration over, with the appropriate changes.
If there is no data on the cluster, you can format the namenode; stop all daemons before doing so, then restart them (see the sketch below).
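
If you do take the format route, here is a minimal sketch using the stock Hadoop 1.x control scripts. This is destructive and only safe on a cluster that holds no data, since formatting wipes all HDFS metadata:

    # Run on the master. Only do this on a cluster with no data in HDFS!
    stop-all.sh               # stop all HDFS and MapReduce daemons
    hadoop namenode -format   # re-initialize the NameNode (destroys HDFS metadata)
    start-all.sh              # bring all daemons back up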
Hope it helps!

aiqt4smr #2

This is caused by an incompatible cluster ID, so format the datanode directory and start again.
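
You can confirm the identity clash on a failing slave by inspecting the VERSION file in the data directory. A minimal sketch, assuming the Hadoop 1.x on-disk layout and the dirpath shown in the log above:

    # Print the storage identity this DataNode reports to the NameNode.
    # The path assumes dfs.data.dir=/home/ubuntu/hadoop/hadoop-data/dfs/data, as in the log.
    cat /home/ubuntu/hadoop/hadoop-data/dfs/data/current/VERSION
    # Expected contents include lines such as:
    #   namespaceID=...
    #   storageID=DS-551911532-192.168.0.31-50010-1417617118848
    # If two slaves print the same storageID, their data directories were cloned,
    # which is exactly what the UnregisteredDatanodeException above complains about.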

uubf1zoe #3

The solution is as follows, and it will not cause any data loss as long as you have set the replication factor to greater than 1. First check which services are running on the box; in this case, a TaskTracker is running alongside the DataNode. Stop the TaskTracker by running hadoop-daemon.sh stop tasktracker. Once that is done, go to the location you configured under the dfs.data.dir property and delete all files/folders there. Then run:

    hadoop-daemon.sh start datanode
    hadoop-daemon.sh start tasktracker

This starts the DataNode and the TaskTracker. If you do this successfully, go to the NameNode and run hadoop dfsadmin -refreshNodes. A consolidated sketch of the whole sequence follows below.
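
For reference, a minimal end-to-end sketch of the above on one affected slave, assuming the Hadoop 1.x scripts are on the PATH. DATA_DIR is an assumption taken from the dirpath in the log; set it to your own dfs.data.dir value:

    # Run on the failing slave (e.g. trans9).
    DATA_DIR=/home/ubuntu/hadoop/hadoop-data/dfs/data
    hadoop-daemon.sh stop tasktracker    # stop the TaskTracker first
    hadoop-daemon.sh stop datanode       # make sure no DataNode is left running
    rm -rf "$DATA_DIR"/*                 # remove the cloned block storage; HDFS re-replicates it (replication > 1)
    hadoop-daemon.sh start datanode      # the DataNode re-registers with a freshly generated storage ID
    hadoop-daemon.sh start tasktracker

    # Then, on the NameNode host:
    hadoop dfsadmin -refreshNodes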

ndasle7k #4

Oh, and putting the name node on one system while assigning everything else as data nodes is not recommended. I would suggest one system for the name node, another for the job tracker, and another for the secondary name node; the remaining eight systems are strongly recommended to be data nodes.
Coming to your question: format the name node to get rid of this issue. Also reopen the terminal; at times the network is one of the causes as well.
