hadoop 2节点群集ui显示1个活动节点

rks48beu  于 2021-06-02  发布在  Hadoop
关注(0)|答案(2)|浏览(422)

我正在尝试配置hadoop 2.7 2节点群集。当我使用start-dfs.sh和start-yarn.sh启动hadoop时,主服务器和从服务器上的所有服务都会完美启动。

Here is my jps command on  my master:
23913 Jps
22140 SecondaryNameNode
22316 ResourceManager
22457 NodeManager
21916 DataNode
21777 NameNode

Here is my jps command on  my slave:
17223 Jps
14225 DataNode
14363 NodeManager

但如果我看到hadoop集群ui,它只显示1个活动数据节点。

Here is the dfs admin report : /bin/hdfs dfsadmin -report
Live datanodes (1):

Name: 192.168.1.104:50010 (nn1.cluster.com)
Hostname: nn1.cluster.com
Decommission Status : Normal
Configured Capacity: 401224601600 (373.67 GB)
DFS Used: 237568 (232 KB)
Non DFS Used: 48905121792 (45.55 GB)
DFS Remaining: 352319242240 (328.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 87.81%

我可以在所有机器上使用ssh。

Here is the sample of name node logs(i.p = 192.168.1.104) : 
2016-07-12 01:17:34,293 INFO BlockStateChange: BLOCK* processReport: from storage DS-d9ed40cf-bd5d-4033-a6ca-14fb4a8c3587 node DatanodeRegistration(192.168.1.104:50010, datanodeUuid=b702b518-5daa-4fa1-8e69-e4d620a72470, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-e86d0353-9f33-495b-88fa-16035abd3672;nsid=616310490;c=0), blocks: 24, hasStaleStorage: false, processing time: 0 msecs
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* registerDatanode: from DatanodeRegistration(192.168.1.104:50010, datanodeUuid=37038a9f-23ac-42e2-abea-bdf356aaefbe, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-e86d0353-9f33-495b-88fa-16035abd3672;nsid=616310490;c=0) storage 37038a9f-23ac-42e2-abea-bdf356aaefbe
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: BLOCK* registerDatanode: 192.168.1.104:50010
2016-07-12 01:17:35,501 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/192.168.1.104:50010
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0
2016-07-12 01:17:35,502 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.1.104:50010
2016-07-12 01:17:35,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0
2016-07-12 01:17:35,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Adding new storage ID DS-495b6b0e-f1fc-407c-bb9f-6c314c2fdaec for DN 192.168.1.104:50010

here is the sample datanode logs (i.p = 192.168.1.104): 
2016-07-12 02:02:12,044 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state
2016-07-12 02:02:12,045 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN
2016-07-12 02:02:12,047 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN
2016-07-12 02:02:12,050 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0x236119eb3082,  containing 1 storage report(s), of which we sent 1. The reports had 24 total blocks and used 1 RPC(s). This took 0 msec to generate and 1 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-07-12 02:02:12,050 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934
2016-07-12 02:02:15,049 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state
2016-07-12 02:02:15,052 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN
2016-07-12 02:02:15,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN
2016-07-12 02:02:15,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0x2361cd4be40d,  containing 1 storage report(s), of which we sent 1. The reports had 24 total blocks and used 1 RPC(s). This took 0 msec to generate and 2 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-07-12 02:02:15,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934

Here is the sample of 2nd datanode logs(ip :192.168.35.128)
2016-07-12 11:45:07,346 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state
2016-07-12 11:45:07,349 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN
2016-07-12 11:45:07,355 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN
2016-07-12 11:45:07,364 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb0de42ec7c,  containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 4 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-07-12 11:45:07,364 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934
2016-07-12 11:45:10,360 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state
2016-07-12 11:45:10,363 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN
2016-07-12 11:45:10,370 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN
2016-07-12 11:45:10,377 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb191ea9cb9,  containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 3 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-07-12 11:45:10,377 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934
2016-07-12 11:45:13,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state
2016-07-12 11:45:13,380 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN
2016-07-12 11:45:13,385 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN
2016-07-12 11:45:13,395 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb245b893c4,  containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 5 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-07-12 11:45:13,396 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934

为什么会这样?非常感谢你的帮助!

cwtwac6a

cwtwac6a1#

找到解决办法了。如果从属节点/数据节点单独处于活动状态,并且在报告中,如果报告hadoop dfsadmin-report…上没有显示它,则存在通信问题。从数据节点到主节点的通信不可用。从技术上讲,防火墙是有问题的。主节点中的防火墙正在阻止通信。
“我们必须停止主机中的防火墙连接”/允许特定ip访问主机”
要在centos中停止防火墙,请在centos中执行以下命令

service iptables save
service iptables stop
chkconfig iptables off
xfyts7mz

xfyts7mz2#

找到解决办法了。问题出现在namenode ipaddress中。我曾考虑过wlan0接口的IP地址不断变化。因为我安装了vmware workstation,所以考虑过vmnet接口的IP地址,该接口是静态IP地址,并且在更改live node后显示2而不是1。

相关问题