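I created a cluster with one master node and three slave nodes using Cloudera CDH 5.8.0. After some configuration work I got every service healthy except one: HBase. A few minutes after a restart, its health turns bad.
The error shown by Cloudera Manager is: "Bad : Master summary: This health test is bad because the Service Monitor did not find an active Master". I checked the Service Monitor log and found the following warning: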
(7 skipped) Exception in doWork for task: hbase_HBASE_SERVICE_STATE_TASK
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
Thu Dec 15 09:38:30 CET 2016, RpcRetryingCaller{globalStartTime=1481791110299, pause=100, retries=1}, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2303)
at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:782)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55652)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3678)
at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2382)
at com.cloudera.cmf.cdh5client.hbase.HConnectionImpl.getClusterStatus(HConnectionImpl.java:50)
at com.cloudera.cmon.firehose.polling.hbase.HbaseServiceState.update(HbaseServiceState.java:158)
at com.cloudera.cmon.firehose.polling.hbase.HbaseServiceStateFetcher.doWork(HbaseServiceStateFetcher.java:42)
at com.cloudera.cmon.firehose.polling.AbstractHConnectionClientTask.doWorkWithClientConfig(AbstractHConnectionClientTask.java:95)
at com.cloudera.cmon.firehose.polling.AbstractHConnectionClientTask.doWorkWithClientConfig(AbstractHConnectionClientTask.java:26)
at com.cloudera.cmon.firehose.polling.AbstractCdhWorkUsingClientConfigs.doWork(AbstractCdhWorkUsingClientConfigs.java:45)
at com.cloudera.cmon.firehose.polling.CdhTask$InstrumentedWork.doWork(CdhTask.java:230)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.runTask(ImpersonatingTaskWrapper.java:72)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.access$000(ImpersonatingTaskWrapper.java:21)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper$1.run(ImpersonatingTaskWrapper.java:107)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at com.cloudera.cmf.cdh5client.security.UserGroupInformationImpl.doAs(UserGroupInformationImpl.java:41)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.doWork(ImpersonatingTaskWrapper.java:103)
at com.cloudera.cmf.cdhclient.CdhExecutor$1.call(CdhExecutor.java:125)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2303)
at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:782)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55652)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1219)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:46458)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$5.getClusterStatus(ConnectionManager.java:2027)
at org.apache.hadoop.hbase.client.HBaseAdmin$28.call(HBaseAdmin.java:2386)
at org.apache.hadoop.hbase.client.HBaseAdmin$28.call(HBaseAdmin.java:2382)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 22 more
Is there a way to fix this? I have set the master node as the HBase Master and the slave nodes as HBase RegionServers.
2 answers
fv2wmkja1#
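The problem was in the Cloudera management monitoring service, not in HBase itself. What I did was restart the Cloudera management monitoring service and then restart HBase. After that, everything seemed fine.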
ckocjqey2#
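Check the HBase Master's log file. In my case, HBase did not have sufficient permissions to operate on the "/tmp" directory in HDFS, so I changed them and the problem went away.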
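To make the permission check above concrete, here is a minimal Python sketch that reads the mode of an HDFS directory through the standard `hdfs dfs -ls -d` command. It assumes the `hdfs` CLI is on the PATH of the user running it. The helper names (`parse_ls_line`, `others_can_write`, `check_hdfs_dir`) are made up for illustration, and treating "others can write" as the condition to look for is an assumption, since the answer does not say which permission was changed.

```python
import subprocess


def parse_ls_line(line):
    """Split one line of `hdfs dfs -ls -d` output into (permissions, owner, group)."""
    fields = line.split()
    # Expected layout: perms, replication, owner, group, size, date, time, path
    if len(fields) < 8:
        raise ValueError(f"unexpected ls output: {line!r}")
    return fields[0], fields[2], fields[3]


def others_can_write(permissions):
    """Return True if the 'other' class has write access (e.g. drwxrwxrwt)."""
    # Index 0 is the file type; indexes 7..9 are the r/w/x bits for 'other'.
    return len(permissions) >= 10 and permissions[8] == "w"


def check_hdfs_dir(path="/tmp"):
    """Run `hdfs dfs -ls -d <path>` and report the directory's mode and owner.

    Assumption: the caller is allowed to list the path in HDFS.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-d", path],
        capture_output=True, text=True, check=True,
    )
    # The last non-empty line is the entry for the directory itself.
    line = result.stdout.strip().splitlines()[-1]
    permissions, owner, group = parse_ls_line(line)
    return {
        "permissions": permissions,
        "owner": owner,
        "group": group,
        "others_can_write": others_can_write(permissions),
    }
```

Calling `check_hdfs_dir("/tmp")` on a cluster node returns a small dictionary. If `others_can_write` is false and the owner is not the HBase user, that would be consistent with the situation described above, though the HBase Master log remains the authoritative place to confirm it.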