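I have seen a similar question about Kafka's ZooKeeper instance, but it still has no answer.
So here is an extended version of my question, with more details.
Environment: the business application has 3 nodes. Each application node embeds its own ZooKeeper node and its own Kafka node.
To head off confusion, I should clarify. My business application is built on top of Elasticsearch with 3 nodes and minimum_master_nodes=2, so my application tolerates the loss of 1 node in the cluster. I therefore assumed that, in the same way, I could embed one ZooKeeper instance and one Kafka instance into each application node. The overall goal is to build inter-datacenter data replication for the business application on top of this stack, using Kafka MirrorMaker, with a fault tolerance of 1.
In my experiment I do not use the full business application stack, only ZooKeeper + Kafka in each application node. Each application writes its own log to the console, so I can determine which application has started ZooKeeper in leader mode.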
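Besides reading the console logs, a server's role can also be checked from outside with ZooKeeper's four-letter-word `srvr` command, whose response contains a `Mode:` line. The following is a minimal sketch under assumptions: the helper names `parse_mode` and `query_mode` are hypothetical, and the command must be enabled on the server (newer ZooKeeper versions restrict four-letter words through a whitelist). A server that has lost quorum typically answers that it is not serving requests, in which case no `Mode:` line is present and the sketch returns `None`.

```python
import socket
from typing import Optional


def parse_mode(srvr_response: str) -> Optional[str]:
    """Extract the value of the 'Mode:' line from an 'srvr' response, if any."""
    for line in srvr_response.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None  # no Mode line, e.g. the server is not serving requests


def query_mode(host: str, port: int, timeout: float = 2.0) -> Optional[str]:
    """Send the 'srvr' four-letter word to a client port and return the mode."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", errors="replace"))
```

With the client ports from this post, calling `query_mode("localhost", 2182)` against a running ensemble would be expected to return a role such as `leader` or `follower`.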
My ZooKeeper ensemble configuration is:
server.1=localhost:2668:3668
server.2=localhost:2669:3669
server.3=localhost:2670:3670
syncLimit=5
initLimit=10
clientPort=* # each node has its own port number: 2182, 2183, 2184 for servers 1, 2, 3 respectively
dataDir=D:\rtest\3-nodes\data\*\zoo # * is 1, 2, 3 for servers 1, 2, 3 respectively
dataLogDir=D:\rtest\3-nodes\data\*\zoo\log # * is 1, 2, 3 for servers 1, 2, 3 respectively
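To make the `*` placeholders above concrete, here is a small sketch that renders the settings for one server. It is an illustration only: `render_zoo_cfg` and the lookup tables are hypothetical names, and the post does not say how the embedded nodes actually receive their configuration. The sketch writes paths with forward slashes, on the assumption that the text may be loaded through `java.util.Properties`, where a backslash acts as an escape character.

```python
from pathlib import PureWindowsPath

# Values copied from the configuration in the post.
SERVERS = {1: (2668, 3668), 2: (2669, 3669), 3: (2670, 3670)}
CLIENT_PORTS = {1: 2182, 2: 2183, 3: 2184}
BASE_DIR = PureWindowsPath(r"D:\rtest\3-nodes\data")


def render_zoo_cfg(server_id: int) -> str:
    """Render the configuration text for one server (hypothetical helper)."""
    data_dir = BASE_DIR / str(server_id) / "zoo"
    lines = [
        f"server.{sid}=localhost:{peer_port}:{election_port}"
        for sid, (peer_port, election_port) in sorted(SERVERS.items())
    ]
    lines += [
        "syncLimit=5",
        "initLimit=10",
        f"clientPort={CLIENT_PORTS[server_id]}",
        # as_posix() emits forward slashes, avoiding backslash escapes
        f"dataDir={data_dir.as_posix()}",
        f"dataLogDir={(data_dir / 'log').as_posix()}",
    ]
    return "\n".join(lines) + "\n"


print(render_zoo_cfg(1))
```

Note that a ZooKeeper server also identifies itself through a `myid` file in its `dataDir` containing its server number; this sketch does not create it.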
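My failure scenario is:
2.1. Start all three application nodes. Start a consumer (console output). Start an application that produces a sequence of messages. Make sure the consumer receives the messages through the Kafka cluster.
2.2. Kill the application whose ZooKeeper instance is the leader (server #3 in my case).
2.3. Observe that the consumer no longer outputs any new messages from the Kafka topic.
In my opinion, the problem is with ZooKeeper. It looks like the surviving ZooKeeper servers keep trying to reach the dropped server instead of agreeing on a quorum between themselves. By the way, in this state I cannot even use ZooKeeper through the console client (more precisely, I can connect, but on the first command, say "ls /", the console client crashes with an exception). Below are log excerpts produced by the live nodes 1 and 2.
Server 1: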
15459 [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2182] WARN org.apache.zookeeper.server.quorum.Learner - Exception when following the leader
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
15460 [Thread-3-SendThread(127.0.0.1:2184)] WARN org.apache.zookeeper.ClientCnxn - Session 0x354b9dbe0b90001 for server 127.0.0.1/127.0.0.1:2184, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15459 [Thread-3-SendThread(0:0:0:0:0:0:0:1:2184)] WARN org.apache.zookeeper.ClientCnxn - Session 0x354b9dbe0b90000 for server 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2184, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15459 [RecvWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Connection broken for id 3, my id = 1, error = java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)
15462 [RecvWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Interrupting SendWorker
15462 [SendWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)
15462 [SendWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Send worker leaving thread
15766 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182] WARN org.apache.zookeeper.server.NIOServerCnxn - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
16481 [WorkerSender[myid=1]] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Cannot open channel to 3 at election address localhost/127.0.0.1:3670
java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
at java.lang.Thread.run(Unknown Source)
16596 [Thread-3-SendThread(127.0.0.1:2184)] WARN org.apache.zookeeper.ClientCnxn - Session 0x354b9dbe0b90000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
...
Server 2:
...
5118 [RecvWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Connection broken for id 3, my id = 2, error =
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)
5121 [RecvWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Interrupting SendWorker
5120 [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2183] WARN org.apache.zookeeper.server.quorum.Learner - Exception when following the leader
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
5119 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183] WARN org.apache.zookeeper.server.NIOServerCnxn - Exception causing close of session 0x254b9dbe0b20000 due to java.io.IOException: An existing connection was forcibly closed by the remote host
5122 [SendWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)
5123 [SendWorker:3] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Send worker leaving thread
5536 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183] WARN org.apache.zookeeper.server.NIOServerCnxn - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
6143 [WorkerSender[myid=2]] WARN org.apache.zookeeper.server.quorum.QuorumCnxManager - Cannot open channel to 3 at election address localhost/127.0.0.1:3670
java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
at java.lang.Thread.run(Unknown Source)
....
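By the way, an ensemble of 4 such nodes works perfectly for my requirements. So can anyone answer: can a 3-node ZooKeeper ensemble survive the death of one node? Or am I doing something wrong?
1 Answer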
A 3-node ensemble can lose 1 node, and a 5-node ensemble can lose 2. A similar question was asked here: ZooKeeper reliability - three versus five nodes
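The numbers above follow from the majority rule: an ensemble keeps serving only while more than half of its configured servers are up. A minimal sketch of that arithmetic (the function names `quorum_size` and `tolerated_failures` are illustrative, not part of any ZooKeeper API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(f"{n} servers: quorum={quorum_size(n)}, tolerated failures={tolerated_failures(n)}")
```

By this rule a 4-node ensemble needs 3 servers for a quorum, so it also tolerates only 1 failure, the same as a 3-node ensemble. A 3-node ensemble that stops working after losing one server is therefore not explained by the quorum size alone.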