Killing the node holding __consumer_offsets results in no messages being consumed at the consumer

7qhs6swi · posted 2021-06-08 in Kafka

I have a 3-node (node0, node1, node2) Kafka cluster (broker0, broker1, broker2) with a replication factor of 2, and ZooKeeper (the one packaged in the Kafka tar) running on a separate node (node4).
After starting ZooKeeper and the other nodes, I started broker 0. In broker 0's log I can see it loading __consumer_offsets, so it appears they are stored on broker 0. Sample log below:
Kafka version: kafka_2.10-0.10.2.0

[2017-06-30 10:50:47,381] INFO [GroupCoordinator 0]: Loading group metadata for console-consumer-85124 with generation 2 (kafka.coordinator.GroupCoordinator)
    [2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-41 in 23 milliseconds. (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-44 (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-44 in 5 milliseconds. (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-47 (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-47 in 11 milliseconds. (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-1 (kafka.coordinator.GroupMetadataManager)

I can also see GroupCoordinator messages in the same broker 0 log.

[2017-06-30 14:35:22,874] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-34472 with old generation 1 (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:35:22,877] INFO [GroupCoordinator 0]: Group console-consumer-34472 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-6612 with old generation 1 (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Group console-consumer-6612 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-30165 with old generation 1 (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Group console-consumer-30165 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
    [2017-06-30 14:43:15,656] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 3 milliseconds. (kafka.coordinator.GroupMetadataManager)
    [2017-06-30 14:53:15,653] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)

While testing the fault tolerance of the cluster with kafka-console-consumer.sh and kafka-console-producer.sh, I observed that when broker 1 or broker 2 is killed, the consumer still receives new messages from the producer, and rebalancing happens correctly.
However, killing broker 0 results in no consumption of new or old messages by any number of consumers. Below is the state of the topic before and after killing broker 0.
Before

Topic:test-topic    PartitionCount:3    ReplicationFactor:2 Configs:
    Topic: test-topic   Partition: 0    Leader: 2   Replicas: 2,0   Isr: 0,2
    Topic: test-topic   Partition: 1    Leader: 0   Replicas: 0,1   Isr: 0,1
    Topic: test-topic   Partition: 2    Leader: 1   Replicas: 1,2   Isr: 1,2

After

Topic:test-topic    PartitionCount:3    ReplicationFactor:2 Configs:
    Topic: test-topic   Partition: 0    Leader: 2   Replicas: 2,0   Isr: 2
    Topic: test-topic   Partition: 1    Leader: 1   Replicas: 0,1   Isr: 1
    Topic: test-topic   Partition: 2    Leader: 1   Replicas: 1,2   Isr: 1,2
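
Listings like the ones above can be reproduced with the standard topic describe tool; a sketch of the command, reusing the XXX host placeholder from the properties further down (point it at your ZooKeeper address):

    kafka-topics.sh --zookeeper XXX:2181 --describe --topic test-topic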

Here are the WARN messages seen in the consumer log after killing broker 0:

[2017-06-30 14:19:17,155] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-34472: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-30 14:19:10,542] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-30165: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

Broker properties. The remaining properties were left at their defaults.

broker.id=0
delete.topic.enable=true

auto.create.topics.enable=false
listeners=PLAINTEXT://XXX:9092
advertised.listeners=PLAINTEXT://XXX:9092
log.dirs=/tmp/kafka-logs-test1
num.partitions=3
zookeeper.connect=XXX:2181

Producer properties. The remaining properties were left at their defaults.

bootstrap.servers=XXX,XXX,XXX
compression.type=snappy

Consumer properties. The remaining properties were left at their defaults.

zookeeper.connect=XXX:2181
zookeeper.connection.timeout.ms=6000
group.id=test-consumer-group

As I understand it, if the node/broker that holds the GroupCoordinator and __consumer_offsets dies, the consumer cannot resume normal operation even though new leaders are elected for the partitions.
I saw something similar in another post, which suggested restarting the dead broker node. However, in a production environment, message consumption would be delayed until broker 0 is restarted, even though more nodes are available.
Q1: How can the situation above be mitigated?
Q2: Is there a way to move the GroupCoordinator / __consumer_offsets to another node?
Any suggestions/help would be appreciated.

pengsaosao 1#

Check the replication factor of the __consumer_offsets topic. If it is not 3, that is your problem.
Run the command kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets and see whether the first line of the output says "ReplicationFactor:1" or "ReplicationFactor:3".
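For illustration only (not the asker's actual output), describing an offsets topic that was created while only a single broker was alive typically starts with a line like the following, showing ReplicationFactor:1:

    Topic:__consumer_offsets    PartitionCount:50   ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer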
This is a common problem when you first set up a single node and create topics with a replication factor of 1. Later, when you scale out to 3 nodes, you forget to change the topic-level setting on this existing topic, so even though the topics you produce to and consume from are fault tolerant, the offsets topic is still stuck on broker 0 only.
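If it does show ReplicationFactor:1, the existing __consumer_offsets topic can be expanded with a partition reassignment. Below is a minimal sketch, assuming the default 50 partitions, broker ids 0, 1 and 2, and the same XXX:2181 ZooKeeper placeholder used in the question; adjust all of these to your cluster before running:

    # Build a reassignment plan that puts 3 replicas on every __consumer_offsets partition,
    # rotating the preferred leader across brokers 0, 1, 2 to keep leadership balanced.
    {
      echo '{"version":1,"partitions":['
      for p in $(seq 0 49); do
        a=$((p % 3)); b=$(((p + 1) % 3)); c=$(((p + 2) % 3))
        sep=','; [ "$p" -eq 49 ] && sep=''
        echo "  {\"topic\":\"__consumer_offsets\",\"partition\":$p,\"replicas\":[$a,$b,$c]}$sep"
      done
      echo ']}'
    } > increase-offsets-rf.json

    # Apply the plan, then check that the reassignment has completed.
    kafka-reassign-partitions.sh --zookeeper XXX:2181 \
        --reassignment-json-file increase-offsets-rf.json --execute
    kafka-reassign-partitions.sh --zookeeper XXX:2181 \
        --reassignment-json-file increase-offsets-rf.json --verify

Note that the broker setting offsets.topic.replication.factor only applies when the offsets topic is first created, and in this Kafka version (0.10.x) the topic is created with at most as many replicas as there are brokers alive at that moment, which is how starting a single broker first can leave it at 1.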
