I think this is related to the links below, but I don't understand them.
- https://issues.apache.org/jira/browse/KAFKA-6535
- https://issues.apache.org/jira/browse/KAFKA-6150
- Kafka Streams deleting consumed repartition records, to reduce disk usage
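For Kafka Streams internal topics such as the *-changelog topics, it is possible to provide topic configs (such as "retention.ms" and "cleanup.policy") so that useless logs get deleted.
But for internal topics such as the *-repartition topics, it is not possible to provide topic config values, even though the default "retention.ms" of a repartition topic is "-1", which means infinite retention. How can I delete or manage the repartition topics? Otherwise they will grow too large and could cause disk failures.
How do I manage repartition topics? What is purgeData? I can't find any explanation of it in the documentation.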
2 answers
w8f9ii691#
Fact
- retention.ms for the repartition topics is -1 by default and there's no way to override this value in kafka-streams client code.

What I misunderstood
- I assumed the repartition topics would grow without bound because retention.ms for the repartition topics is -1.

Fix misunderstanding
- There is a method maybeCommit in the StreamThread class.
- The maybeCommit method is called iteratively inside the loop that handles stream records.
- In the maybeCommit method (version 2.7.1), there's a comment like below:
  "try to purge the committed records for repartition topics if possible"
- So there should be no need to worry about retention.ms for the repartition topics.
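
To make the "purge the committed records" idea concrete, here is a toy model in plain Java. It is not Kafka Streams code: PartitionLog and commitAndPurge are hypothetical names invented for this sketch. As far as I understand, the real client issues a delete-records request through the admin client for offsets below the committed one; the sketch only imitates that effect on an in-memory log.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RepartitionPurgeSketch {

    /** Hypothetical stand-in for one repartition topic partition. */
    static final class PartitionLog {
        private final Deque<String> records = new ArrayDeque<>();
        private long logStartOffset = 0; // offset of the oldest retained record
        private long nextOffset = 0;     // offset the next appended record will get

        void append(String value) {
            records.addLast(value);
            nextOffset++;
        }

        /** Drops every record whose offset is lower than beforeOffset. */
        void deleteRecordsBefore(long beforeOffset) {
            long target = Math.min(beforeOffset, nextOffset);
            while (logStartOffset < target) {
                records.removeFirst();
                logStartOffset++;
            }
        }

        long size() {
            return records.size();
        }

        long logStartOffset() {
            return logStartOffset;
        }
    }

    /** After a commit, everything below the committed offset is no longer needed. */
    static void commitAndPurge(PartitionLog log, long committedOffset) {
        log.deleteRecordsBefore(committedOffset);
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        for (int i = 0; i < 5; i++) {
            log.append("record-" + i);
        }
        // Assume the downstream task has processed and committed offsets 0..2.
        commitAndPurge(log, 3L);
        System.out.println("retained records: " + log.size());
        System.out.println("log start offset: " + log.logStartOffset());
    }
}
```

The point of the model is that retention.ms never comes into play: the log shrinks because consumed records are deleted explicitly, not because they expire.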
Please leave a comment or correct this if I'm wrong.
1tuwyuhd2#
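I ran into the same problem with ksqlDB. The internal topics grew by terabytes of data within a few days, and their retention is infinite by default. We altered them to set retention.ms to some value other than infinite (-1), but everything broke after that. Today I ran the following command: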
set topic.retention.ms=3600000
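After that I created a table, and all of its internal topics were created with retention.ms set to 1 hour (3600000 ms) instead of infinite. Next week we will try this in the prd environment to see whether ksqlDB (0.28.2) evicts segments and everything keeps working. https://docs.confluent.io/platform/current/streams/developer-guide/config-streams.html#internal-topic-parameters Hope this helps.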
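
For a plain Kafka Streams application, the equivalent setting would presumably go into the streams properties. The sketch below is a minimal illustration that uses only java.util.Properties, so it runs without Kafka on the classpath. The "topic." prefix is the one described on the linked internal-topic-parameters page; the application id, the bootstrap server and the helper name withInternalTopicRetention are assumptions made up for the example.

```java
import java.time.Duration;
import java.util.Properties;

public class InternalTopicRetentionSketch {

    // Properties starting with "topic." are meant for the internal topics
    // that Kafka Streams creates (see the linked documentation page).
    static final String TOPIC_PREFIX = "topic.";

    /** Returns a copy of base with a retention override for internal topics. */
    static Properties withInternalTopicRetention(Properties base, Duration retention) {
        Properties props = new Properties();
        props.putAll(base);
        props.put(TOPIC_PREFIX + "retention.ms", Long.toString(retention.toMillis()));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.put("application.id", "my-streams-app");    // hypothetical value
        base.put("bootstrap.servers", "localhost:9092"); // hypothetical value

        Properties props = withInternalTopicRetention(base, Duration.ofHours(1));
        // In a real application these properties would be passed to KafkaStreams.
        System.out.println("topic.retention.ms=" + props.getProperty("topic.retention.ms"));
    }
}
```

Like the SET command above, this should only affect internal topics created after the setting is in place; topics that already exist keep their current configuration.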