Apache Kafka: garbage collection sometimes takes a very long time

tpgth1q7 posted on 2021-06-07 in Kafka

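In our Kafka broker setup, GC takes about 20 ms on average, but it randomly spikes to 1-2 seconds, and in extreme cases a single pause lasts 9 seconds. The spikes come at fairly random intervals, about 15 times a day on average. I tried GCeasy, but it gave no insight. Memory utilization is only 20%, yet the process still uses swap even though memory is available. Any suggestions on how to minimize this would be appreciated.
JVM options: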

-Xmx3G -Xms1G -server -XX:MetaspaceSize=96m -XX:+UseG1GC
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=50
-XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50
-XX:MaxMetaspaceFreeRatio=80

GC log:

2018-04-15T18:22:26.526+0530: 529212.538: [GC pause (G1 Evacuation Pause) (young), 9.6174677 secs]
       [Parallel Time: 9422.7 ms, GC Workers: 10]
          [GC Worker Start (ms): Min: 529212702.8, Avg: 529212736.2, Max: 529212778.8, Diff: 76.1]
          [Ext Root Scanning (ms): Min: 117.8, Avg: 228.8, Max: 320.4, Diff: 202.5, Sum: 2287.9]
          [Update RS (ms): Min: 83.7, Avg: 134.0, Max: 203.6, Diff: 119.9, Sum: 1339.8]
             [Processed Buffers: Min: 1, Avg: 5.3, Max: 11, Diff: 10, Sum: 53]
          [Scan RS (ms): Min: 390.7, Avg: 449.4, Max: 498.2, Diff: 107.6, Sum: 4493.6]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Object Copy (ms): Min: 8438.7, Avg: 8496.9, Max: 8532.1, Diff: 93.3, Sum: 84969.4]
          [Termination (ms): Min: 0.0, Avg: 18.8, Max: 21.5, Diff: 21.5, Sum: 187.9]
             [Termination Attempts: Min: 1, Avg: 3.8, Max: 9, Diff: 8, Sum: 38]
          [GC Worker Other (ms): Min: 2.5, Avg: 27.3, Max: 91.3, Diff: 88.8, Sum: 273.1]
          [GC Worker Total (ms): Min: 9312.4, Avg: 9355.2, Max: 9388.3, Diff: 75.9, Sum: 93551.7]
          [GC Worker End (ms): Min: 529222091.1, Avg: 529222091.3, Max: 529222091.7, Diff: 0.6]
       [Code Root Fixup: 1.9 ms]
       [Code Root Purge: 0.0 ms]
       [Clear CT: 12.1 ms]
       [Other: 180.8 ms]
          [Choose CSet: 0.0 ms]
          [Ref Proc: 18.0 ms]
          [Ref Enq: 0.0 ms]
          [Redirty Cards: 13.4 ms]
          [Humongous Register: 11.8 ms]
          [Humongous Reclaim: 0.2 ms]
          [Free CSet: 5.1 ms]
       [Eden: 768.0M(128.0M)->0.0B(128.0M) Survivors: 16.0M->16.0M Heap: 1775.9M(3072.0M)->1007.3M(3072.0M)]
     [Times: user=10.00 sys=15.22, real=9.63 secs]
Answer 1 (x8goxv8g):

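Your log shows that the per-worker times for root scanning and remembered-set work are low (hundreds of milliseconds each), but the collection as a whole takes a long time, and the Object Copy phase accounts for most of it (about 8.5 s per worker).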

[Parallel Time: 9422.7 ms, GC Workers: 10]
          [GC Worker Start (ms): Min: 529212702.8, Avg: 529212736.2, Max: 529212778.8, Diff: 76.1]
          [Ext Root Scanning (ms): Min: 117.8, Avg: 228.8, Max: 320.4, Diff: 202.5,**Sum: 2287.9**]
          [Update RS (ms): Min: 83.7, Avg: 134.0, Max: 203.6, Diff: 119.9,**Sum: 1339.8**]
             [Processed Buffers: Min: 1, Avg: 5.3, Max: 11, Diff: 10, Sum: 53]
          [Scan RS (ms): Min: 390.7, Avg: 449.4, Max: 498.2, Diff: 107.6,**Sum: 4493.6**]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
        **[Object Copy (ms): Min: 8438.7, Avg: 8496.9, Max: 8532.1, Diff: 93.3, Sum: 84969.4]**
          [Termination (ms): Min: 0.0, Avg: 18.8, Max: 21.5, Diff: 21.5, Sum: 187.9]
             [Termination Attempts: Min: 1, Avg: 3.8, Max: 9, Diff: 8, Sum: 38]
          [GC Worker Other (ms): Min: 2.5, Avg: 27.3, Max: 91.3, Diff: 88.8, Sum: 273.1]
          [GC Worker Total (ms): Min: 9312.4, Avg: 9355.2, Max: 9388.3, Diff: 75.9, Sum: 93551.7]
          [GC Worker End (ms): Min: 529222091.1, Avg: 529222091.3, Max: 529222091.7, Diff: 0.6]
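
To make the breakdown concrete, here is a minimal Python sketch that reads the per-worker phase lines quoted above and reports each phase's share of the average worker time. The helper names (`phase_averages`, `phase_shares`) are made up for this illustration, and the regular expression only assumes the JDK 8 style G1 log format shown in this thread.

```python
import re

# Per-worker phase lines copied from the log excerpt above
# (emphasis markers and the Start/End timestamp lines left out).
LOG = """\
[Ext Root Scanning (ms): Min: 117.8, Avg: 228.8, Max: 320.4, Diff: 202.5, Sum: 2287.9]
[Update RS (ms): Min: 83.7, Avg: 134.0, Max: 203.6, Diff: 119.9, Sum: 1339.8]
[Scan RS (ms): Min: 390.7, Avg: 449.4, Max: 498.2, Diff: 107.6, Sum: 4493.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 8438.7, Avg: 8496.9, Max: 8532.1, Diff: 93.3, Sum: 84969.4]
[Termination (ms): Min: 0.0, Avg: 18.8, Max: 21.5, Diff: 21.5, Sum: 187.9]
[GC Worker Other (ms): Min: 2.5, Avg: 27.3, Max: 91.3, Diff: 88.8, Sum: 273.1]
[GC Worker Total (ms): Min: 9312.4, Avg: 9355.2, Max: 9388.3, Diff: 75.9, Sum: 93551.7]
"""

# Captures the phase name and its "Avg:" value from one log line.
PHASE = re.compile(r"\[([A-Za-z ]+) \(ms\): .*?Avg: ([\d.]+)")


def phase_averages(log):
    """Map each phase name to its average per-worker time in ms."""
    return {m.group(1): float(m.group(2)) for m in PHASE.finditer(log)}


def phase_shares(averages):
    """Return each phase's fraction of the average 'GC Worker Total' time."""
    total = averages["GC Worker Total"]
    return {
        name: avg / total
        for name, avg in averages.items()
        if name != "GC Worker Total"
    }


if __name__ == "__main__":
    shares = phase_shares(phase_averages(LOG))
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {share:6.1%}")
```

For this pause, Object Copy comes out at roughly 91% of the average worker time, which matches the reading above that copying, rather than root or remembered-set scanning, is where the time goes.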

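This pulls the average collection time up. I suspect a few possible causes for this behavior:
G1HeapRegionSize=16M: this setting should take into account the number of available cores, the allowed number of GC worker threads, and so on. Unless you are very sure about it, it is better to let the JVM choose the value for you.
The Object Copy phase copies live objects out of the regions being collected (Eden and survivor) into survivor or old-generation regions. It looks like you are creating a large number of new objects in a short time. If possible, try increasing the heap size, especially the young generation. That reduces how often objects are copied from one survivor space to another.
Swapping could also cause this, although I cannot tell for sure from these logs. It is worth disabling swap to see whether it is the culprit.
If you have spare CPU capacity, you can try increasing the number of GC worker threads.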
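
One rough way to probe the swap suspicion is to look at the `[Times: ...]` line of the same pause in the question's log. The Python sketch below computes how many GC workers were actually on a CPU on average and how much of the CPU time was spent in the kernel. The function name `cpu_summary` and the returned keys are invented for this example, and how to read the numbers is a rule of thumb, not something the log proves.

```python
import re

# Matches the user/sys/real fields of a "[Times: ...]" GC log line.
TIMES = re.compile(r"user=([\d.]+)\s+sys=([\d.]+),\s+real=([\d.]+)")


def cpu_summary(times_line, workers):
    """Summarize CPU usage of one GC pause from its [Times: ...] line.

    workers is the "GC Workers" count reported for the same pause.
    """
    match = TIMES.search(times_line)
    if match is None:
        raise ValueError("no user/sys/real fields found")
    user, sys_time, real = (float(g) for g in match.groups())
    cpu = user + sys_time
    if real <= 0 or cpu <= 0 or workers <= 0:
        raise ValueError("times and worker count must be positive")
    return {
        # Average number of threads on a CPU during the pause.
        "busy_workers": cpu / real,
        # Fraction of the available worker time that was spent on a CPU.
        "utilization": cpu / (real * workers),
        # Fraction of CPU time spent in the kernel rather than in GC code.
        "sys_share": sys_time / cpu,
    }


if __name__ == "__main__":
    line = "[Times: user=10.00 sys=15.22, real=9.63 secs]"
    for key, value in cpu_summary(line, workers=10).items():
        print(f"{key}: {value:.2f}")
```

For the 9.6 s pause this gives about 2.6 busy workers out of 10 and about 60% of the CPU time in the kernel. A parallel young collection that is purely CPU-bound would normally show `user` close to `real` times the worker count and little `sys` time, so numbers like these are often taken as a hint that the GC threads were waiting on the operating system, with paging or swap as one possible cause. That fits the suggestion to disable swap and compare, but it does not confirm it.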
