Java GC CMS暂停时间随着时间的推移而增加



我们有一个问题,大约间隔8小时。。我们的节点GC暂停时间增加,最终获得其最长的暂停时间约0.8-1.2秒(一个完整的GC(。。然后它回落到约0.1秒或8小时后恢复。。周期

我们在2个核心框和4个核心框上一致地看到了~8小时的波形,4个核心盒的平均停顿时间往往较小,但峰值是相同的,即在我们的2个核心盒上同样糟糕。

很明显,当GC运行时,它在每次运行中都做得不够,因为我们继续看到堆/暂停时间在8小时内增加。。所以我的问题是,我们如何才能迫使GC循环更加有效。。而不是等到它需要一个长的完整gc周期。(我想把堆I减半…会迫使它更频繁地运行……希望能导致"一半"的暂停时间峰值……但这并不理想(

(顺便说一下,我们在java 8,4GB的盒子上运行这些jvm(。创建的对象通常是短暂的(系统本质上代表一个队列,有很多对象进入,有很多物体离开。通常在几秒钟内。它只是对象的恒定流动,cpu使用率"低"(。

我们的GC参数

-XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms3000M
-Xmx3000M
-Xmn1500M
-XX:+HeapDumpOnOutOfMemoryError
-Xss256k
-XX:StringTableSize=1000003
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB
-XX:+PerfDisableSharedMem
-XX:CompileCommandFile=./../conf/hotspot_compiler
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSWaitDuration=10000
-XX:+UseCondCardMark
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-Xloggc:/logs/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5
-XX:GCLogFileSize=100M

GC日志的2个事件片段显示了长GC(以及cms remark final上的长scrub字符串表(,我们看到的第二个事件我认为也是非常高的类卸载时间。。(我们不进行任何类的动态加载(。

2018-07-17T06:20:43.738+0000: 33759.362: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1154831K(1536000K)] 1165177K(2918400K), 0.0048510 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.743+0000: 33759.367: Total time for which application threads were stopped: 0.0082435 seconds
2018-07-17T06:20:43.743+0000: 33759.367: [CMS-concurrent-mark-start]
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-mark: 0.093/0.093 secs] [Times: user=0.11 sys=0.00, real=0.09 secs] 
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-preclean-start]
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time 2018-07-17T06:20:48.887+0000: 33764.511: [CMS-concurrent-abortable-preclean: 4.699/5.043 secs] [Times: user=5.58 sys=0.02, real=5.04 secs] 
2018-07-17T06:20:48.892+0000: 33764.517: [GC (CMS Final Remark) 
[YG occupancy: 89219 K (1382400 K)]
33764.517: [Rescan (parallel) , 0.0256303 secs]
33764.543: [weak refs processing, 0.0124976 secs]
33764.555: [class unloading, 0.0593605 secs]
33764.615: [scrub symbol table, 0.0028103 secs]
33764.617: [scrub string table, 0.3960129 secs]
[1 CMS-remark: 1154831K(1536000K)] 
1244051K(2918400K), 0.5055637 secs] 
[Times: user=0.58 sys=0.00, real=0.50 secs] 
2018-07-17T06:20:49.399+0000: 33765.023: Total time for which application threads were stopped: 0.5111839 seconds
2018-07-17T06:20:49.399+0000: 33765.023: [CMS-concurrent-sweep-start]
2018-07-17T06:20:49.408+0000: 33765.032: Total time for which application threads were stopped: 0.0034313 seconds
2018-07-17T06:20:49.411+0000: 33765.035: Total time for which application threads were stopped: 0.0022739 seconds
2018-07-17T06:20:49.415+0000: 33765.039: Total time for which application threads were stopped: 0.0022818 seconds
2018-07-17T06:20:49.418+0000: 33765.042: Total time for which application threads were stopped: 0.0020566 seconds
2018-07-17T06:20:49.420+0000: 33765.044: Total time for which application threads were stopped: 0.0017496 seconds
2018-07-17T06:20:49.423+0000: 33765.047: Total time for which application threads were stopped: 0.0022639 seconds
2018-07-17T06:20:49.425+0000: 33765.049: Total time for which application threads were stopped: 0.0016516 seconds
2018-07-17T06:20:49.427+0000: 33765.051: Total time for which application threads were stopped: 0.0018342 seconds
2018-07-17T06:20:49.429+0000: 33765.053: Total time for which application threads were stopped: 0.0015327 seconds
2018-07-17T06:20:49.430+0000: 33765.054: Total time for which application threads were stopped: 0.0015723 seconds
2018-07-17T06:20:49.432+0000: 33765.056: Total time for which application threads were stopped: 0.0015356 seconds
2018-07-17T06:20:49.433+0000: 33765.058: Total time for which application threads were stopped: 0.0014560 seconds
2018-07-17T06:20:49.436+0000: 33765.060: Total time for which application threads were stopped: 0.0019394 seconds
2018-07-17T06:20:49.437+0000: 33765.061: Total time for which application threads were stopped: 0.0015594 seconds
2018-07-17T06:20:49.460+0000: 33765.084: Total time for which application threads were stopped: 0.0028257 seconds
2018-07-17T06:20:50.464+0000: 33766.088: Total time for which application threads were stopped: 0.0031651 seconds
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-sweep: 1.269/1.303 secs] [Times: user=1.58 sys=0.00, real=1.31 secs] 
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-reset-start]
2018-07-17T06:20:50.706+0000: 33766.330: [CMS-concurrent-reset: 0.004/0.004 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2018-07-17T06:20:51.469+0000: 33767.093: Total time for which application threads were stopped: 0.0046566 seconds

2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
[YG occupancy: 56180 K (1382400 K)]
513596.243: [Rescan (parallel) , 0.0230426 secs]
513596.266: [weak refs processing, 0.0272667 secs]
513596.293: [class unloading, 0.1770894 secs]
513596.471: [scrub symbol table, 0.0101295 secs]
513596.481: [scrub string table, 0.3882333 secs]
[1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 
2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds

2018-07-22T19:37:55.241+0000: 513590.865: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1152442K(1536000K)] 1173594K(2918400K), 0.0309287 secs] [Times: user=0.02 sys=0.00, real=0.03 secs] 
2018-07-22T19:37:55.272+0000: 513590.896: Total time for which application threads were stopped: 0.0344867 seconds
2018-07-22T19:37:55.272+0000: 513590.896: [CMS-concurrent-mark-start]
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-mark: 0.170/0.170 secs] [Times: user=0.16 sys=0.00, real=0.17 secs] 
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-preclean-start]
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time 2018-07-22T19:38:00.615+0000: 513596.239: [CMS-concurrent-abortable-preclean: 3.434/5.164 secs] [Times: user=4.10 sys=0.11, real=5.16 secs] 
2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
[YG occupancy: 56180 K (1382400 K)]
513596.243: [Rescan (parallel) , 0.0230426 secs]
513596.266: [weak refs processing, 0.0272667 secs]
513596.293: [class unloading, 0.1770894 secs]
513596.471: [scrub symbol table, 0.0101295 secs]
513596.481: [scrub string table, 0.3882333 secs]
[1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 
2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds
2018-07-22T19:38:01.255+0000: 513596.880: [CMS-concurrent-sweep-start]
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-sweep: 0.953/0.953 secs] [Times: user=1.10 sys=0.00, real=0.96 secs] 
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-reset-start]
2018-07-22T19:38:02.213+0000: 513597.837: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2018-07-22T19:38:02.260+0000: 513597.884: Total time for which application threads were stopped: 0.0045430 seconds

看起来您正在插入太多的字符串。为什么你需要一个100万大小的实习生字符串表?那些英年早逝并且可以由年轻的收藏品清理的绳子不应该被扣留。只有最有可能在旧一代收藏中幸存下来的字符串才应该被扣留。或者,你需要节省这么多字符串的内存吗?

最新更新