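We have just taken delivery of a powerful 32-core AMD Opteron server with 128 GB of RAM. It has 2 x 6272 CPUs with 16 cores each. We are running a big, long-running Java task on 30 threads, with the NUMA optimisations for Linux and Java turned on. Our Java threads mainly use objects that are private to that thread, sometimes read memory that other threads will also read, and very occasionally write to or lock shared objects.
We can't explain why the CPU cores are 25% idle. Below is a dump of "top":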
    top - 23:06:38 up 1 day, 23 min,  3 users,  load average: 10.84, 10.27, 9.62
    Tasks: 676 total,   1 running, 675 sleeping,   0 stopped,   0 zombie
    Cpu(s): 64.5%us,  1.3%sy,  0.0%ni, 32.9%id,  1.3%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  132138168k total, 131652664k used,   485504k free,    92340k buffers
    Swap:   5701624k total,    230252k used,  5471372k free,  1344434k cached

    ...

    top - 22:37:39 up 23:54,  3 users,  load average: 7.83, 8.70, 9.27
    Tasks: 678 total,   1 running, 677 sleeping,   0 stopped,   0 zombie
    Cpu0  : 75.8%us,  2.0%sy,  0.0%ni, 22.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu1  : 77.2%us,  1.3%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu2  : 77.3%us,  1.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu3  : 77.8%us,  1.0%sy,  0.0%ni, 21.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu4  : 76.9%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu5  : 76.3%us,  2.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu6  : 12.6%us,  3.0%sy,  0.0%ni, 84.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu7  :  8.6%us,  2.0%sy,  0.0%ni, 89.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu8  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu9  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu10 : 77.6%us,  1.7%sy,  0.0%ni, 20.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu11 : 75.7%us,  2.0%sy,  0.0%ni, 21.4%id,  1.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu12 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu13 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu14 : 76.2%us,  2.6%sy,  0.0%ni, 15.9%id,  5.3%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu15 : 76.6%us,  2.0%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu16 : 73.6%us,  2.6%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu17 : 74.5%us,  2.3%sy,  0.0%ni, 23.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu18 : 73.9%us,  2.3%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu19 : 72.9%us,  2.6%sy,  0.0%ni, 24.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu20 : 72.8%us,  2.6%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu21 : 72.7%us,  2.3%sy,  0.0%ni, 25.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu22 : 72.5%us,  2.6%sy,  0.0%ni, 24.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu23 : 73.0%us,  2.3%sy,  0.0%ni, 24.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu24 : 74.7%us,  2.7%sy,  0.0%ni, 22.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu25 : 74.5%us,  2.6%sy,  0.0%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu26 : 73.7%us,  2.0%sy,  0.0%ni, 24.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu27 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu28 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu29 : 74.0%us,  2.0%sy,  0.0%ni, 24.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu30 : 73.2%us,  2.3%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu31 : 73.1%us,  2.0%sy,  0.0%ni, 24.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  132138168k total, 131711704k used,   426464k free,    88336k buffers
    Swap:   5701624k total,    229572k used,  5472052k free, 13745596k cached

      PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM     TIME+  COMMAND
    13865 root      20   0  122g 112g 3.1g S 2334.3 89.6  20726:49  java
    27139 jayen     20   0 15428 1728  952 S    2.6  0.0   0:04.21  top
    27161 sysadmin  20   0 15428 1712  940 R    1.0  0.0   0:00.28  top
       33 root      20   0     0    0    0 S    0.3  0.0   0:06.24  ksoftirqd/7
      131 root      20   0     0    0    0 S    0.3  0.0   0:09.52  events/0
     1858 root      20   0     0    0    0 S    0.3  0.0   1:35.14  kondemand/0
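A dump of the Java stacks confirms that none of the threads are anywhere near the few places where locks are used, nor are any of them near disk or network I/O.
I had trouble finding a clear explanation of what "top" means by "idle" versus "wait". My impression is that "idle" means "no more threads need to be run", but that doesn't make sense in our case. We are using `Executors.newFixedThreadPool(30)`. There are large numbers of tasks pending, and each task lasts for 10 seconds or so.
I suspect the explanation requires a good understanding of NUMA. Is "idle" the state you see when a CPU is waiting for a non-local memory access? If not, then what is the explanation?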
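It could be a number of things:
- It could be contention between the threads over access to shared data. This could take the form of lock contention, or of extra memory traffic due to read or write barriers, though the latter is unlikely to produce these symptoms.
- You could be leaking worker threads; e.g. they are occasionally dying and not being replaced.
- There could be a bottleneck in the executor itself; e.g. it might not be reacting fast enough to a task finishing by scheduling the next one.
- The bottleneck could be the garbage collector, especially if you don't have parallel collection enabled.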
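The leaked-worker and executor-bottleneck hypotheses above can be checked from inside the JVM. The following is a minimal sketch, not the asker's code: the class name `PoolProbe` and the helper `snapshot` are hypothetical, and the cast relies on `Executors.newFixedThreadPool` returning a `ThreadPoolExecutor`, which is how the standard JDK implements it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolProbe {
    /** One-line snapshot of how busy the pool is (hypothetical helper). */
    static String snapshot(ThreadPoolExecutor pool) {
        return String.format("poolSize=%d active=%d queued=%d completed=%d",
                pool.getPoolSize(),            // worker threads currently alive
                pool.getActiveCount(),         // approximate number running a task
                pool.getQueue().size(),        // tasks waiting for a worker
                pool.getCompletedTaskCount()); // approximate number finished
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: the fixed pool is a ThreadPoolExecutor (true for the standard JDK).
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(30);
        CountDownLatch release = new CountDownLatch(1);

        // Stand-in workload: 100 tasks that block until released.
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        System.out.println(snapshot(pool));

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(snapshot(pool));
    }
}
```

In the real application, logging such a snapshot periodically would show whether the pool still has 30 workers and whether tasks are queued while fewer than 30 are active; either observation would point towards the second or third item in the list.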
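This page describes the NUMA enhancements for Java, and mentions the switch that enables the NUMA-aware GC. Try it out. Also look at the other GC tuning advice on that page.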
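Before tuning the collector, it may help to measure how much wall-clock time it actually accounts for. The sketch below uses the standard `GarbageCollectorMXBean` API; the class name `GcShare`, the helper names and the 5-second sampling window are assumptions made for illustration. It has to run inside the JVM being measured, and `getCollectionTime()` is only an approximate accumulated elapsed time, so for concurrent collectors it is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShare {
    /** Accumulated collection time in milliseconds, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of a wall-clock interval that was spent in GC between two samples. */
    static double gcFraction(long gcMsBefore, long gcMsAfter, long wallMs) {
        if (wallMs <= 0) {
            return 0.0;
        }
        return (double) (gcMsAfter - gcMsBefore) / wallMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        Thread.sleep(5_000); // sampling window (arbitrary choice)

        long wallMs = (System.nanoTime() - start) / 1_000_000;
        double share = gcFraction(gcBefore, totalGcMillis(), wallMs);
        System.out.printf("GC share of wall-clock time: %.1f%%%n", 100 * share);

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

If the reported share is small, the garbage collector is unlikely to explain cores sitting 20 to 25% idle, and the other hypotheses deserve more attention.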
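This question explains the process states: In Linux, what does the "top" command mean?
I think the distinction between "wa" and "idle" time in the processor summary is that "wa" means the processor has threads in the "D" state, i.e. waiting for disk I/O. By contrast, a processor where all threads are waiting in the "S" state would be counted as "idle". (From that perspective, a thread that is waiting for a lock would be in the S state.)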
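On the Java side, a thread that is parked on a lock or a condition shows up as `BLOCKED`, `WAITING` or `TIMED_WAITING`, and the kernel generally sees such a thread as sleeping. A rough way to see whether the 30 workers are really runnable all the time is to sample their states repeatedly. This is a sketch under assumptions: the class `ThreadStateSampler` and the method `countStates` are hypothetical names, and the `"pool-"` prefix assumes the default thread factory, which names workers `pool-N-thread-M`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSampler {
    /** Counts live threads whose name starts with namePrefix, grouped by Java thread state. */
    static Map<Thread.State, Integer> countStates(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            if (info.getThreadName().startsWith(namePrefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: workers use the default names "pool-N-thread-M".
        for (int i = 0; i < 10; i++) {
            System.out.println(countStates("pool-"));
            Thread.sleep(1_000);
        }
    }
}
```

If the samples regularly show a handful of workers outside `RUNNABLE`, that would be consistent with idle time caused by waiting on locks or on the executor. Note that `RUNNABLE` in Java also covers threads blocked in native calls, so this is only a first approximation.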
You could also try `top -H`, which shows the threads separately.
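To match a thread shown by `top -H` with an entry in a Java thread dump, the thread ID from the PID column has to be found in the dump. HotSpot thread dumps of that era print the native thread ID in hexadecimal as `nid=0x...`; newer JDKs may print it differently, so treat the format as an assumption to verify against your own dump. A small sketch of the conversion, with a hypothetical class `NidLookup`:

```java
public class NidLookup {
    /** Formats a Linux thread ID (as shown by top -H) the way older HotSpot dumps print it. */
    static String nidFor(long linuxTid) {
        return "nid=0x" + Long.toHexString(linuxTid);
    }

    public static void main(String[] args) {
        // Usage: java NidLookup <tid> [<tid> ...]
        for (String arg : args) {
            long tid = Long.parseLong(arg);
            System.out.println(tid + " -> " + nidFor(tid));
        }
    }
}
```

Searching the thread dump for the printed string then identifies which Java thread a busy or idle line in `top -H` corresponds to.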