Difference Between mem_load_uops_retired.l3_miss and offcore



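I have an Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz (Haswell) processor. AFAIK, mem_load_uops_retired.l3_miss counts the number of demand (i.e., non-prefetch) data read accesses to DRAM. offcore_response.demand_data_rd.l3_miss.local_dram, as its name suggests, counts the number of demand data reads targeted at DRAM. So the two events seem to be equivalent (or at least almost the same). But according to the benchmarks below, the former event occurs much less often than the latter: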

1) Initializing a 1000-element global array in a loop in C:

Performance counter stats for '/home/ahmad/Simple Progs/loop':
1,363      mem_load_uops_retired.l3_miss                                   
1,543      offcore_response.demand_data_rd.l3_miss.local_dram                                   
0.000749574 seconds time elapsed
0.000778000 seconds user
0.000000000 seconds sys

2) Opening a PDF document in Evince:

Performance counter stats for '/opt/evince-3.28.4/bin/evince':
936,152      mem_load_uops_retired.l3_miss                                   
1,853,998      offcore_response.demand_data_rd.l3_miss.local_dram                                   
4.346408203 seconds time elapsed
1.644826000 seconds user
0.103411000 seconds sys

3) Running Wireshark for 5 seconds:

Performance counter stats for 'wireshark':
5,161,671      mem_load_uops_retired.l3_miss                                   
8,126,526      offcore_response.demand_data_rd.l3_miss.local_dram                                   
15.713828395 seconds time elapsed
0.904280000 seconds user
0.693906000 seconds sys

4) Running a blur filter on an image in Inkscape:

Performance counter stats for 'inkscape':
13,852,121      mem_load_uops_retired.l3_miss                                   
23,475,970      offcore_response.demand_data_rd.l3_miss.local_dram                                   
25.355643897 seconds time elapsed
7.244404000 seconds user
1.019895000 seconds sys

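In all four benchmarks, offcore_response.demand_data_rd.l3_miss.local_dram occurs more often than mem_load_uops_retired.l3_miss, and in the three larger ones roughly 1.6 to 2 times as often. Is this reasonable? And why? Please let me know if the benchmarks are too complex and coarse-grained!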
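To check that claim against the reported numbers, here is a minimal Python sketch that computes the ratio of the two counters for each benchmark. The counts are copied from the perf output above; the dictionary layout and the names `counts` and `ratios` are only illustrative.

```python
# Counter values copied from the perf stat output above, as
# (mem_load_uops_retired.l3_miss,
#  offcore_response.demand_data_rd.l3_miss.local_dram).
counts = {
    "loop": (1_363, 1_543),
    "evince": (936_152, 1_853_998),
    "wireshark": (5_161_671, 8_126_526),
    "inkscape": (13_852_121, 23_475_970),
}

# Ratio of the offcore demand-read count to the retired-load L3-miss count.
ratios = {name: offcore / retired for name, (retired, offcore) in counts.items()}

for name, ratio in ratios.items():
    print(f"{name:10s} {ratio:.2f}x")
```

The ratio is close to 1 for the small loop and approaches 2 only for Evince, so "almost twice" describes the upper end of the range rather than every run.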

As far as I currently know, the following table shows how the two events differ on Haswell:

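| | mem_load_uops_retired.l3_miss | offcore_response.demand_data_rd.l3_miss.local_dram |
|---|---|---|
| Cacheable retired load uops | Per line | Y |
| Cacheable non-retired load uops | N | Y |
| Uncacheable WC retired load uops | One event per line | N |
| Uncacheable UC retired load uops | May occur | N |
| Uncacheable WC or UC non-retired load uops | N | N |
| Locked loads to any memory type | May occur | I don't know |
| Legacy IO requests | May occur | N |
| L1D prefetches | N | Y |
| L2 prefetches to L2 and L3 | N | N |
| Software prefetches without intent to write | N | Y |
| | N | Y |
| Servicing unit | Any | Local DRAM |
| Reliability | May be unreliable | Reliable |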
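One way to read the table is to list the request types that only one of the two events counts. The sketch below is a hedged illustration of that reading, not part of the original answer: the row labels are English renderings of the table rows, the row whose label is missing is left out, and any cell other than "N" in the first column is treated as "may be counted".

```python
# Rows of the table as
# (request type, mem_load_uops_retired.l3_miss cell,
#  offcore_response.demand_data_rd.l3_miss.local_dram cell).
rows = [
    ("Cacheable retired load uops", "Per line", "Y"),
    ("Cacheable non-retired load uops", "N", "Y"),
    ("Uncacheable WC retired load uops", "One event per line", "N"),
    ("Uncacheable UC retired load uops", "May occur", "N"),
    ("Uncacheable WC or UC non-retired load uops", "N", "N"),
    ("Locked loads to any memory type", "May occur", "I don't know"),
    ("Legacy IO requests", "May occur", "N"),
    ("L1D prefetches", "N", "Y"),
    ("L2 prefetches to L2 and L3", "N", "N"),
    ("Software prefetches without intent to write", "N", "Y"),
]

# Request types counted by the offcore event but not by the retired-load event.
only_offcore = [name for name, retired, offcore in rows
                if retired == "N" and offcore == "Y"]

# Request types the retired-load event may count but the offcore event does not.
only_retired = [name for name, retired, offcore in rows
                if retired != "N" and offcore == "N"]

print("offcore only:", only_offcore)
print("retired only:", only_retired)
```

According to the table, the offcore-only group contains non-retired cacheable loads, L1D prefetches, and software prefetches, which is consistent with the offcore event being the larger of the two in the measurements.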