Hi,
I was trying to calculate the time my application spends in the cache by using the published latencies of 4 cycles for L1, 12 for L2, etc., along with the number of L1 and L2 loads that my application makes. I realized that since Sandy Bridge can issue 2 loads in one cycle, my results were wonky.
My question is: how do I go about calculating the time spent in cache alone, given that parallel loads to L1 are possible?
Regards,
Chinnappa
1 Reply
Hi Chinnappa,
Intel processors use what is known as out-of-order (OOO) execution. This means that many instructions are in flight at any given time. When an instruction misses in a cache level, or is delayed for any other reason, and has to wait, another instruction may be able to make progress. You therefore cannot assume that nothing else is happening while an instruction is waiting for a cache response.
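As a rough illustration of why the multiplication gives wonky results, here is a minimal Python sketch that brackets the answer between two extremes. The latencies and the two-loads-per-cycle figure are the ones quoted in the question; the function names and the load counts are hypothetical, and neither bound is a measurement of real cache time.

```python
# Figures quoted in the question (not measured here).
L1_LATENCY = 4        # cycles per L1 hit
L2_LATENCY = 12       # cycles per L2 hit
LOADS_PER_CYCLE = 2   # loads Sandy Bridge can issue per cycle


def serialized_cycles(l1_loads, l2_loads):
    """Pessimistic bound: every load waits for the previous one to finish."""
    return l1_loads * L1_LATENCY + l2_loads * L2_LATENCY


def issue_bound_cycles(l1_loads, l2_loads):
    """Optimistic bound: independent loads overlap completely, so only the
    issue rate limits them (assumption: no dependent loads at all)."""
    total = l1_loads + l2_loads
    return -(-total // LOADS_PER_CYCLE)  # ceiling division


if __name__ == "__main__":
    # Hypothetical counts, e.g. taken from performance counters.
    l1, l2 = 1000, 100
    print("serialized estimate:", serialized_cycles(l1, l2))
    print("issue-rate bound:   ", issue_bound_cycles(l1, l2))
```

The real cost lies somewhere between the two numbers and depends on how many loads are dependent on each other and on what other work the core overlaps with them, which is why counts multiplied by latencies cannot answer the question on their own.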
I suggest you read up on OOO (http://en.wikipedia.org/wiki/Out-of-order_execution) and follow the links there for additional details.
Once you have done that, you can browse the articles on the Platform Performance Monitoring portal (http://software.intel.com/en-us/articles/platform-monitoring/), which explain how to understand the specific latencies and bottlenecks that affect your programs.
I hope this helps,
Hussam