Hi.
I was reading up on CPU caches and memory and came across some Stack Overflow questions that seem to indicate unaligned memory access used to be slower on older Intel processors. I also found that newer Intel processors (I believe starting with Sandy Bridge) don't have this performance penalty when accessing unaligned memory.
Here is my understanding so far. Let's assume a 64-bit data bus, DRAM with 64 data pins (i.e., a word size of 64 bits), and a burst length of 8 with sequential burst type. Suppose the CPU wants to access 4 bytes of unaligned data at address 62 (i.e., bytes 62, 63, 64, 65) that are not in cache yet. Any Intel processor, old or new, will then need to do 2 read accesses from DRAM. The first access will return bytes 0 - 63 in a burst and the second will return bytes 64 - 127. This is one reason for the slowness regardless of CPU generation: 2 DRAM read accesses are required, as opposed to only one for an aligned access.
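To make the arithmetic in my example concrete, here is a small sketch of it. It only models the assumptions stated above (8-byte bus, burst length 8, so 64 bytes per burst); the names `BUS_BYTES`, `BURST_LENGTH` and `bursts_touched` are my own and hypothetical, and this is not a model of how a real memory controller schedules reads.

```python
# Assumptions taken from the example above, not from any datasheet.
BUS_BYTES = 8                            # 64-bit data bus
BURST_LENGTH = 8                         # transfers per burst
BURST_BYTES = BUS_BYTES * BURST_LENGTH   # 64 bytes returned per burst


def bursts_touched(addr, size):
    """Return the (first_byte, last_byte) range of every burst the access overlaps."""
    first = addr // BURST_BYTES
    last = (addr + size - 1) // BURST_BYTES
    return [(b * BURST_BYTES, (b + 1) * BURST_BYTES - 1)
            for b in range(first, last + 1)]


print(bursts_touched(62, 4))  # [(0, 63), (64, 127)]
print(bursts_touched(60, 4))  # [(0, 63)]
```

So under these assumptions the 4 bytes at address 62 overlap two 64-byte bursts, while an aligned 4-byte access at address 60 overlaps only one.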
Question 1: Is my assumption above correct that all CPUs, older or newer, have to do 2 memory accesses when the unaligned data from the example above is not in cache yet, and that this causes slowness (i.e., many extra memory cycles)?
If the data is already in cache, the 4-byte data starting at address 62 will be split across 2 cache lines of 64 bytes each.
Question 2: Is it just that older Intel processors did not have enough hardware support, so that reading the proper bytes from 2 cache lines and stitching them together took more clock cycles than an aligned access?
Question 3: On older Intel processors, would it also be slower to access unaligned memory that isn't split across multiple cache lines, for example 4-byte data at address 59 (i.e., bytes 59, 60, 61, 62), which lies within a single cache line? If so, why? (I am not sure whether this is related to a topic called "cache banks"; I am not familiar enough with it, so I might need some hints about that too if it is related.)
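To be clear about which cases I am distinguishing in Questions 2 and 3, here is a minimal sketch. It assumes 64-byte cache lines and treats "aligned" as natural alignment (address is a multiple of the access size); `LINE_BYTES` and `classify_access` are hypothetical names of mine, and the sketch says nothing about the actual cost of each case.

```python
LINE_BYTES = 64  # assumed cache line size


def classify_access(addr, size):
    """Classify an access by alignment and by whether it crosses a cache line."""
    if addr % size == 0:
        return "aligned"
    same_line = addr // LINE_BYTES == (addr + size - 1) // LINE_BYTES
    if same_line:
        return "unaligned, single cache line"
    return "unaligned, split across two cache lines"


for addr in (56, 59, 62):
    print(addr, classify_access(addr, 4))
```

Address 62 is the case in Question 2 (split across two lines) and address 59 is the case in Question 3 (unaligned, but inside one line).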
Question 4: Why is unaligned memory access from cache not slow on newer Intel processors?
It would be great if these questions could be answered in some detail, or if I could be pointed to resources that would help me find the actual answers.
Thanks.