Hello,
I would like to know how much data is fetched when I prefetch through a pointer with one of the PREFETCHh instructions (prefetcht0, prefetcht1, prefetcht2 or prefetchnta).
In the Intel 64 and IA-32 Architectures Software Developer's Manual, I read this:
"These instructions fetch 32 aligned bytes (or more, depending on the implementation) containing the addressed byte to a location in the cache hierarchy specified by the temporal locality hint."
So the minimum fetch size is 32 bytes, but how can I find out the actual size for a given implementation?
I need to know this because I am working on an image and want to prefetch several pixels around a given pixel, so I need to know how many prefetch instructions to issue.
Thanks, nicolas
I believe all modern processors use 64-byte cache lines. That is the granularity of a prefetch.
If you take a look at the cpuid instruction as documented in the Software Developer's Manual (http://www.intel.com/Assets/PDF/manual/253666.pdf), you will find that Intel's answer is rather complicated. The prefetch size may be 32, 64, or 128 bytes, and the only correct way to know is to ask cpuid. cpuid itself is crazy: I just spent 5 hours implementing the latest spec, and of course the overlap between AMD and Intel in that regard is marginal.
Anyway, all the systems I have access to, be they Intel or AMD, report a prefetch size of 64 bytes. As far as I know, only some Intel CPUs of family 15 actually had a prefetch size of 128 bytes, but I never had one of those. And 32 bytes probably dates from the Pentium II/III era...
On some CPUs (e.g. early P4), when alternate sector prefetch is active, prefetching a cache line could trigger the prefetch of the companion cache line, or 128 bytes in all. The linking of software and hardware prefetch was removed in more recent CPUs, AFAIK.
Thanks for your answer,
With this small program, I can confirm that the prefetch instruction fetches 64 bytes into the cache, at least as reported by CPUID on my machine.
The program:
#include <stdio.h>
int main(void)
{
/* Leaf 2 returns the cache, TLB and prefetch descriptors. */
unsigned int eax = 0x2, ebx, ecx = 0, edx;
/* Run CPUID in a single asm statement, so the compiler knows which
   registers are read and written. With separate asm statements the
   register contents are not guaranteed to survive in between. */
asm volatile ("cpuid" : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx));
printf("eax = 0x%08X, \nebx = 0x%08X, \necx = 0x%08X, \nedx = 0x%08X \n", eax, ebx, ecx, edx);
return 0;
}
The result on my X5670 is:
eax = 0x55035A01,
ebx = 0x00F0B2FF, <== bits 23-16 = F0 = Prefetch: 64-Byte prefetching*
ecx = 0x00000000,
edx = 0x00CA0000
*: Table 3-25, "Encoding of CPUID Leaf 2 Descriptors", in the Intel 64 and IA-32 Architectures Software Developer's Manual (Volume 2A: Instruction Set Reference, A-M)
Thanks, nicolas
