- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Foelsche,
It may not matter very much what the assembly code looks like depending on the size of the arrays and how random the accesses to the pDatat array become because ot the indirect indexing from the pFactorPositions array.
How big are the arrays involved? The worst case is if Pdatat is very large ( larger than last level cache) if pDatat[pFactorPosition[index]] more or less results in random memory accesses for the pDatat locations. Then it won't matter what the assembly looks like, you'll be mostly waiting on memory.
Pat
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Assuming that the pData and pFactorPositions and pData arrays are only of size 1000 elements, they should all fit in cache. If you wanted to see if the indirect addressing is causing havoc you could try setting pFactorPositions to have sequential values.
But I'm guessing that you are getting 1 or more instruction executed per clocktick.
Do you have any CPI or IPC stats for the loop?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
As Pat said it possibly does not maatter what assembly looks like.There is dependency on the indirect array indexing.Becuse of this data prefetching could not expect data spatial locality.
Sorry my mistake source array pData[iSource] has spatial locality because of lineary increased index.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page