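{
char *n = new char[s];  // [s] is assumed: the original line is truncated after "new char"
for (j = 0; j < 1000; j++)
    for (i = 0; i < 10000000; i++)
        m += n[((i + j) * PAR + i) & (s - 1)];
}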
Unfortunately, overflow is not the problem. You can get rid of the + (just put m=.... in my code); it changes nothing. The program still runs very slowly.
Alex
If you look at how your code advances through memory, you will find:
PAR = 8
0, 9, 18, 27, 36, ...
PAR = 128
0, 129, 258, 387, ...
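The offsets listed above can be reproduced with a small sketch. It reuses the index expression from the posted loop; the helper names (`offset`, `first_offsets`, `line_changes`), the buffer size passed as `s`, and the 64-byte cache line are assumptions made for illustration, not values taken from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset touched on iteration (i, j): the same expression as the posted loop.
// s must be a power of two so that "& (s - 1)" wraps like a modulo.
std::size_t offset(std::size_t i, std::size_t j, std::size_t par, std::size_t s) {
    assert(s != 0 && (s & (s - 1)) == 0);
    return ((i + j) * par + i) & (s - 1);
}

// First `count` offsets of the inner loop for a fixed j.
std::vector<std::size_t> first_offsets(std::size_t par, std::size_t s,
                                       std::size_t count, std::size_t j = 0) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(offset(i, j, par, s));
    }
    return out;
}

// How often two consecutive accesses (j = 0) fall into different blocks of
// `line_bytes` bytes. 64 is an assumed cache-line size.
std::size_t line_changes(std::size_t par, std::size_t s, std::size_t count,
                         std::size_t line_bytes = 64) {
    if (count == 0) return 0;
    std::size_t changes = 0;
    std::size_t prev = offset(0, 0, par, s) / line_bytes;
    for (std::size_t i = 1; i < count; ++i) {
        std::size_t cur = offset(i, 0, par, s) / line_bytes;
        if (cur != prev) ++changes;
        prev = cur;
    }
    return changes;
}
```

With a stride of 9 bytes, consecutive accesses move to a new 64-byte block about once every seven accesses; with a stride of 129 bytes, every access lands in a different block. Passing 4096 as `line_bytes` (an assumed page size) gives the same kind of count for pages, which is the granularity a TLB works at.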
The PAR=128 case advances faster through memory. This can affect performance in two ways. Most processors have in their memory management system a thing called the TLB (Translation Lookaside Buffer). This is a type of cache, though instead of storing data (as the data cache does) it stores recently used virtual-to-physical address translations for memory pages. This in itself could not account for the 10x difference. The next thing to look at is the multiplication by PAR, which is what is called a scaling operation. The IA-32 instruction set can scale an index by 1, 2, 4, or 8 in the addressing mode without using a multiplication instruction. I believe this is where the performance hit comes from.
You can verify this hypothesis by timing runs with PAR set to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ... 128
What you should find is the order of speed might be 1, 2, 4, 8, 3, 6, 5, 7, ...
Jim Dempsey
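A minimal sketch of the timing experiment suggested above, for anyone who wants to try it. The function name `time_run`, the `RunResult` struct, and the choice to fill the buffer with 1s (so the sum can be checked) are assumptions for illustration; the loop body is the one from the original post, with PAR and the loop counts turned into parameters.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct RunResult {
    double seconds;   // wall-clock time of the two nested loops
    long long sum;    // returned so the compiler cannot drop the loads
};

// Times the posted access pattern for one value of PAR.
// s must be a power of two because the index is wrapped with "& (s - 1)".
RunResult time_run(std::size_t par, std::size_t s,
                   std::size_t outer, std::size_t inner) {
    assert(s != 0 && (s & (s - 1)) == 0);
    std::vector<char> n(s, 1);  // every element is 1, so sum == outer * inner
    long long m = 0;

    auto start = std::chrono::steady_clock::now();
    for (std::size_t j = 0; j < outer; ++j) {
        for (std::size_t i = 0; i < inner; ++i) {
            m += n[((i + j) * par + i) & (s - 1)];
        }
    }
    auto stop = std::chrono::steady_clock::now();

    return {std::chrono::duration<double>(stop - start).count(), m};
}
```

Calling `time_run(par, s, 1000, 10000000)` for each PAR from 1 to 128 with the same `s`, and sorting the results by `seconds`, gives the ordering to compare against the prediction. Timings depend on the machine, compiler, and optimization flags, so each value is best measured several times.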
I double-checked the assembly listing and found that the compiler automatically uses << instead of * if the operand is a power of two (2**n).
So that is not the problem. I tried 127 instead of 128. That run takes less than 1% of the test time.
My application deals with a real-time in-memory database. The database structure is a kind of multi-tied list.