Does line 301 contain a MACRO or template use where nested loops are used?
Jim Dempsey
c:\>type tstcase.cpp
void foo()
{
    double LocationSecondary[16][1024] ;
    double LocationPrim[1024] ; // Init was done earlier
    double Delta[8] ; // Init was done earlier
    for(int ii= 0 ; ii< 1024; ii++, LocationPrim++)
    {
        for(int jj = 0 ; jj< 8; jj++, LocationSecondary++ )
        {
            *LocationSecondary = *LocationPrim + Delta[jj] ;
        }
        for(int jj = 0 ; jj< 8 ; jj++, LocationSecondary++)
        {
            *LocationSecondary = *LocationPrim - Delta[jj] ;
        }
    }
}
c:\>icl -c /Qvec-report3 tstcase.cpp
Intel C++ Intel 64 Compiler XE for applications running on Intel 64, Version 12.1.0.233 Build 20110811
Copyright (C) 1985-2011 Intel Corporation. All rights reserved.
tstcase.cpp
tstcase.cpp(7): error: expression must be a modifiable lvalue
for(int ii= 0 ; ii< 1024; ii++, LocationPrim++)
^
tstcase.cpp(9): error: expression must be a modifiable lvalue
for(int jj = 0 ; jj< 8; jj++, LocationSecondary++ )
^
tstcase.cpp(11): error: expression must be a modifiable lvalue
*LocationSecondary = *LocationPrim + Delta[jj] ;
^
tstcase.cpp(13): error: expression must be a modifiable lvalue
for(int jj = 0 ; jj< 8 ; jj++, LocationSecondary++)
^
tstcase.cpp(15): error: expression must be a modifiable lvalue
*LocationSecondary = *LocationPrim - Delta[jj] ;
^
compilation aborted for tstcase.cpp (code 2)
An array name is not a modifiable lvalue, so it cannot be incremented. A pointer can be:
typedef double d16x1024[16][1024];
d16x1024* LocationSecondary = new d16x1024[1];
...
for(int jj = 0 ; jj< 8; jj++, LocationSecondary++ )
Now the above is valid.
***
However, LocationSecondary++ advances to the 2nd d16x1024 in the array, that is, by 16*1024 doubles rather than by one element
(and the original pointer is modified).
Jim Dempsey
[cpp]
double LocationSecondary[16][1024];
double LocationPrimary[1024];
double Delta[8];
for (int i = 0; i < 1024; i++) {
    for (int j = 0; j < 8; j++) {
        LocationSecondary[j][i] = LocationPrimary[i] + Delta[j];
    }
    for (int j = 0; j < 8; j++) {
        LocationSecondary[8 + j][i] = LocationPrimary[i] - Delta[j];
    }
}
[/cpp]
What do you think Jim?
If what I did is correct, then I would also suggest the following change:
[cpp]
double LocationSecondary[16][1024];
double LocationPrim[1024];
double Delta[16];
// copy delta[0-7] to delta[8-15] negated
// so that you can use + operator for both
for (int i = 0; i < 1024; i++) {
    for (int j = 0; j < 16; j++) {
        LocationSecondary[j][i] = LocationPrim[i] + Delta[j];
    }
}
[/cpp]
Moreover, the loops should be reversed, because having a large stride is inefficient.
Finally, the answer to Yolanda's question about the remarks is that the compiler sometimes applies loop transformations that split a loop into vectorizable and partially vectorizable parts, if it determines it can do that safely. That is why you get two remarks for the same loop.
I think it would be more xmm register efficient to use:
[cpp]
double LocationSecondary[16][1024];
double LocationPrimary[1024];
double Delta[8];
for (int i = 0; i < 1024; i++) {
    for (int j = 0; j < 8; j++) {
        LocationSecondary[j][i] = LocationPrimary[i] + Delta[j];
        LocationSecondary[8 + j][i] = LocationPrimary[i] - Delta[j];
    }
}
[/cpp]
On SSE, all of Delta could be held in 4 xmm registers (two doubles each). On x32 (32-bit x86, which has 8 xmm registers) this would leave 4 xmm registers available for scratch. I agree with you that the user should consider
double LocationSecondary[1024][16]; // swap indexes
double LocationPrimary[1024];
double Delta[8];
for (int i = 0; i < 1024; i++) {
    for (int j = 0; j < 8; j++) {
        LocationSecondary[i][j] = LocationPrimary[i] + Delta[j];
        LocationSecondary[i][8 + j] = LocationPrimary[i] - Delta[j];
    }
}
provided that the swap of indexes does not introduce a performance penalty elsewhere.
** run a test to confirm performance change **
Jim Dempsey
This is a typical loop that would benefit from 3 operand syntax (since LocationPrimary could be reused without copying), and from AVX.
Pentium D excludes use of architecture options more recent than SSE3.