Principle of locality


Hi,

Say I am doing something like this:

...

ippsConj_32fc(X,Tmp,N);

ippsMul_32fc(X,Tmp+1,Rxx,N-1);

ippsMul_32fc(Tmp,Y,Rxy,N);

ippsMagnitude_32fc(Rxy,A2,N);

ippsPhase_32fc(Rxy,P2,N);

ippsConj_32fc(Y,Tmp,N);

ippsMul_32fc(Y,Tmp+1,Ryy,N-1);

ippsAdd_32fc_I(Rxx,Ryy,N);

ippsMagnitude_32fc(Ryy,A1,N);

ippsPhase_32fc(Ryy,P1,N);

...

with N being e.g. 512.

Would it be faster to do all of this in a single for loop over the vector length N using intrinsics, because of the principle of locality? And, if possible, to parallelize it using OpenMP or TBB? What are your thoughts on this?

Thanks,

Thor Andreas

thorsan, 06-16-2009 11:35 AM


0 Replies
