Greetings,
I have recently written some code using AVX function calls to perform a convolution in my software. I have compiled and run this code on two platforms with the following compilation settings of note:
1. Windows 7 w/ Visual Studio 2010 on an i7-2760QM
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Only __inline (/Ob1)
Enable Intrinsic Functions: No
Favor Size or Speed: Favor fast code (/Ot)
2. Fedora Linux 15 w/ gcc 4.6 on an i7-3612QE
Flags: -O3 -mavx -m64 -march=corei7-avx -mtune=corei7-avx
For my testing I ran the C implementation and the AVX implementation on both platforms and got the following timing results:
In Visual Studio:
C Implementation: 30ms
AVX Implementation: 5ms
In GCC:
C Implementation: 9ms
AVX Implementation: 57ms
As you can tell, my AVX numbers on Linux are very large by comparison. My concern, and the reason for this post, is that I may not properly understand how to use AVX or which settings are needed to use it properly in both scenarios. For example, take my Visual Studio run. If I change the Enable Intrinsic Functions setting to Yes, my AVX numbers go from 5ms to 59ms. Does that mean that preventing the compiler from optimizing with intrinsics, and calling them manually instead, gives that much better results in Visual Studio? Last I checked, there is nothing similar in gcc. Could Microsoft's compiler really be that much better than gcc in this case? Any ideas why my AVX numbers on gcc are that much larger? Any help is most appreciated. Cheers.
It seems that ecx contains a pointer to aligned data that is accessed linearly (the array index is incremented linearly), hence probably the use of the
vmulps ymm3,ymm3,ymmword ptr[ecx] instruction.
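
For illustration only, here is a minimal C sketch of the kind of loop that could lead a compiler to emit a `vmulps` with a memory operand like the one above. It is not the original poster's code, which is not shown in this thread. The function names `mul_arrays`, `mul_arrays_avx`, and `mul_arrays_scalar` are hypothetical, and the sketch assumes GCC or Clang on x86 (it relies on the `target("avx")` attribute and `__builtin_cpu_supports`), 32-byte aligned buffers, and a length that is a multiple of 8. Whether the load is actually folded into the multiply depends on the compiler and its flags.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical AVX kernel: out[i] = data[i] * coeff[i].
   Assumes all pointers are 32-byte aligned and n is a multiple of 8.
   The target attribute (GCC/Clang) enables AVX for this function only. */
__attribute__((target("avx")))
static void mul_arrays_avx(const float *data, const float *coeff,
                           float *out, size_t n)
{
    for (size_t i = 0; i < n; i += 8) {
        __m256 c = _mm256_load_ps(coeff + i);  /* aligned 256-bit load */
        /* The compiler may fold this second load into the multiply,
           giving something like: vmulps ymm, ymm, ymmword ptr [reg] */
        __m256 d = _mm256_load_ps(data + i);
        _mm256_store_ps(out + i, _mm256_mul_ps(c, d));
    }
}

/* Plain C fallback with the same result, used when AVX is unavailable. */
static void mul_arrays_scalar(const float *data, const float *coeff,
                              float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = data[i] * coeff[i];
}

/* Hypothetical dispatcher: picks the AVX path only if the CPU reports AVX. */
void mul_arrays(const float *data, const float *coeff, float *out, size_t n)
{
    if (__builtin_cpu_supports("avx"))
        mul_arrays_avx(data, coeff, out, n);
    else
        mul_arrays_scalar(data, coeff, out, n);
}
```

Inspecting the generated assembly (for example with `gcc -S`) is one way to check whether the pointer really advances linearly through aligned data, as guessed above.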