Hello friends,
I am new to Intel's Composer XE (Studio XE). I have good knowledge of parallel programming on GPUs with CUDA and OpenCL. I want to learn the Intel Composer XE tools: icc, MKL and IPP. I have read all the installation guides and tutorials, but can anyone suggest how I should start programming?
That is:
How do I use a single core and multiple cores of my processor?
How do I divide my execution across different cores?
Please help me! I am using an Intel i7 processor.
Thanks,
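One common way to spread a loop over several cores with icc is OpenMP. The following is only a minimal sketch, not a recommendation specific to this poster's code: the function name `sum_of_squares` and the data are made up for illustration. It assumes OpenMP is enabled at compile time (`/Qopenmp` with the Intel compiler on Windows, `-qopenmp` on Linux, `-fopenmp` with GCC); without that flag the pragma is ignored and the loop simply runs on a single core.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example function: sums the squares of all elements.
// With OpenMP enabled, the loop iterations are divided among the
// available threads and the per-thread partial sums are combined by
// the reduction clause. Without OpenMP the loop runs serially.
double sum_of_squares(const std::vector<double>& data) {
    double total = 0.0;
    // Signed loop index: accepted by every OpenMP version.
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}
```

Calling `sum_of_squares` on a vector of 1000 elements that are all 2.0 returns 4000.0 whether or not the loop ran in parallel. The number of threads can be limited with the `OMP_NUM_THREADS` environment variable, so setting it to 1 gives the single-core case for comparison.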
TimP (Intel) wrote:
I'm mildly curious as to which Microsoft version you consider as "the" Microsoft version. MSVC in VS2012 is the first to make any use of SIMD instructions, but of course you must specify /arch:SSE2 (preferably AVX) if you wish this in the 32-bit version.
Specification of unsigned long rather than int looks like an unnecessary handicap, as well as having differing meaning on non-Windows platforms.

Hi Tim,
For my test I used Visual Studio 2010, and I chose to completely disable all optimization in the VS 2010 C/C++ compiler. This was done solely for the sake of comparison between those two compilers. As I wrote in my previous post, I will soon create a thread once I have tested both compilers. I was not aware that the VS 2010 compiler does not use SIMD vector instructions when optimization settings are chosen.
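TimP's remark about `unsigned long` having a different meaning across platforms can be checked directly: the type is 32 bits wide on Windows but 64 bits wide on 64-bit Linux and macOS. The sketch below is an illustration only; the helper `bit_width_of` and the alias `Counter` are hypothetical names, not part of the code discussed in this thread.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: width in bits of a type on the platform the
// code is compiled for.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}

// A fixed-width alias keeps the same size on every platform, unlike
// unsigned long.
using Counter = std::uint32_t;

static_assert(bit_width_of<Counter>() == 32,
              "uint32_t is 32 bits wherever it exists");
static_assert(bit_width_of<unsigned long>() >= 32,
              "the standard only guarantees at least 32 bits");
```

Printing `bit_width_of<unsigned long>()` on each target shows the difference, which matters when benchmark code is moved between Windows and Linux builds.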
For comparison, here is the same algorithm operating on the same data set, with the loop iterated 1e4 times. The results:
Intel compiler test case start value: 9510476 msec
Intel compiler test case end value: 9511396 msec
Intel compiler resulting overhead: 920 msec
For anyone still interested in FFT testing, I got new, more accurate results. Instead of calling the fourier() routine 1e6 times and measuring the total execution time, I used the compiler intrinsic __rdtsc() to measure the first for-loop block (responsible for dividing the data into odd and even parts) and the while-loop block (the main execution body) of the function. As I stated earlier, the results were more accurate.
For a 4096-point FFT of a sine function, the execution time was ~212145 nanoseconds, i.e. about 212 microseconds.
Later I will continue my evaluation of the various functions being transformed and the time needed to do so.
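The block-level timing described above could look roughly like the sketch below. It is not the poster's code: `read_ticks`, `time_block` and `ticks_to_ns` are hypothetical helpers, and the TSC frequency passed to `ticks_to_ns` is an assumption that has to be supplied for the actual CPU. On non-x86 targets the sketch falls back to `std::chrono::steady_clock`, in which case the returned ticks are clock ticks rather than TSC cycles.

```cpp
#include <chrono>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>      // __rdtsc on MSVC and the Intel compiler on Windows
#define TIMING_SKETCH_HAVE_RDTSC 1
#elif defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>   // __rdtsc on GCC, Clang and the Intel compiler on Linux
#define TIMING_SKETCH_HAVE_RDTSC 1
#else
#define TIMING_SKETCH_HAVE_RDTSC 0
#endif

// Read a monotonically increasing tick counter.
inline std::uint64_t read_ticks() {
#if TIMING_SKETCH_HAVE_RDTSC
    return __rdtsc();
#else
    // Fallback for non-x86 targets: steady_clock ticks, not TSC cycles.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Run one block of work and return the elapsed ticks.
template <typename Block>
std::uint64_t time_block(Block&& block) {
    const std::uint64_t start = read_ticks();
    block();
    const std::uint64_t end = read_ticks();
    return end - start;
}

// Convert TSC cycles to nanoseconds. tsc_ghz is an assumed value
// (cycles per nanosecond) and must match the real TSC rate.
inline double ticks_to_ns(std::uint64_t ticks, double tsc_ghz) {
    return static_cast<double>(ticks) / tsc_ghz;
}
```

Each block of interest is wrapped separately, for example `time_block([&] { /* odd/even split loop */ })`, so the split and the main loop get their own cycle counts. A single reading is noisy, so repeating the measurement and taking the minimum or median is usually more trustworthy.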