The following code is meant to exercise the FPU. I run it on an E5-2620, and it only reaches about 2 GFLOPS. If I want to reach 2*8 GFLOPS, how should I write the program?
Any help will be appreciated.
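#include <immintrin.h>

void* test_pd_avx()
{
    double x[4]={12.02,14.34,34.23,234.34};
    double y[4]={123.234,234.234,675.34,3453.345};
    /* unaligned loads: x and y are not guaranteed to be 32-byte aligned,
       which _mm256_load_pd would require */
    __m256d mx=_mm256_loadu_pd(x);
    __m256d my=_mm256_loadu_pd(y);
    for(;;)
    {
        /* one 4-wide double-precision multiply per iteration; mz is never used */
        __m256d mz=_mm256_mul_pd(mx,my);
    }
}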
The Compiler Option: icc test.c -O0
23 Replies
Ok:)
>>...The Compiler Option: icc test.c -O0
Try performance evaluations with -O2 and -O3 compiler options.
Sergey Kostrov wrote:
>>...The Compiler Option: icc test.c -O0
Try performance evaluations with -O2 and -O3 compiler options.
With -O2 or -O3 the compiler optimizes the code, and its behavior is then undefined.
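One way to reconcile the two posts above is to write the test loop so that it survives -O2: give the compiler a result it has to keep. The sketch below is only an illustration under stated assumptions, not a tuned benchmark. It assumes GCC or Clang (the `target("avx")` attribute is an extension of those compilers; with another compiler, drop it and enable AVX through the compiler's own option). The names `avx_mul_add_kernel`, `FLOPS_PER_ITER` and `FLOPS_DEMO_MAIN` are made up for this example. It also assumes the commonly cited Sandy Bridge behaviour of one 256-bit multiply and one 256-bit add issued per cycle, which is where a per-core figure of 2*8 GFLOPS at 2.0 GHz would come from. Five independent multiply chains and five independent add chains are used so that consecutive operations do not have to wait on each other's results.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdio.h>
#include <time.h>

/* Flops per loop iteration: 5 multiplies + 5 adds, 4 doubles each. */
#define FLOPS_PER_ITER 40.0

/* Hypothetical kernel; the name and layout are illustrative only.
   The target attribute (a GCC/Clang extension) enables AVX for this function. */
__attribute__((target("avx")))
double avx_mul_add_kernel(long iters)
{
    /* Volatile reads keep the compiler from folding x*1.0 and x+0.0 away. */
    volatile double one = 1.0, zero = 0.0;
    const __m256d m = _mm256_set1_pd(one);
    const __m256d a = _mm256_set1_pd(zero);

    /* Ten independent accumulators: five product chains, five sum chains. */
    __m256d p0 = _mm256_set1_pd(1.0), p1 = _mm256_set1_pd(2.0);
    __m256d p2 = _mm256_set1_pd(3.0), p3 = _mm256_set1_pd(4.0);
    __m256d p4 = _mm256_set1_pd(5.0);
    __m256d s0 = _mm256_set1_pd(6.0), s1 = _mm256_set1_pd(7.0);
    __m256d s2 = _mm256_set1_pd(8.0), s3 = _mm256_set1_pd(9.0);
    __m256d s4 = _mm256_set1_pd(10.0);

    assert(iters > 0);
    for (long i = 0; i < iters; ++i) {
        /* Multiplying by 1.0 and adding 0.0 keeps every value unchanged,
           so the loop can run for any number of iterations. */
        p0 = _mm256_mul_pd(p0, m);  s0 = _mm256_add_pd(s0, a);
        p1 = _mm256_mul_pd(p1, m);  s1 = _mm256_add_pd(s1, a);
        p2 = _mm256_mul_pd(p2, m);  s2 = _mm256_add_pd(s2, a);
        p3 = _mm256_mul_pd(p3, m);  s3 = _mm256_add_pd(s3, a);
        p4 = _mm256_mul_pd(p4, m);  s4 = _mm256_add_pd(s4, a);
    }

    /* Fold all accumulators into one returned value so the optimizer
       cannot discard the loop as dead code. */
    __m256d t = _mm256_add_pd(_mm256_add_pd(p0, p1), _mm256_add_pd(p2, p3));
    t = _mm256_add_pd(t, _mm256_add_pd(p4, s0));
    t = _mm256_add_pd(t, _mm256_add_pd(_mm256_add_pd(s1, s2),
                                       _mm256_add_pd(s3, s4)));
    double out[4];
    _mm256_storeu_pd(out, t);
    return out[0] + out[1] + out[2] + out[3];
}

#ifdef FLOPS_DEMO_MAIN   /* hypothetical switch: define it to build the timing driver */
int main(void)
{
    const long iters = 200000000L;
    clock_t t0 = clock();
    double check = avx_mul_add_kernel(iters);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    printf("checksum %.1f, %.2f GFLOP/s\n",
           check, FLOPS_PER_ITER * (double)iters / secs / 1e9);
    return 0;
}
#endif
```

Compiled with optimization on and `FLOPS_DEMO_MAIN` defined, the driver prints a checksum and a single-thread rate. The checksum should stay at 220.0 because multiplying by 1.0 and adding 0.0 never change the accumulators. The measured rate depends on the compiler, on clock scaling, and on whether the generated loop really keeps all ten accumulators in registers, so it is worth inspecting the assembly before trusting the number.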