I have a question about how to optimize some code I am working on. The heavy part of the code is a function that computes some quantities: no loops, just lots of floating-point computations (over 2000 lines). This function will be called millions of times with different inputs, so it is critical to the overall performance. The computation also has quite a lot of if-else checks.

I ran my program in VTune, and the clockticks per instruction I got was 4.5. According to VTune's manual, that seems to be a very bad number. So I wonder if there are any tips on how to improve it?
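To make the metric concrete, here is a minimal sketch in C of how that ratio is formed, assuming it is simply clockticks divided by instructions retired. The function name is hypothetical and is not part of VTune.

```c
/* Hypothetical helper, not a VTune API: clockticks per instruction
   retired (CPI). A lower value means more work done per cycle. */
double cycles_per_instruction(double clockticks, double instructions_retired)
{
    return clockticks / instructions_retired;
}
```

For example, made-up counts of 9.0e9 clockticks and 2.0e9 instructions retired would give the 4.5 reported above.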

One idea I have now is to break long expressions into a bunch of short ones, for example,

a=b*c*e*f;

becomes,

a1=b*c;

a2=e*f;

a=a1*a2;

so that the CPU is more likely to execute more instructions per clock cycle. But is this true? I would really appreciate any advice.
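The two forms can be written side by side as a small C sketch, assuming the operands are doubles; the function names are hypothetical. In C, `b*c*e*f` is parsed as `((b*c)*e)*f`, so the split version is a different grouping of the same product, and with floating-point rounding the two results may differ in the last bits for some inputs.

```c
/* Hypothetical function names; b, c, e, f are the operands of the
   product being computed. */

/* Original form: evaluated left to right as ((b*c)*e)*f, so each
   multiply needs the result of the one before it. */
double product_chained(double b, double c, double e, double f)
{
    return b * c * e * f;
}

/* Split form: a1 and a2 do not depend on each other, so the two
   multiplies can potentially overlap; only the final multiply
   needs both results. */
double product_split(double b, double c, double e, double f)
{
    double a1 = b * c;
    double a2 = e * f;
    return a1 * a2;
}
```

Whether the split form actually runs faster depends on the compiler and the processor, so it would need to be measured on the real function.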

Best,

Ben

3 Replies

BEN,

Be sure to look at the Intel Math Kernel Library, which provides highly optimized, thread-safe math routines for high-performance computing (HPC) in science, engineering, and financial applications that require maximum performance on Intel processors.

If you go to our URL you can purchase it or kick the tires on an eval copy. It sounds like you might find a definite use for it based on your posting!

cheers

jdg

Message Edited by jdgallag on 03-28-2006 04:47 PM

Hi Jdg,

Thanks a lot for the reply. Yes, I did take a look at the Intel Math Kernel Library. But what I found is that those functions are designed exclusively for vector operations, whereas in my case all the computations are scalar. :-( Is my understanding correct?

Best,

Ben
