Hi There!
We are developing a supernodal factorizer for a sparse symmetric system (mainly to understand Intel optimizations and how to go about writing a neat solver).
We use double precision and we are ~2.5x slower than an equivalent MKL-based system (on a single core; we run only on single-core systems as part of preliminary testing).
We use cache-aligned structures and SSE2.
In this context, I am wondering what FTZ (Flush to Zero) is all about. The solver output does contain numbers of very small magnitude (x.yyyE-16). Will enabling FTZ give us some performance?
By the way, does MKL internally use single precision whenever possible? Are there any environment variables that can control such behavior?
Thanks for any advice,
Best Regards,
Sarnath
3 Replies
ftz is set by default in ifort. It actually affects only the initialization compiled into the main program. On current and past CPUs, it avoids the extra time spent producing subnormal results ( 0 < abs(x) < tiny(x) ). Unlike on Itanium, ftz has no effect on performance when no gradual underflow occurs.
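As a rough way to check whether gradual underflow is even in play, the sketch below counts subnormal entries in an array of doubles using the standard `fpclassify` macro. The function name `count_subnormals` is made up for illustration. Note that for double precision the subnormal range lies below `DBL_MIN` (about 2.2e-308), so values around 1e-16 are ordinary normal numbers and would not be touched by ftz.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: count entries that are subnormal, i.e. nonzero
 * with magnitude below DBL_MIN (the C counterpart of Fortran's tiny(x)). */
size_t count_subnormals(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (fpclassify(x[i]) == FP_SUBNORMAL)
            ++count;
    }
    return count;
}
```

Running something like this over the factor and a few intermediate buffers gives a first hint: if it reports no subnormals, ftz is unlikely to be where the time goes.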
MKL would normally be expected to use internally the same precision as the arguments.
Any library function which alters the ftz setting (other than those meant for changing the setting) is expected to restore it to what it was at entry. You can change it at run time, at the cost of a function call and a pipeline flush.
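A minimal sketch of changing the setting at run time around a kernel and restoring it afterwards, assuming an x86 compiler that provides `<xmmintrin.h>`. `_MM_GET_FLUSH_ZERO_MODE` and `_MM_SET_FLUSH_ZERO_MODE` are the standard SSE control-register macros; `scale_in_place` and `scale_with_ftz` are hypothetical names standing in for a real factorization step. The setting affects only SSE/SSE2 arithmetic, not x87 code.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _MM_GET_FLUSH_ZERO_MODE, _MM_SET_FLUSH_ZERO_MODE */

/* Hypothetical kernel: scales x in place. Stands in for real solver work. */
static void scale_in_place(double *x, size_t n, double alpha)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}

/* Run the kernel with flush-to-zero enabled, then restore the caller's
 * setting so the change does not leak out of this function. */
void scale_with_ftz(double *x, size_t n, double alpha)
{
    assert(x != NULL || n == 0);
    unsigned int saved = _MM_GET_FLUSH_ZERO_MODE();  /* mode at entry */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    scale_in_place(x, n, alpha);
    _MM_SET_FLUSH_ZERO_MODE(saved);                  /* restore on exit */
}
```

Since each switch writes the control register, it is probably better to toggle once around a whole factorization than inside an inner loop.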
CPUs to be introduced next year are designed so that gradual underflow doesn't degrade performance in the usual contexts. This should preserve performance of vectorized code without requiring ftz.
Thanks for the link and the reply! It was useful.
I think I need to really block my matrix out completely (right now, I create blocks on the fly as the factorization proceeds). Maybe I should consider restructuring the data structure from scratch.
MKL is really awesome. It looks like a big wall in front of me, and all I have been doing is banging my head against it. Maybe I should read some papers and get the basics right.
Anyway, thanks to all of you,
Best Regards,
Sarnath
