For my personal use I have implemented routines to compute natural logarithms on vectors in single precision. They are faster than the MKL "EP" version and essentially as accurate (ulp<0.95 vs. ulp<0.88) as the MKL "LA" version.
The basic algorithm is an 11th-order optimal polynomial for 0.75 < x < 1.5, which means there are no tables. Only 240 bytes of memory are needed to store the constants.
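For readers who want to see the general shape of such a routine, here is a minimal scalar sketch. It is not the code from the tarball. The name log_sketch is made up for illustration, and the coefficients are plain Taylor coefficients of log(1+t) used as stand-ins for the optimal ones, so this sketch is only good to roughly 2e-5 absolute error near the top of the interval, far worse than the sub-ulp accuracy quoted above. It assumes a finite, positive, normal input.

```cpp
#include <cmath>

// Sketch only: range reduction to [0.75, 1.5) followed by an 11th-order
// polynomial evaluated with Horner's rule. Coefficients are Taylor
// placeholders (an assumption), NOT the optimal polynomial from the post.
float log_sketch(float x) {
    int e;
    float m = std::frexp(x, &e);            // x = m * 2^e, m in [0.5, 1)
    if (m < 0.75f) { m *= 2.0f; e -= 1; }   // now m in [0.75, 1.5)
    float t = m - 1.0f;                     // t in [-0.25, 0.5)

    // p = 1 - t/2 + t^2/3 - ... + t^10/11, built from the highest term down
    float p = 0.0f;
    for (int k = 11; k >= 1; --k) {
        float c = ((k & 1) ? 1.0f : -1.0f) / static_cast<float>(k);
        p = p * t + c;
    }
    p *= t;                                 // log(1 + t) ~ t * p

    return static_cast<float>(e) * 0.69314718f + p;  // e*ln(2) + log(m)
}
```

No table lookup is involved: the only stored data would be the polynomial coefficients and ln(2), which is what keeps the memory footprint small.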
The functions are:
//for large vectors
void log( float* y, float* x, long int n );
//for small vectors
__v4sf log( __v4sf x );
which you can get from here:
http://arithmex.com/files/vlog_check.tar.gz
(compiles on 64-bit linux with g++-4.2.1 )
It is obvious that a similar speed-up can also be achieved for 'sin', 'cos', 'exp', etc. I also think it is possible to significantly speed up the double precision versions, as every extra term in the series only adds 0.25 cycles/element. Going from an 11th to a 21st order polynomial therefore only adds 10 x 0.25 = 2.5 cycles/element, so I'd expect a double precision version to complete in about 12 cycles.
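The cost estimate follows from how Horner's rule works: each additional coefficient adds one multiply and one add to the chain. A small sketch of that reasoning is below. The function names horner and extra_cycles are made up for illustration, and the 0.25 cycles/term figure is simply the number quoted in this post, not something the code measures.

```cpp
#include <cstddef>

// Horner evaluation of c[0] + c[1]*t + ... + c[n-1]*t^(n-1).
// Every coefficient costs one multiply and one add, so the work grows
// linearly with the polynomial order.
double horner(const double* c, std::size_t n, double t) {
    double p = 0.0;
    for (std::size_t i = n; i-- > 0; ) {
        p = p * t + c[i];
    }
    return p;
}

// Extra cost of raising the order, assuming a fixed cost per added term.
double extra_cycles(int from_order, int to_order, double cycles_per_term) {
    return (to_order - from_order) * cycles_per_term;
}
```

With the numbers from the post, extra_cycles(11, 21, 0.25) gives the 2.5 cycles/element mentioned above.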
Personally I don't need these functions or the higher precision, but if anyone is interested... I'm available for part-time hire. And because I really, really enjoy doing stuff like this, I'm a bargain.