Would it be possible to get an optimized version of the routine cgeqrf, in order to speed up the QR factorization of tall, skinny matrices?
5 Replies
The QR factorization routines are threaded, but the efficiency of their implementation depends on the input problem size.
Hello,
Could you also provide the matrix sizes for your tall, skinny matrices, so that we can track them for future optimization consideration?
Thanks,
Chao
According to http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-131.pdf, it is supposedly not too hard to produce a DIY version, although the algorithm is not exactly easy to understand.
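For anyone who wants to see the basic idea before reading the report, here is a minimal sketch of a tall-skinny QR (TSQR) reduction, the building block that communication-avoiding QR relies on. It is not the code from the report and not what MKL does. It works on real numbers (cgeqrf is single-precision complex), it computes only the R factor, and the names `householder_r`, `tsqr_r`, and `gram` are made up for this illustration. Each row block is factored independently, then the small R factors are stacked in pairs and factored again until one remains.

```python
import math
import random

def householder_r(a):
    """Return the upper-triangular (or trapezoidal) R factor of a, given as a list of rows."""
    a = [row[:] for row in a]               # work on a copy
    m, n = len(a), len(a[0])
    for k in range(min(m, n)):
        norm = math.sqrt(sum(a[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            continue                        # column is already zero at and below the diagonal
        alpha = -math.copysign(norm, a[k][k])
        v = [a[i][k] for i in range(k, m)]  # Householder vector
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        for j in range(k, n):               # apply I - 2 v v^T / (v^T v) to the trailing columns
            s = 2.0 * sum(v[i - k] * a[i][j] for i in range(k, m)) / vtv
            for i in range(k, m):
                a[i][j] -= s * v[i - k]
    # keep the top rows and set the entries below the diagonal to exact zeros
    return [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(min(m, n))]

def tsqr_r(a, block_rows):
    """R factor of a tall, skinny matrix via a pairwise (binary tree) reduction."""
    # leaf stage: every row block is factored on its own (this is the parallel part)
    rs = [householder_r(a[s:s + block_rows]) for s in range(0, len(a), block_rows)]
    # reduction stage: stack two small R factors and factor the stack again
    while len(rs) > 1:
        merged = [householder_r(rs[i] + rs[i + 1]) for i in range(0, len(rs) - 1, 2)]
        if len(rs) % 2:
            merged.append(rs[-1])           # odd one out moves up unchanged
        rs = merged
    return rs[0]

def gram(a):
    """Return A^T A, used only to check the result."""
    n = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]

# Small demo: R^T R should match A^T A up to rounding error.
random.seed(0)
A = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(200)]
R = tsqr_r(A, block_rows=32)
GA, GR = gram(A), gram(R)
max_err = max(abs(GA[i][j] - GR[i][j]) for i in range(4) for j in range(4))
print("max |A^T A - R^T R| =", max_err)
```

In a real implementation the per-block factorizations would be calls to an optimized routine such as geqrf running in parallel, and the Householder vectors from every stage would be kept so that Q can be applied later.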
Investigators at Ohio State / Ohio Supercomputer Center made versions of CAQR using Cilk+ and OpenMP, and they offered to speak about it at the SC11 conference. Their code links MKL but doesn't make significant use of it. The Cilk+ version, organized for stride-1 inner loops, runs well on the Xeon 5680.
The matrix size of interest has a leading dimension of up to 250,000, with the other dimension on the order of one or two hundred.