Hi,
Previously, we reported a possible ScaLAPACK bug in the PZUNGQR function:
http://software.intel.com/en-us/forums/topic/473803
That issue has still not been resolved, but it was attributed to zero-sized local matrices on some nodes. However, we have encountered a similar issue with PZUNGQR even when no local matrix has zero size. In the attached test case, the PZGEMM call that follows the PZUNGQR call either hangs or produces an Irecv error, even though both the QR matrices and the PZGEMM matrices have non-zero-sized local parts on all nodes. Interestingly, if the matrices used in the PZGEMM call have a global size smaller than the block size (so that only one node holds a non-zero-sized local matrix), the call completes fine.
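For readers less familiar with how a block-cyclic layout leaves some processes with empty local matrices, here is a minimal sketch that re-implements the logic of ScaLAPACK's NUMROC tool function in Python. It is not taken from the attached test case; the sizes, block size, and process count used below are hypothetical and chosen only for illustration.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of a block-cyclically distributed
    dimension that process `iproc` owns (same logic as ScaLAPACK's NUMROC).

    n        : global extent of the dimension
    nb       : block size
    iproc    : coordinate of this process in the grid dimension
    isrcproc : coordinate of the process that owns the first block
    nprocs   : number of processes along this grid dimension
    """
    # Distance of this process from the one holding the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    # Number of complete blocks in the global dimension.
    nblocks = n // nb
    # Every process gets at least this many complete blocks.
    local = (nblocks // nprocs) * nb
    # Leftover complete blocks go to the first `extrablks` processes.
    extrablks = nblocks % nprocs
    if mydist < extrablks:
        local += nb
    elif mydist == extrablks:
        # This process receives the trailing partial block, if any.
        local += n % nb
    return local


def local_sizes(n, nb, nprocs):
    """Local extents on every process, with the first block on process 0."""
    return [numroc(n, nb, p, 0, nprocs) for p in range(nprocs)]


# Hypothetical values: block size 64 on 4 processes.
small = local_sizes(10, 64, 4)    # global size below the block size
large = local_sizes(200, 64, 4)   # global size spanning several blocks
print(small)
print(large)
```

With a global size below the block size, only the process that owns the first block holds any data, which is the "only one node has non-zero sized matrices" situation described above. Once the global size spans enough blocks, every process holds a non-empty local part.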
In the attached test case, the bug occurs only if single-node matrices call ZUNGQR and multi-node matrices call PZUNGQR. If all nodes call PZUNGQR, it does not occur. However, in our full code the bug sometimes seems to occur even if all nodes call PZUNGQR. Unfortunately, I was not able to reduce that particular behavior to a simple test case.
Thanks, John
It just occurred to me that mixing PZUNGQR and ZUNGQR is not the issue. The primary issue is that some nodes call PZUNGQR and some don't. If the ZGEQRF/ZUNGQR call in the attached test case is commented out, the bug still occurs, since then only the matrices that are actually distributed call PZUNGQR and the single-node matrices do nothing for the QR.
John,
The first issue you reported earlier is still being investigated by the MKL team. Thank you very much for the additional information. We will make sure our fix covers this new scenario as well.
Zhang,
We appreciate that you are working on this. Do you have an approximate time estimate for a fix? This bug in MKL has been holding up a deliverable to our customer. I know you can't put a firm date on it. However, if you could estimate in days, weeks, or months, it would be helpful. Thank you!
gedney@engr.uky.edu wrote:
Zhang,
We appreciate that you are working on this. Do you have an approximate time estimate for a fix? This bug in MKL has been holding up a deliverable to our customer. I know you can't put a firm date on it. However, if you could estimate in days, weeks, or months, it would be helpful. Thank you!
Thanks for letting us know the impact of this issue on your deliveries. I'll send you a note with an estimate as soon as I have a solid idea on where we are now, hopefully in 1-2 business days.
Zhang,
We really appreciate that. Thank you!
