Intel® MPI Library

Bug with Intel MPI Message Queue Debugging

Josh_Linaro
Hey Intel,
 
I am seeing some issues with the Intel MPI "Message Queue Debugger" system, and I was wondering if you could look into this bug.
 
The issue can be seen when using message queue debugging with Linaro Forge: the MQS interface does not appear to work as expected.
 
More specifically, calls to mqs_image_has_queues always return 0 (mqs_ok), but they also set the following message:
The symbols and types in the MPICH library used by TotalView
to extract the message queues are not as expected in
the image '%s'
No message queue display is possible.
This is probably an MPICH version or configuration problem.
It is worth noting that several mqs_field_offset callbacks made at this point asked for fields that do not appear to exist:
 
fieldOffset : mqs_type = 0x557143a2f9e0, char* = unexp_head_ptr
 
$(gdb) output ((int)&((MPIDIG_comm_t *)1)->unexp_head_ptr)-1
"There is no member named unexp_head_ptr."
fieldOffset : mqs_type = 0x7fa8d9a72830, char* = comm_next
 
$(gdb) output ((int)&((_MPIR_Comm *)1)->comm_next)-1
"There is no member named comm_next"
fieldOffset : mqs_type = 0x563bf7816460, char* = am

$(gdb) output ((int)&((MPIDIG_comm_t *)1)->unexp_head_ptr)-1
"There is no member named unexp_head_ptr."
Perhaps because of the above, every subsequent mqs_setup_operation_iterator call returns either 1 or 2.
 
See https://github.com/pmodels/mpich/issues/6131 for a similar issue with MQD in MPICH (it includes a reproducer, if you have access to Linaro Forge to test with).
ShivaniK_Intel

Hi,


We are working on it and will get back to you soon.


Thanks & Regards

Shivani


ShivaniK_Intel

Hi,


Thanks for your patience.


The root cause of this issue is the same as for MPICH: the move to the new ch4 infrastructure makes it necessary to implement some features differently.


Could you please try Intel MPI in debug mode and test if the request queues are still available? These can be viewed with the TotalView debugger.


Thanks & Regards

Shivani



ShivaniK_Intel

Hi,


As we have not heard back from you, could you please respond to my previous post?


Thanks & Regards

Shivani


ShivaniK_Intel

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards

Shivani


Linaro_Josh

Hi @ShivaniK_Intel ,

 

Apologies, there appear to be issues with my old Intel account. I can create another ticket if need be, though I imagine it would just point to this one.

 


"Could you please try Intel MPI in debug mode and test if the request queues are still available?"

I have found the time to try the Intel MPI packaged with oneAPI 2023.2.0, run with -genv I_MPI_DEBUG=10 (I assume this is what you meant by "in debug mode"), but it still shows the same issues as before.

 

All the best,

 

Josh
