Intel® C++ Compiler

Valgrind memcheck error in libiomp

paul_f
Novice

Not sure if this is the right forum.

I'm a spare time Valgrind developer, and have also encountered this issue in my day job.

When using Intel OpenMP (from pstudioxe2017), I get a memcheck error:

==14500== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
==14500==    at 0x21147E29: syscall (in /usr/lib64/libc-2.17.so)
==14500==    by 0x206EB028: __kmp_affinity_determine_capable (in /path/to/lib/Linux_x86_64/libiomp5.so)

 

 

 

 

This happens during a call to omp_get_num_procs.

Using gdb, I see the following disassembly. The runtime calls 'syscall()' directly rather than the glibc 'sched_setaffinity()' wrapper:

│    0x206eb019 <__kmp_affinity_determine_capable+73>        xor    %esi,%esi
│    0x206eb01b <__kmp_affinity_determine_capable+75>        mov    $0xcb,%edi
│    0x206eb020 <__kmp_affinity_determine_capable+80>        xor    %ecx,%ecx
│    0x206eb022 <__kmp_affinity_determine_capable+82>        xor    %eax,%eax
│    0x206eb024 <__kmp_affinity_determine_capable+84>        call   0x20651ae0 <syscall@plt> 

 

 

 

 

The arguments, in order, are:

%edi is 0xcb (203), the syscall number for sched_setaffinity on x86_64.

%esi is the PID, zero (meaning the calling thread).

%edx is the length of the mask in bytes, which I see is 640. Normally it's supposed to be sizeof(cpu_set_t), which is 128.

%rcx is the pointer to the mask, and %ecx is set to 0, so the pointer is NULL.

 

The glibc manpage doesn't mention the use of a NULL mask pointer, so I can't tell if this is some undocumented use of sched_setaffinity that memcheck isn't handling or whether it is a bug in Intel OpenMP.
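For reference, the call described above can be reproduced outside libiomp5 with a few lines of C. This is only a sketch of the same argument pattern (pid 0, a non-zero length, a NULL mask); the helper name setaffinity_null_mask is made up for illustration and is not taken from the OpenMP runtime. On Linux I would expect the call to return -1, most likely with errno set to EFAULT, since the kernel cannot copy a mask from a NULL pointer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: issue the raw sched_setaffinity syscall with a
 * NULL mask pointer, bypassing the glibc sched_setaffinity() wrapper.
 * Returns the syscall() result and stores the resulting errno in *err_out. */
static long setaffinity_null_mask(size_t len, int *err_out)
{
    assert(err_out != NULL);
    errno = 0;
    /* pid 0 means "the calling thread"; the mask argument is deliberately NULL. */
    long rc = syscall(SYS_sched_setaffinity, (pid_t)0, len, NULL);
    *err_out = errno;
    return rc;
}
```

Calling setaffinity_null_mask(640, &err) mirrors the length seen in the debugger. Running such a program under memcheck should produce the same "unaddressable byte(s)" report, independent of Intel OpenMP.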

Looking at the kernel source, I think that any excess mask length is simply truncated:

https://elixir.bootlin.com/linux/v4.4/source/kernel/sched/core.c#L4488

 

 

 

 

	else if (len > cpumask_size())
		len = cpumask_size();
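The value of cpumask_size() on a given machine can be checked from user space. According to the sched_setaffinity(2) man page, the raw sched_getaffinity syscall (unlike the glibc wrapper) returns the number of bytes it copied, which is the smaller of the supplied length and the kernel's internal mask size. The sketch below relies on that; the function name kernel_cpumask_bytes and the 8192-byte buffer are illustrative assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed upper bound for the probe buffer: 8192 bytes covers 65536 CPUs. */
enum { PROBE_BUF_BYTES = 8192 };

/* Hypothetical helper: ask the kernel how many bytes of affinity mask it
 * copies out. Returns that byte count, or -1 with errno set on failure. */
static long kernel_cpumask_bytes(void)
{
    static unsigned long buf[PROBE_BUF_BYTES / sizeof(unsigned long)];
    /* The raw syscall rejects lengths that are not a multiple of sizeof(unsigned long). */
    assert(PROBE_BUF_BYTES % sizeof(unsigned long) == 0);
    return syscall(SYS_sched_getaffinity, (pid_t)0, (size_t)PROBE_BUF_BYTES, buf);
}
```

If this returns 640 on the RHEL 7.9 machine, that would explain where the length passed to sched_setaffinity comes from, but that link is an assumption and not something confirmed in this thread.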

 

 

 

 

I'll do some more debugging to see if the syscall is failing.

EDIT: the syscall returns -1, so it looks like an Intel OpenMP bug to me.

ShivaniK_Intel
Moderator

Hi,


Thanks for posting in the Intel forums.


Could you please try the supported version of the Intel oneAPI toolkit and let us know if you face a similar issue?


For more details regarding the supported version please refer to the below link


https://www.intel.com/content/www/us/en/developer/articles/release-notes/intel-parallel-studio-xe-supported-and-unsupported-product-versions.html


Could you please provide us with the below details?


1. OS

2. output of lscpu command

3. Sample reproducer and steps to reproduce the issue


Thanks & Regards

Shivani


paul_f
Novice

I can't easily try other versions. The most recent that we have installed is 2020, and asking for a more recent version to be installed is likely to take months. I can reproduce the problem with 2018 and 2020.4.

OS - RHEL 7.9

lscpu - Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz

GCC 11.2 built from source

(I don't think that any of the above changes much)

 

Reproducer (iomp_sched.c):

#include <omp.h>

int main(void)
{
   (void)omp_get_num_procs();
}

 

Commands:

gcc -fopenmp -c -g iomp_sched.c

gcc -o iomp_sched iomp_sched.o -L/path/to/pstudioxe2018/lib/intel64 -liomp5 -Wl,-rpath,/path/to/pstudioxe2018/lib/intel64

valgrind ./iomp_sched

==29136== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
==29136==    at 0x4EFFE29: syscall (in /usr/lib64/libc-2.17.so)
==29136==    by 0x4B19197: __kmp_affinity_determine_capable (z_Linux_util.cpp:185)
==29136==    by 0x4AF27E8: __kmp_env_initialize(char const*) (kmp_settings.cpp:5773)
==29136==    by 0x4ADAF9A: __kmp_do_serial_initialize (kmp_runtime.cpp:6964)
==29136==    by 0x4ADAF9A: __kmp_do_middle_initialize (kmp_runtime.cpp:7110)
==29136==    by 0x4ADAF9A: __kmp_middle_initialize (kmp_runtime.cpp:7219)
==29136==    by 0x4ABC60D: omp_get_num_procs@@VERSION (kmp_ftn_entry.h:615)
==29136==    by 0x40113E: main (iomp_sched.c:5)

 

It looks like the source code for this function is available, for example here:

https://github.com/llvm-mirror/openmp/blob/master/runtime/src/z_Linux_util.cpp

It looks like it's making a deliberately invalid call to the sched_setaffinity syscall.
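If the call really is a deliberate probe, one way to keep memcheck output clean in the meantime is a suppression file. The entry below is a sketch built from the frames in the trace above; the suppression name and the file name iomp5.supp are arbitrary choices, and the frame list may need adjusting for other libiomp5 builds.

```
{
   iomp5_affinity_probe_null_mask
   Memcheck:Param
   sched_setaffinity(mask)
   fun:syscall
   fun:__kmp_affinity_determine_capable
}
```

Saved as iomp5.supp, it would be used as: valgrind --suppressions=iomp5.supp ./iomp_sched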

paul_f
Novice

I don't think that this has been resolved.

ShivaniK_Intel
Moderator

Hi,


Could you please try the Intel compilers with the supported Intel oneAPI versions and let us know if you face a similar issue?


For more details regarding the supported version please refer to the below link


https://www.intel.com/content/www/us/en/developer/articles/release-notes/intel-parallel-studio-xe-supported-and-unsupported-product-versions.html


We no longer support the Parallel Studio versions you are currently using. If you need further help, you could become a licensed oneAPI customer and get assistance through priority support.


Please refer to the below link for priority support 


https://www.intel.com/content/www/us/en/developer/get-help/priority-support.html


Thanks & Regards

Shivani



paul_f
Novice

I don't get any errors with the 2023.1.0 base kit (on an old Xeon, Fedora 37).

ShivaniK_Intel
Moderator

Hi,


As your issue is resolved with the latest version of Intel oneAPI, we are going ahead and closing this thread. This thread will no longer be monitored by Intel. If you need further assistance please post a new question.


Thanks & Regards

Shivani

