Hi all, I encountered some very odd behavior. I ran the following code:
t0 = get_time();
clEnqueueWriteBuffer(queue, mem, CL_FALSE, 0, 1.8*GB, host, 0, NULL, NULL);
printf("%lf secs", get_time() - t0);
The test system has four Intel Xeon Phi 5110P coprocessors (with Intel OpenCL runtime 14.2 and MPSS 3.4.2).
When I ran the code using MPI, with 4 MPI tasks, each task reported about 0.0000x secs.
But when I ran the code using threads, with 4 OpenMP threads, each reported about 5 secs, even though the call only enqueues a non-blocking command.
Do you have any idea what could cause this?
Thanks.
Jungwon
1 Reply
Hi Jungwon,
Could you please provide the following information?
1. A small reproducer program.
2. Details of the OpenMP/MPI setup.
3. How severe is this issue for you? Is it critical or minor?
Thanks,
Yuri
