Hi,
I am rewriting part of the LibSVM code so that it vectorizes, in order to accelerate it on Xeon Phi. Unfortunately, the performance of my rewritten code is the same as, if not slightly worse than, the original on Xeon Phi. To find the reason, I profiled both programs using VTune hotspot analysis. According to the results, the target code segment takes significantly less time in the rewritten version. However, most of the rewritten program's time goes to __kmp_wait_yield_4, which accounts for only a tiny amount of time in the original version.
Does anyone know what it means? I tried to Google it but got very little information.
BTW, my vectorized code is about 20% faster overall when run on the host.
Thanks in advance!
>>why do you think y[] is better to be double?
If you use G
if alpha_status is made double too, and UPPER_BOUND==-1.0, LOWER_BOUND==1.0
then test becomes (y != alpha_status)
(verify sign on ..._BOUND)
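The idea of folding the bound test into a single comparison can be sketched as follows. This is only an illustration under stated assumptions: the original test is assumed to have the form "if y is +1, require not at upper bound, otherwise require not at lower bound", and the constants `LOWER_BOUND`, `UPPER_BOUND`, `FREE` and the helper names are hypothetical stand-ins rather than LibSVM's actual definitions. For that form of the test the encoding is UPPER_BOUND == +1.0 and LOWER_BOUND == -1.0; the mirrored test needs the opposite signs, which is why the signs have to be verified against the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical encoding of the status as doubles (not LibSVM's own enum).
constexpr double LOWER_BOUND = -1.0;
constexpr double UPPER_BOUND = 1.0;
constexpr double FREE = 0.0;

// Assumed form of the original test, with a branch on the label y (+1 or -1).
inline bool eligible_branchy(double y, double status) {
    if (y == 1.0) return status != UPPER_BOUND;
    return status != LOWER_BOUND;
}

// Same test as one comparison; valid only for the encoding chosen above.
inline bool eligible_compare(double y, double status) {
    return y != status;
}

// Branch-free loop over same-typed arrays, which is friendlier to a vectorizer.
inline std::size_t count_eligible(const std::vector<double>& y,
                                  const std::vector<double>& status) {
    assert(y.size() == status.size());
    std::size_t n = 0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        n += (y[i] != status[i]) ? 1 : 0;
    }
    return n;
}
```

Keeping `y` and the status in the same type means the comparison needs no per-element conversion inside the loop.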
Though C++ should promote at compile time, please get into the habit of using literals of the same type as the variable. Porting to other languages might not be so kind. For example, Fortran takes a literal in the type expressed in the source code, and by convention the conversion is performed at run time. Therefore:
someDouble = 0.1
is different from
someDouble = 0.1D+0
Jim Dempsey
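For readers working in C++, the same pitfall can be shown with a small sketch. In C++ an unsuffixed `0.1` is already a double, so the closest analogue of the Fortran single-precision `0.1` is the `f`-suffixed `0.1f`. The function names below are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Double initialised from a single-precision literal: the value is rounded
// to float first and then widened, so the float rounding error is kept.
inline double from_float_literal() { return 0.1f; }

// Double initialised from a double-precision literal.
inline double from_double_literal() { return 0.1; }

// Hypothetical check: the two initialisations do not give the same double.
inline void check_literals_differ() {
    assert(from_float_literal() != from_double_literal());
}
```

Matching the literal's type to the variable's type avoids both the precision loss and any hidden conversion.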
Thanks a lot for your help, everyone! I finally figured out that the soaring __kmp_wait_yield_4 time in VTune was caused by my accidentally nested OMP parallelization. FYI.
I really appreciate your help though, especially Jim's continuous advice on the vectorization issues, which is extremely helpful to me.
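For anyone hitting the same symptom, here is a minimal sketch of how accidental nesting can arise and one way to guard against it. It is not the code from this thread: `dot` and `all_dots` are hypothetical functions, and the sketch assumes the inner routine is also called from serial code, so it keeps its own parallel loop. The `if(!omp_in_parallel())` clause makes the inner region run serially whenever a caller is already inside a parallel region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical inner kernel that may be called from serial or parallel code.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    // Without the if clause, calling this from a parallel region would nest
    // a second parallel region inside the first.
    #pragma omp parallel for reduction(+:sum) if(!omp_in_parallel())
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Hypothetical outer loop that is itself parallel and calls the kernel.
std::vector<double> all_dots(const std::vector<std::vector<double>>& rows,
                             const std::vector<double>& x) {
    std::vector<double> out(rows.size());
    const long m = static_cast<long>(rows.size());
    #pragma omp parallel for
    for (long r = 0; r < m; ++r) {
        out[r] = dot(rows[r], x);  // inner region is serialized by the if clause
    }
    return out;
}
```

The alternative is to drop the inner pragma entirely if the kernel is only ever called from parallel code. The sketch also compiles without OpenMP, in which case the pragmas are ignored.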
