Thread Profiler hangs/freezes

bob_brandt
Beginner
513 Views
I'm able to instrument and capture an application run, but Thread Profiler hangs on the popup "Select a time range of data to view."

This happens repeatably/reliably in this case I need to profile. However, I have successfully used the tool in prior cases.

Any suggestions?
7 Replies
bob_brandt
Beginner
I noticed this error in the event viewer, too:

Event Type: Error
Event Source: Application Hang
Event Category: (101)
Event ID: 1002
Date: 5/14/2009
Time: 9:46:15 AM
User: N/A
Computer: DEV1269
Description:
Hanging application vtuneenv.exe, version 9.0.10.719, hang module hungapp, version 0.0.0.0, hang address 0x00000000.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 48 61 6e 67 ion Hang
0010: 20 20 76 74 75 6e 65 65 vtunee
0018: 6e 76 2e 65 78 65 20 39 nv.exe 9
0020: 2e 30 2e 31 30 2e 37 31 .0.10.71
0028: 39 20 69 6e 20 68 75 6e 9 in hun
0030: 67 61 70 70 20 30 2e 30 gapp 0.0
0038: 2e 30 2e 30 20 61 74 20 .0.0 at
0040: 6f 66 66 73 65 74 20 30 offset 0
0048: 30 30 30 30 30 30 30 0000000

Peter_W_Intel
Employee
Quoting - bob_brandt
I'm able to instrument and capture an application run, but Thread Profiler hangs on the popup "Select a time range of data to view."

This happens repeatably/reliably in this case I need to profile. However, I have successfully used the tool in prior cases.

Any suggestions?

I have seen a similar problem when running Thread Profiler on a large application. I eventually worked around it by changing a setting in the "Configure Intel Thread Profiler Instrumentation" dialog -> "Messages / Call Sites" tab -> "Accuracy/Performance Trade-offs:" -> "Record using dbghelp (Windows*)/libunwind(Linux*)": I set it to a maximum depth of "1".

The original value is "5"; I lowered it simply to reduce Thread Profiler's workload.

Another option is to increase the value of "Record complete information of one event for every [ ] event below threshold" on the "Thresholds" tab of the same dialog.

Regards, Peter
bob_brandt
Beginner

Quoting - Peter_W_Intel
I have seen a similar problem when running Thread Profiler on a large application. I eventually worked around it by changing a setting in the "Configure Intel Thread Profiler Instrumentation" dialog -> "Messages / Call Sites" tab -> "Accuracy/Performance Trade-offs:" -> "Record using dbghelp (Windows*)/libunwind(Linux*)": I set it to a maximum depth of "1".

The original value is "5"; I lowered it simply to reduce Thread Profiler's workload.

Another option is to increase the value of "Record complete information of one event for every [ ] event below threshold" on the "Thresholds" tab of the same dialog.

Regards, Peter


Thanks for the advice!

I tried both those suggestions and I was able to avoid the hang, but unfortunately I'm not able to see the source code for the mutex overheads that I need to address. Heisenberg's uncertainty principle applies, I guess.

In this case, though, I was able to infer which mutex was my bottleneck outside of the Profiler tool, so I'll wait to worry about this on my next bottleneck.

Peter_W_Intel
Employee
Quoting - bob_brandt


Thanks for the advice!

I tried both those suggestions and I was able to avoid the hang, but unfortunately I'm not able to see the source code for the mutex overheads that I need to address. Heisenberg's uncertainty principle applies, I guess.

In this case, though, I was able to infer which mutex was my bottleneck outside of the Profiler tool, so I'll wait to worry about this on my next bottleneck.


It's good news that Thread Profiler no longer hangs! You have now hit a second problem: you cannot drill down to the source code for the mutex object's overheads. I reviewed the Thresholds behavior: the default setting is "Do not record source information" if the event duration (number) is less than the threshold. That is still better than the other option, "Do not record event at all".
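
To make the threshold options concrete, here is a rough Python model of how they might interact. It is only a sketch under stated assumptions: the function `classify_events`, its parameters, and the exact 1-in-N rule are hypothetical and are not taken from Thread Profiler's implementation. It only mirrors the option names mentioned in this thread.

```python
def classify_events(durations_us, threshold_us, sample_every=None, drop_below=False):
    """Decide how much detail to keep for each event (hypothetical model).

    durations_us -- event durations in microseconds
    threshold_us -- events shorter than this count as "below threshold"
    sample_every -- if set to N, every Nth below-threshold event keeps full detail
                    (models "Record complete information of one event for every
                    [ ] event below threshold")
    drop_below   -- True models "Do not record event at all";
                    False models "Do not record source information"

    Returns one of "full", "no_source", or "dropped" per event.
    """
    decisions = []
    below_seen = 0
    for duration in durations_us:
        if duration >= threshold_us:
            decisions.append("full")          # at or above threshold: keep everything
            continue
        below_seen += 1
        if sample_every and below_seen % sample_every == 0:
            decisions.append("full")          # the 1-in-N sample keeps full detail
        elif drop_below:
            decisions.append("dropped")       # event is not recorded at all
        else:
            decisions.append("no_source")     # event kept, but without source info
    return decisions


events = [1, 50, 2, 2, 40, 1]
print(classify_events(events, threshold_us=30))
# ['no_source', 'full', 'no_source', 'no_source', 'full', 'no_source']
```

In this model, short but frequent lock events fall below the threshold and lose their source information, which would match the symptom of not being able to drill down to the mutex in the source view.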

You said that both of my suggestions work. Could you create a new Thread Profiler activity that uses only the "Record using dbghelp to a maximum depth of: 1" change, to see whether that alone avoids the problem?

Regards, Peter
bob_brandt
Beginner

Quoting - Peter_W_Intel
It's good news that Thread Profiler no longer hangs! You have now hit a second problem: you cannot drill down to the source code for the mutex object's overheads. I reviewed the Thresholds behavior: the default setting is "Do not record source information" if the event duration (number) is less than the threshold. That is still better than the other option, "Do not record event at all".

You said that both of my suggestions work. Could you create a new Thread Profiler activity that uses only the "Record using dbghelp to a maximum depth of: 1" change, to see whether that alone avoids the problem?

Regards, Peter

Unfortunately, no.

I tried a bunch of combinations (your two, plus different threshold values), and each outcome either had the hang or if it didn't hang it didn't show me the information that I needed.

I think this problem may be magnified in my case because our code base wraps up the mutex calls in a layer or two of calls, so if I really want to know what mutex is hurting my performance I need more than a few stack frames to see it.
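
The interaction between wrapper layers and a limited stack depth can be illustrated with a small sketch. Python is used here only for brevity; all function names are hypothetical, and counting frames from the immediate caller of the lock primitive is an assumption about how the depth limit works, not documented Thread Profiler behavior.

```python
import traceback


def instrumented_mutex_acquire(max_depth):
    """Stand-in for an instrumented mutex API that records its callers.

    Returns the names of up to max_depth calling frames, innermost first.
    """
    # extract_stack() lists frames oldest first; drop this function's own frame.
    callers = traceback.extract_stack()[:-1]
    return [frame.name for frame in reversed(callers)][:max_depth]


def lock_wrapper(max_depth):
    # First wrapper layer around the mutex API.
    return instrumented_mutex_acquire(max_depth)


def scoped_lock(max_depth):
    # Second wrapper layer.
    return lock_wrapper(max_depth)


def update_cache(max_depth):
    # Application code: this is the call site that identifies the mutex.
    return scoped_lock(max_depth)


print(update_cache(1))  # ['lock_wrapper']
print(update_cache(2))  # ['lock_wrapper', 'scoped_lock']
print(update_cache(3))  # ['lock_wrapper', 'scoped_lock', 'update_cache']
```

With two wrapper layers, the application call site only appears once the recorded depth reaches three, so a depth of 1 or 2 would show nothing but the wrappers.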

Peter_W_Intel
Employee
Quoting - bob_brandt

Unfortunately, no.

I tried a bunch of combinations (your two, plus different threshold values), and each outcome either had the hang or if it didn't hang it didn't show me the information that I needed.

I think this problem may be magnified in my case because our code base wraps up the mutex calls in a layer or two of calls, so if I really want to know what mutex is hurting my performance I need more than a few stack frames to see it.


I have two suggestions:
1) I see that you are using vtuneenv.exe, version 9.0.10.719; that is version 9. Please try downloading the latest VTune Analyzer, v9.1 Update 12 build #210, which includes the latest Intel® Thread Profiler. See https://registrationcenter.intel.com/RegCenter/Download.aspx?productid=1104

2) Don't use thresholds, since with them you can't get the source view. Adjust the dbghelp depth to "2" or "3", since the standard mutex API is wrapped by your code.

By the way, would it be possible for you to attach the test case so that we can investigate?

Thanks, Peter
bob_brandt
Beginner

Quoting - Peter_W_Intel
I have two suggestions:
1) I see that you are using vtuneenv.exe, version 9.0.10.719; that is version 9. Please try downloading the latest VTune Analyzer, v9.1 Update 12 build #210, which includes the latest Intel® Thread Profiler. See https://registrationcenter.intel.com/RegCenter/Download.aspx?productid=1104

2) Don't use thresholds, since with them you can't get the source view. Adjust the dbghelp depth to "2" or "3", since the standard mutex API is wrapped by your code.

By the way, would it be possible for you to attach the test case so that we can investigate?

Thanks, Peter

Hi Peter,

Thanks for all your suggestions.

1) I'm doing a 30-day eval of the Profiler and didn't see any updates from that link you posted. Maybe that's only for licensed users?

2) I'm not sure how to avoid thresholds. I just completed a run where I had these settings:
Threshold Behavior : Do not record event at all
Threshold Values :
- Wait Events: 0 microseconds
- Light-weight locks: 3 microseconds
- Blocking API events: 30 microseconds
Automatically adjust threshold values to limit analysis overhead: CHECKED
Record complete information of one event for every X events below threshold: NOT checked

Messages:
Attribute impact time to messages or windows (when available): NOT checked

Call Site Information:
Coverage: Also collect call sites for thread waits

Accuracy/Performance Trade-offs: Record using dbghelp depth=5

The good news is the GUI didn't hang. The bad news is it didn't show very much. Certainly not enough to determine where my bottleneck is. Very few thread transitions were shown. I checked one and I could see the source and stack, but it didn't show the full stack, just the first 2 frames.

Unfortunately this code and application is for work and is proprietary. I can say I'm testing on Windows XP Pro with a four core processor and trying to use four worker threads. I'm seeing a big back-off: in this case a single-thread run takes ~12 seconds and four threads take ~30 seconds.

In a prior case I found big contention on a mutex, and the cost of contention was this expensive. I'd like a tool to show me where my contention is now, but Thread Profiler hasn't been able to do it for me in this case (yet).
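
When the profiler can't point at the contended mutex, one fallback is to time lock acquisition inside the wrapper layer itself. The sketch below is a minimal Python illustration of that idea, not the approach actually used in this thread; `TimedLock`, `worker`, and `run_demo` are hypothetical names, and the sleep merely simulates work done while holding a lock.

```python
import threading
import time
from collections import defaultdict


class TimedLock:
    """A threading.Lock wrapper that accumulates time spent waiting to acquire."""

    # Shared by all instances: lock name -> [acquisitions, total wait in seconds].
    stats = defaultdict(lambda: [0, 0.0])
    _stats_guard = threading.Lock()

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()                  # blocks while another thread holds it
        waited = time.perf_counter() - start
        with TimedLock._stats_guard:          # protect the shared statistics table
            entry = TimedLock.stats[self.name]
            entry[0] += 1
            entry[1] += waited
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False                          # never swallow exceptions


def worker(hot, cold, iterations):
    for _ in range(iterations):
        with hot:
            time.sleep(0.002)                 # simulated work under the hot lock
        with cold:
            pass                              # the cold lock is held only briefly


def run_demo(num_threads=4, iterations=20):
    """Run the workers and return (name, [count, wait]) pairs, worst lock first."""
    TimedLock.stats.clear()
    hot, cold = TimedLock("hot"), TimedLock("cold")
    threads = [threading.Thread(target=worker, args=(hot, cold, iterations))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(TimedLock.stats.items(), key=lambda kv: kv[1][1], reverse=True)


if __name__ == "__main__":
    for name, (count, waited) in run_demo():
        print(f"{name}: {count} acquisitions, {waited:.3f} s waiting")
```

Sorting by total wait time puts the most contended lock first. In a native code base the same idea would go into the existing mutex wrapper, at the cost of some extra overhead on every acquisition.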
