VTune Amplifier XE causes system crash/restart

cachecoherent
Beginner
We have an SGI UV1000 with 8-core Xeon E7s and 64GB of RAM per processor. The operating system is Redhat Linux 6.0 SE.

We created a simple project to generate a Nehalem general hardware profiling report on the /bin/ls executable (a pretty basic test).

When the project is run, our system freezes and reboots. This is not a hung thread or a single unresponsive process; the entire system freezes. After the reboot, we checked the results of the VTune run and found none.

Is there some kernel configuration that we must modify in order to let the architecture-specific hardware profiling work?
Mark_D_Intel
Employee
Some background questions:

- What version of Amplifier XE are you using?
- How many total cores (including HT if on)?
- Are there any error messages in the /var/log/messages log starting with 'SEP3_1' or 'amplxe-runsa'?
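
A minimal sketch of the log check in the last question, in C. It assumes syslog puts a timestamp and host name in front of each message, so it looks for the two tags anywhere in the line rather than at the very start. The function names are hypothetical and not part of any Intel tool.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a log line mentions either tag named above, else 0. */
int line_mentions_sampling_driver(const char *line)
{
    return strstr(line, "SEP3_1") != NULL ||
           strstr(line, "amplxe-runsa") != NULL;
}

/* Prints every matching line from an open log stream and returns the
   number of matches. Lines longer than the buffer are examined in chunks. */
long print_sampling_driver_lines(FILE *log)
{
    char line[4096];
    long matches = 0;

    while (fgets(line, sizeof line, log) != NULL) {
        if (line_mentions_sampling_driver(line)) {
            fputs(line, stdout);
            matches++;
        }
    }
    return matches;
}
```

To use it, open /var/log/messages with fopen (this usually needs root), pass the stream to print_sampling_driver_lines, and close it afterwards.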

(An FYI for the likely cause of the problem: The hardware sampling driver allocates memory and the size of the allocation is determined by the number of cores. It is these memory allocations that can fail on machines with extremely large core counts and cause the OS to freeze.)
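
To illustrate how a request that scales with core count grows, here is a minimal sketch in C. It is not the sampling driver's code: the function name and the per-core size are hypothetical, and the only thing taken from the explanation above is that the total depends on the number of cores.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes cores * bytes_per_core into *total.
   Returns 0 on success, or -1 if the product would overflow size_t
   (in which case *total is left untouched). */
int total_request_bytes(size_t cores, size_t bytes_per_core, size_t *total)
{
    if (bytes_per_core != 0 && cores > SIZE_MAX / bytes_per_core)
        return -1;
    *total = cores * bytes_per_core;
    return 0;
}
```

For example, with 256 cores and an assumed 1 MiB per core, total_request_bytes(256, 1u << 20, &total) sets total to 268435456 bytes (256 MiB). Doubling the core count doubles the request.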

Mark
cachecoherent
Beginner
Mark, thanks for the response:

- What version of Amplifier XE are you using?
We are using VTune Amplifier XE for Linux, version 2011 (Update 7) on active support.

- How many total cores (including HT if on)?
We have 32 physical CPUs, 8 cores per CPU, and hyperthreading is disabled.

- Are there any error messages in the /var/log/messages log starting with 'SEP3_1' or 'amplxe-runsa'?
We are checking on this now.

cachecoherent
Beginner
Mark,

There are no error messages with those terms in /var/log/messages.

However, the system console shows that just before the system froze, VTune tried to allocate more memory than was available.

Is there any way to (interactively or by configuration) restrict the amount of memory VTune allocates for each core?
SergeyKostrov
Valued Contributor II
>>...Operating system is Redhat Linux 6.0 SE

Is it a 32-bit or 64-bit edition? Is it a Server Edition?

>>...However, the system console shows that just before the system freezeup, VTune tried to allocate more
>>memory than was available...

How much memory was allocated before VTune crashed?
Did you try to increase the virtual file size?

I can only assume that some incorrect processing is happening in the VTune C/C++ code, such as:

...
*p = ( * )malloc( ... ); // or new(...), or calloc(...), etc.

// malloc returns NULL because it failed to allocate some amount of memory,
// and processing continues because there is no check that p is NULL

// and of course VTune crashes...
...
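
For contrast, a minimal user-space sketch of the checked pattern in C. This is not VTune's code; the function names are hypothetical, and it assumes one buffer per core with bytes_per_core greater than zero.

```c
#include <stddef.h>
#include <stdlib.h>

/* Frees the first `count` buffers in `bufs` and clears the pointers. */
void free_per_core_buffers(void **bufs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(bufs[i]);
        bufs[i] = NULL;
    }
}

/* Allocates one buffer per core (bytes_per_core must be > 0).
   Returns 0 on success. On any failure, releases the buffers that were
   already allocated and returns -1, so the caller never sees a NULL
   entry it might dereference. */
int alloc_per_core_buffers(void **bufs, size_t cores, size_t bytes_per_core)
{
    for (size_t i = 0; i < cores; i++) {
        bufs[i] = malloc(bytes_per_core);
        if (bufs[i] == NULL) {      /* the check missing in the pseudocode */
            free_per_core_buffers(bufs, i);
            return -1;
        }
    }
    return 0;
}
```

The caller supplies an array with room for `cores` pointers and treats a return value of -1 as "stop and report the error" instead of continuing with a NULL pointer.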
cachecoherent
Beginner
64-bit. Redhat Linux SE stands for Security Enabled. The system is running in permissive mode, non-virtual.

I will check on memory allocation size before the crash.

I think you misunderstand the issue. The entire system is crashing, not just VTune.

I was hoping a simple configuration change to the memory allocation could be applied as a temporary fix.
Rob5
New Contributor II
This issue is also being worked via case 657396.

- Rob

SergeyKostrov
Valued Contributor II
>>64-bit. Redhat Linux SE stands for Security Enabled. The system is running in permissive mode, non-virtual.
>>
>>I will check on memory allocation size before the crash.

[SergeyK] Any details?

>>I think you misunderstand the issue. The entire system is crashing, not just VTune.

[SergeyK] I understood the problem completely and I explained why it happens. Another possible
reason is that VTune corrupts an operating system stack after its memory request fails.
After that, the OS crashes.

>>I was hoping a simple configuration change to the memory allocation could be applied as a temporary fix.

[SergeyK] If you install more RAM, that could help.


Best regards,
Sergey

cachecoherent
Beginner
Memory allocation size before the crash was shown as something small, around 2GB, much smaller than system RAM. That is the last message in the log, however, and probably not the last operation that occurred.

We have run the latest patch of VTUNE (Dec 20 2011) and the crash still occurs.

Now when we experience a crash, the system does not log a memory allocation. Instead it goes straight into kernel panic.

The system has 2TB RAM in total (64GB per processor, 32 processors). We will not be installing any more RAM. We would expect that 2 Terabytes of RAM is sufficient for the execution of VTUNE Amplifier on /bin/ls.

As per Intel's recommendation, we ran the same Nehalem General Exploration test on the Tachyon sample application that ships with VTUNE Amplifier XE 2011 and found the same result.

I have attached the uvcon log from our machine showing the contents of the kernel panic. Hope that helps.

LM11
Beginner
Attached is the result from the Amplifier Feedback reporting tool:

amplxe-feedback.exe --create-bug-report=report.txt

Aitcomputing_G_
Beginner
Hi Mark,

Is the following statement still valid for VTune Amplifier XE 2018u2?

We are having this issue on VTune Amplifier XE 2016, and Intel support's suggestion was to try 2018u2.

We are seeing "SEP3_1" in the /var/log/messages log.

mark-dewing (Intel) wrote:

Some background questions:

- What version of Amplifier XE are you using?
- How many total cores (including HT if on)?
- Are there any error messages in the /var/log/messages log starting with 'SEP3_1' or 'amplxe-runsa'?

(An FYI for the likely cause of the problem: The hardware sampling driver allocates memory and the size of the allocation is determined by the number of cores. It is these memory allocations that can fail on machines with extremely large core counts and cause the OS to freeze.)

Mark

Julia

 
