While evaluating vtune_amplifier_xe_2015.1.0.367959 on Linux, I hit a kernel oops in the VTune kernel modules. I was trying to run the microarchitecture -> general exploration -> bandwidth test. The system is a default CentOS 7 x86 install, updated with all patches. The code was running on an SNB (Sandy Bridge) machine, with VTune installed via CLI_install as per the manual.
(CLI_install has other issues: the RHEL/CentOS kernel sources are not in /usr/src/linux, and the installer does not pick that up automatically.)
(The manual notes that the power sampler should be installed, but I read that it was removed earlier. Update the docs?)
Any ideas besides "it's open source, please submit a patch"? :)
code under test
compiled as user_loop (gcc 4.8.2 -g)
int main(void)
{
    /* volatile keeps the compiler from optimizing the loop away */
    volatile unsigned long i = 0;
    while (i < 1000000000)
    {
        ++i;
    }
    return 0;
}
crash summary
KERNEL: /usr/lib/debug/lib/modules/3.10.0-123.el7.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2015.01.04-12:44:17/vmcore [PARTIAL DUMP]
CPUS: 16
DATE: Sun Jan 4 12:43:16 2015
UPTIME: 01:49:45
LOAD AVERAGE: 0.10, 0.07, 0.06
TASKS: 367
RELEASE: 3.10.0-123.el7.x86_64
VERSION: #1 SMP Mon Jun 30 12:09:22 UTC 2014
MEMORY: 32 GB
PANIC: "Oops: 0002 [#1] SMP " (check log for details)
PID: 29144
COMMAND: "user_loop"
TASK: ffff8805fc9571c0 [THREAD_INFO: ffff8805fca2e000]
CPU: 8
STATE: TASK_RUNNING (PANIC)
log:
[ 4291.860357] PAX: PMU arbitration service v1.0.1 has been started.
[ 4292.902500] sep3_15: PMU collection driver v3.15.5 (EMON) has been loaded.
[ 4292.934677] sep3_15: Chipset support is enabled.
[ 4292.956584] sep3_15: IDT vector 0x21 will be used for handling PMU interrupts.
[ 4295.038257] vtss++ kernel module ("v1.4.4-367959 Intel(R) VTune(TM) Amplifier XE 2013") registered
[ 6584.773197] BUG: unable to handle kernel paging request at ffffc900183f2000
[ 6584.805419] IP: [<ffffffffa05adeab>] UNC_COMMON_PCI_Read_Counts+0x6b/0x1b0 [sep3_15]
[ 6584.841380] PGD 42f405067 PUD 83f403067 PMD 2aa331067 PTE 0
[ 6584.867465] Oops: 0002 [#1] SMP
bt
PID: 29144 TASK: ffff8805fc9571c0 CPU: 8 COMMAND: "user_loop"
#0 [ffff8805fca2fa90] machine_kexec at ffffffff81041181
#1 [ffff8805fca2fae8] crash_kexec at ffffffff810cf0e2
#2 [ffff8805fca2fbb8] oops_end at ffffffff815ea548
#3 [ffff8805fca2fbe0] no_context at ffffffff815daf63
#4 [ffff8805fca2fc30] __bad_area_nosemaphore at ffffffff815daff9
#5 [ffff8805fca2fc78] bad_area_nosemaphore at ffffffff815db163
#6 [ffff8805fca2fc88] __do_page_fault at ffffffff815ed36e
#7 [ffff8805fca2fd88] do_page_fault at ffffffff815ed58a
#8 [ffff8805fca2fdb0] page_fault at ffffffff815e97c8
[exception RIP: UNC_COMMON_PCI_Read_Counts+107]
RIP: ffffffffa05adeab RSP: ffff8805fca2fe60 RFLAGS: 00010002
RAX: 0000000000000058 RBX: 0000000000000001 RCX: 0000000000000080
RDX: 0000000000000001 RSI: ffffc900183f1f80 RDI: 0000000000000001
RBP: ffff8805fca2fea8 R8: 0000000000000003 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003f
R13: 0000000000000040 R14: 0000000000000058 R15: ffffc900183f1f80
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#9 [ffff8805fca2feb0] PMI_Interrupt_Handler at ffffffffa05a3b14 [sep3_15]
#10 [ffff8805fca2ff50] SYS_Perfvec_Handler at ffffffffa05b0f85 [sep3_15]
RIP: 000000000040050a RSP: 00007fff61c5d0b0 RFLAGS: 00000206
RAX: 0000000015d95a9f RBX: 0000000000000000 RCX: 0000000000400520
RDX: 00007fff61c5d1a8 RSI: 00007fff61c5d198 RDI: 0000000000000001
RBP: 00007fff61c5d0b0 R8: 00007f15a1e68e80 R9: 0000000000000000
R10: 00007fff61c5cf40 R11: 00007f15a1acea00 R12: 0000000000400400
R13: 00007fff61c5d190 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: 0000000015d95a9f CS: 0033 SS: 002b
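A note for anyone reading the Oops line above: on x86 the number after "Oops:" is the page-fault error code in hexadecimal. The sketch below decodes its three low bits. The function name decode_oops_code is made up for illustration, and the bit meanings are taken from the x86 page-fault error code layout, not from anything VTune-specific.

```shell
#!/bin/sh
# Decode the hexadecimal error code printed on an x86 "Oops: XXXX" line.
# bit 0: 0 = page not present, 1 = protection violation
# bit 1: 0 = read access,      1 = write access
# bit 2: 0 = kernel mode,      1 = user mode
decode_oops_code() {
    code=$((0x$1))   # prefix with 0x so "0002" is not parsed as octal
    if [ $((code & 1)) -ne 0 ]; then
        printf 'protection violation, '
    else
        printf 'page not present, '
    fi
    if [ $((code & 2)) -ne 0 ]; then
        printf 'write access, '
    else
        printf 'read access, '
    fi
    if [ $((code & 4)) -ne 0 ]; then
        printf 'user mode\n'
    else
        printf 'kernel mode\n'
    fi
}

decode_oops_code 0002   # prints: page not present, write access, kernel mode
```

For this crash the result is a kernel-mode write to a page that is not mapped, which agrees with the "PTE 0" entry in the log and with the faulting instruction being inside the sep3_15 module.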
I meant x86_64 obviously
(I would like to hear from others whether they see similar issues on CentOS 7 :-) )
I saw some clues in your output:
> [ 6584.773197] BUG: unable to handle kernel paging request at ffffc900183f2000
>DUMPFILE: /var/crash/127.0.0.1-2015.01.04-12:44:17/vmcore [PARTIAL DUMP]
1. Are you running Linux* inside a virtual machine? If so, please run VTune(TM) Amplifier XE on native Linux*. Linux* in a VM is supported by VTune only under VMware Fusion* 5. If VTune is installed on Linux in any other VM, you can only use the user-mode sampling collectors.
2. Is it possible that your system is configured with huge memory pages? As far as I know, there was a limitation on installing hook functions from the device driver in that case.
3. Are you using a standard CentOS 7 with the standard patches?
If none of the above applies, my suggestion is to submit a ticket to Intel Premier with your data. More private data may be needed for the investigation.
*By the way, power profiling has been removed from regular VTune. It is available in the VTune that ships with Intel(R) System Studio XE.
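Questions 1 and 2 can be answered quickly from /proc. The following is a minimal sketch, not an official check: it assumes that a hypervisor exposes the "hypervisor" CPU flag in /proc/cpuinfo (most do, but a guest can hide it) and it only counts hugetlbfs pages, not transparent huge pages. The function names are invented for this example.

```shell
#!/bin/sh
# Succeed if a cpuinfo-style file lists the "hypervisor" CPU flag.
# $1: file to inspect (defaults to /proc/cpuinfo)
running_in_vm() {
    grep -qw hypervisor "${1:-/proc/cpuinfo}"
}

# Print the number of preallocated hugetlbfs pages from a meminfo-style file.
# $1: file to inspect (defaults to /proc/meminfo)
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Only touch /proc when it is actually readable on this system.
if [ -r /proc/cpuinfo ]; then
    if running_in_vm; then
        echo "hypervisor flag set: probably a virtual machine"
    else
        echo "no hypervisor flag: probably bare metal"
    fi
fi
if [ -r /proc/meminfo ]; then
    echo "HugePages_Total: $(hugepages_total)"
fi
```

A HugePages_Total of 0 means no hugetlbfs pages were reserved at boot or at run time.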
Sorry, it seems that huge pages configured through hugetlbfs GRUB parameters are supported by VTune. I tried:
default_hugepagesz=1g hugepagesz=1g hugepages=4 memmap=1G$4G
(This was not supported in older product versions.)
Please verify that all VTune drivers are loaded: lsmod | grep -E 'sep|pax|vtsspp', then please help collect the following info:
1) export AMPLXE_DEBUG=1
2) export AMPLXE_LOG_LEVEL=TRACE
3) export AMPLXE_LOG_DIR=<dir>
4) in a different root console
> while true; do dmesg -c >>out_dmesg; done
5) from the first console (the one with the environment variables set), start the collection and reproduce the problem.
Then provide
1) the result directory
2) log files
3) out_dmesg
I will report this to the developers with your data (if you have no time to create a new ticket at Intel Premier).
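The driver check and steps 1) to 3) above can be wrapped in two small helpers. This is only a sketch under a few assumptions: the helper names are made up, the module names are matched by prefix because the sampling driver is versioned (sep3_15 in the log above), and the collection command itself is whatever you normally run, passed in as arguments.

```shell
#!/bin/sh
# Read lsmod-style output on stdin and print each expected VTune module
# prefix (sep, pax, vtsspp) that has no loaded module starting with it.
missing_vtune_modules() {
    loaded=$(awk '{ print $1 }')
    for prefix in sep pax vtsspp; do
        echo "$loaded" | grep -q "^$prefix" || echo "$prefix"
    done
}

# Run a command with the three AMPLXE_* debug variables from the steps above.
# $1: log directory (created if missing); remaining arguments: the command.
run_with_amplxe_logging() {
    log_dir=$1
    shift
    mkdir -p "$log_dir" || return 1
    env AMPLXE_DEBUG=1 AMPLXE_LOG_LEVEL=TRACE AMPLXE_LOG_DIR="$log_dir" "$@"
}

# Report missing drivers on this machine, if lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | missing_vtune_modules
fi
```

No output from the lsmod check means all three drivers are loaded. The dmesg loop from step 4) still has to run separately in a root console while the collection is in progress.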
Dear Dennis,
I had similar problems with vtune recently. The cause turned out to be that it was using an older version of the SEP3 driver with a newer version of VTune. Complete uninstall then reinstall fixed it. This might also be something you want to check.
Cheers,
James
Also note that build "367959" is the initial release of the 2015 version. Update 1, build 380310, was released in Oct/Nov. I recommend you update and use the latest release.
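One way to see which build the loaded drivers came from is to read it back from the kernel log. The sketch below assumes the number after the dash in the vtss++ registration line is the product build, as it appears to be in the log posted above ("v1.4.4-367959"). Whether a newer update prints its own build number in the same place is not confirmed here. Both function names are invented for illustration.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the build number from the last
# 'vtss++ kernel module ("v<version>-<build> ...' registration line.
loaded_vtss_build() {
    sed -n 's/.*vtss++ kernel module ("v[0-9.]*-\([0-9][0-9]*\) .*/\1/p' | tail -n 1
}

# Compare the build found on stdin with the expected build in $1.
# Returns 0 on a match, 1 on a mismatch, 2 if no registration line is found.
driver_matches_build() {
    loaded=$(loaded_vtss_build)
    if [ -z "$loaded" ]; then
        echo "no vtss++ registration line found" >&2
        return 2
    fi
    if [ "$loaded" = "$1" ]; then
        echo "loaded vtss++ driver is build $loaded"
    else
        echo "loaded vtss++ driver is build $loaded, expected $1"
        return 1
    fi
}

# Example with the line from the log in this thread.
printf '%s\n' '[ 4295.038257] vtss++ kernel module ("v1.4.4-367959 Intel(R) VTune(TM) Amplifier XE 2013") registered' |
    driver_matches_build 367959
```

On a live system the input would come from dmesg, for example: dmesg | driver_matches_build 380310. A mismatch would point towards the stale-driver situation James describes.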
>>>#3 [ffff8805fca2fbe0] no_context at ffffffff815daf63>>>
It seems that the third function call in the backtrace (quoted above) caused the kernel panic or called the kernel panic routines. I would start by checking the page fault rate around the time of the crash. Could a low-physical-memory scenario have indirectly caused the crash?
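To look at page fault activity on a running system, the cumulative counters in /proc/vmstat are one starting point. This is a rough sketch (the function name is made up): it prints the pgfault and pgmajfault counters, and two readings taken a known interval apart give a rate. It cannot reconstruct the rate at the moment of a past crash, so it is only useful while reproducing the problem.

```shell
#!/bin/sh
# Print the cumulative minor+major (pgfault) and major (pgmajfault)
# page fault counters from a vmstat-style file.
# $1: file to read (defaults to /proc/vmstat)
fault_counters() {
    awk '$1 == "pgfault" || $1 == "pgmajfault" { print $1 "=" $2 }' \
        "${1:-/proc/vmstat}"
}

# Take two readings one second apart when /proc/vmstat is available.
if [ -r /proc/vmstat ]; then
    fault_counters
    sleep 1
    fault_counters
fi
```

A steadily climbing pgmajfault value would suggest memory pressure. A flat one makes the low-memory theory less likely.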
james B. wrote:
Dear Dennis,
I had similar problems with vtune recently. The cause turned out to be that it was using an older version of the SEP3 driver with a newer version of VTune. Complete uninstall then reinstall fixed it. This might also be something you want to check.
Cheers,
James
Thanks, James.
To rebuild and reload the driver, go to /opt/intel/vtune_amplifier_xe_2015/sepdk/src/ and run:
a. ./rmmod-sep3
b. ./build-driver
c. ./insmod-sep3
d. ./boot-script --install
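The four steps above can be chained so that a failure in one stops the rest. This is a sketch only: the wrapper name reinstall_sep_driver is invented, it has to be run as root, and it assumes each script returns a non-zero exit status on failure and needs no extra arguments (build-driver may ask questions interactively).

```shell
#!/bin/sh
# Unload, rebuild, reload and register the SEP driver, stopping at the
# first step that fails.
# $1: sepdk source directory (defaults to the path given in this thread)
reinstall_sep_driver() {
    src_dir=${1:-/opt/intel/vtune_amplifier_xe_2015/sepdk/src}
    (
        # Subshell so the caller's working directory is left unchanged.
        cd "$src_dir" || exit 1
        ./rmmod-sep3 &&
        ./build-driver &&
        ./insmod-sep3 &&
        ./boot-script --install
    )
}

# Usage (as root):
#   reinstall_sep_driver
#   reinstall_sep_driver /path/to/sepdk/src
```

After it finishes, the lsmod check from earlier in the thread should show the sep, pax and vtsspp modules again.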
