Hi,
A user is trying to run a memory access analysis, and the compute node panics as soon as the job starts. The command is:
srun amplxe-cl -collect memory-access -knob analyze-mem-objects=true -knob analyze-openmp=true ./Elmfire-Dev
It is a hybrid MPI/OpenMP application, and the resource manager is Slurm. Kernel output from the node:
BUG: unable to handle kernel paging request at 000000000000100c
IP: [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
PGD 1017096067 PUD 1017095067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_cur_freq
CPU 1
Modules linked in: vtsspp(U) sep4_0(U) socperf2_0(U) pax(U) lmv(U) fld(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic crc32c_intel libcfs(U) cpufreq_ondemand freq_table pcc_cpufreq rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) configfs ib_uverbs(U) ib_umad(U) mlx5_ib(U) mlx5_core(U) mlx4_en(U) ipmi_devintf iTCO_wdt iTCO_vendor_support power_meter acpi_ipmi ipmi_si ipmi_msghandler serio_raw sg sb_edac edac_core i2c_i801 lpc_ich mfd_core hpilo hpwdt ioatdma igb dca i2c_algo_bit i2c_core ptp pps_core ib_ipoib(U) ib_cm(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) ib_netlink(U) ipv6 mlx4_core(U) mlx_compat(U) ext4 jbd2 mbcache sd_mod crc_t10dif hpsa(U) scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 4790, comm: amplxe-runss Not tainted 2.6.32-642.15.1.el6.x86_64 #1 HP ProLiant XL230a Gen9/ProLiant XL230a Gen9
RIP: 0010:[<ffffffffa0c898a6>] [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
RSP: 0018:ffff88101968b838 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88101602e440 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 00000000000000c0 RDI: ffff88101602e440
RBP: ffff88101968b858 R08: 0000000000000000 R09: 00000000000011d6
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000000000c0 R15: ffff88101968b8b8
FS: 00007ff33a385700(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000100c CR3: 0000001017093000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process amplxe-runss (pid: 4790, threadinfo ffff881019688000, task ffff88101d5a6040)
Stack:
ffff88101968b858 ffff88101602e458 0000000000000000 00000000000000c0
<d> ffff88101968b898 ffffffffa0c89a62 00000000000002f8 ffff88101968b8b8
<d> 0000000000000003 ffff88101968b908 00000000000012b6 ffff88101968bb39
Call Trace:
[<ffffffffa0c89a62>] OUTPUT_Module_Fill+0x52/0x90 [sep4_0]
[<ffffffffa0c88544>] linuxos_Load_Image_Notify_Routine+0x174/0x220 [sep4_0]
[<ffffffffa0c886fe>] linuxos_VMA_For_Process+0x10e/0x1a0 [sep4_0]
[<ffffffff810097cc>] ? __switch_to+0x1ac/0x340
[<ffffffffa0c887f4>] linuxos_Enum_Modules_For_Process+0x64/0xc0 [sep4_0]
[<ffffffffa0c888ba>] linuxos_Exit_Task_Notify+0x6a/0x70 [sep4_0]
[<ffffffff8154f385>] notifier_call_chain+0x55/0x80
[<ffffffff810acf2a>] __blocking_notifier_call_chain+0x5a/0x80
[<ffffffff810acf66>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff810b0e8a>] profile_task_exit+0x1a/0x20
[<ffffffff8108175b>] do_exit+0x2b/0x870
[<ffffffff81081ff8>] do_group_exit+0x58/0xd0
[<ffffffff81097e06>] get_signal_to_deliver+0x1f6/0x460
[<ffffffff8100a285>] do_signal+0x75/0x870
[<ffffffff810abc82>] ? hrtimer_cancel+0x22/0x30
[<ffffffff8154b2b3>] ? do_nanosleep+0x93/0xc0
[<ffffffff810abd54>] ? hrtimer_nanosleep+0xc4/0x180
[<ffffffff810bd99b>] ? sys_futex+0x7b/0x170
[<ffffffff8100ab10>] do_notify_resume+0x90/0xc0
[<ffffffff8100b3a1>] int_signal+0x12/0x17
Code: 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 0f 1f 44 00 00 48 8b 05 a8 16 01 00 48 89 fb 41 89 d5 <44> 8b 80 0c 10 00 00 45 85 c0 0f 85 3a 01 00 00 8b 43 1c 39 f0
RIP [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
RSP <ffff88101968b838>
CR2: 000000000000100c
BUG: unable to handle kernel
---[ end trace c34e52112c8c3565 ]---
2 Replies
Hello,
A problem with a similar stack was addressed at the end of last year. Could you please try the 2017 Update 2 release of VTune Amplifier XE (build #499904)? If the problem persists, please submit a Premier Support issue with details on the system hardware, OS, and any kernel patches.
Regards, Katya
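When putting together the details for a support issue, it can help to pull the key facts out of the oops text rather than pasting only the raw dump. The following is a minimal sketch, not an Intel tool: the function name `summarize_oops` and the regular expressions are assumptions based on the 2.6.32-style oops format shown in this thread, and other kernel versions may format these lines differently.

```python
import re

def summarize_oops(text):
    """Extract the faulting symbol, its module, and the kernel release
    from a kernel oops dump (hypothetical helper, 2.6.32-style format assumed)."""
    # Faulting instruction line: "IP: [<addr>] SYMBOL+0xOFF/0xLEN [module]"
    ip = re.search(
        r"IP: \[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?",
        text,
    )
    # Kernel release follows "Not tainted" on the "Pid:" line
    kernel = re.search(r"tainted\s+(\d+\.\d+\.\S+)", text)
    # Every stack frame that belongs to a loadable module ends in "[module]"
    frame_modules = re.findall(r"\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]", text)
    return {
        "symbol": ip.group(1) if ip else None,
        "module": ip.group(2) if ip else None,
        "kernel": kernel.group(1) if kernel else None,
        "frame_modules": sorted(set(frame_modules)),
    }

if __name__ == "__main__":
    import sys
    # Usage: python summarize_oops.py < oops.txt
    for key, value in summarize_oops(sys.stdin.read()).items():
        print(f"{key}: {value}")
```

Run against the dump in the first post, this reports the faulting symbol and shows that the module frames in the trace all belong to `sep4_0`, the VTune sampling driver.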
Hello Katya,
If I am running into similar issues on VTune Amplifier XE 2016 (not 2017), what are the steps to resolve this issue?
