
GPU hang detected in Intel Apollo Lake SoC after S3 resume

Arun24
Beginner

During one of our QA tests, we observed a GPU hang on an Intel Apollo Lake SoC-based product running Wind River Linux (kernel 4.8.28-rt16).

We had similar issues in the past, for which Intel provided support. The same issue has reappeared after months of QA testing.

I have attached the kernel, Xorg, and GPU logs for analysis.

We would appreciate technical support in determining the root cause of this intermittent issue and identifying a permanent fix.

Below is a snapshot of the call trace from the kernel log:

Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933638] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:996
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933640] in_atomic(): 1, irqs_disabled(): 0, pid: 16, name: migration/0
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933649] Preemption disabled at:[<ffffffff81100248>] cpu_stopper_thread+0xa8/0x130
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933650]
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933654] CPU: 0 PID: 16 Comm: migration/0 Tainted: P U O 4.8.28-rt16 #2
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933655] Hardware name: Insyde ApolloLake/B, BIOS 03.07EP 02/03/2021
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933659] 0000000000000000 ffff8801003d7d28 ffffffff8138a026 0000000000000000
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933661] ffff880100478c80 ffff8801003d7d48 ffffffff8109edcc ffff8800763c6720
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933663] ffff8800763d6000 ffff8801003d7d60 ffffffff819176c0 ffff8800763d6000
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933664] Call Trace:
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933672] [<ffffffff8138a026>] dump_stack+0x4f/0x69
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933676] [<ffffffff8109edcc>] ___might_sleep+0xdc/0x150
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933679] [<ffffffff819176c0>] rt_spin_lock+0x20/0x50
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933683] [<ffffffff8151d8bd>] i915_gem_find_active_request+0x1d/0x90
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933687] [<ffffffff815b351d>] capture+0x79d/0x1690
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933690] [<ffffffff810fff01>] ? cpu_stop_queue_work+0x41/0x90
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933692] [<ffffffff8110000b>] multi_cpu_stop+0xbb/0xe0
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933694] [<ffffffff810fff50>] ? cpu_stop_queue_work+0x90/0x90
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933696] [<ffffffff8110024e>] cpu_stopper_thread+0xae/0x130
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933698] [<ffffffff8109a8ce>] ? smpboot_thread_fn+0x2e/0x310
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933700] [<ffffffff8109aa98>] smpboot_thread_fn+0x1f8/0x310
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933702] [<ffffffff8109a8a0>] ? sort_range+0x30/0x30
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933705] [<ffffffff81097ac9>] kthread+0xe9/0x100
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933708] [<ffffffff810979e0>] ? kthread_worker_fn+0x1c0/0x1c0
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933710] [<ffffffff81917c94>] ret_from_fork+0x54/0x60
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933987] [drm] GPU HANG: ecode 9:0:0x86dffffd, in Xorg [3705], reason: hang on rcs0, action: reset
Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.939276] i915 0000:00:02.0: Resetting chip for hang on rcs0
Apr 28 07:32:15 XRX9C934EA8E229 kernel: [146657.652171] set_psbutton_light(3)
Apr 28 07:32:16 XRX9C934EA8E229 kernel: [146658.596986] store_suspend: skip pixcir suspend
Apr 28 07:32:16 XRX9C934EA8E229 kernel: [146658.729363] PM: Syncing filesystems ... done.
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146658.748026] Freezing user space processes ...
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752274] Freezing of tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752791] webkitBrowser D ffffffff8111f993 0 8339 1 0x200a0006
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752800] ffff88010047e400 ffff8801002cf1e8 ffff880048e78000 ffff88002d874b00
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752806] ffff88017fcd6cc0 ffff880035283a40 ffffffff81914240 00ff880100001800
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752811] ffffea000113c080 ffff88002d874b00 ffff88002d874b00 0000000000000000
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752816] Call Trace:
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752828] [<ffffffff81914240>] ? __schedule+0x170/0x430
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752832] [<ffffffff819159d3>] ? __rt_mutex_slowlock+0x83/0x1c0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752834] [<ffffffff8191454c>] schedule+0x4c/0xe0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752837] [<ffffffff819159e1>] __rt_mutex_slowlock+0x91/0x1c0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752841] [<ffffffff81915f55>] rt_mutex_slowlock+0x115/0x2d0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752844] [<ffffffff81916190>] rt_mutex_lock+0x80/0xa0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752846] [<ffffffff81917a7e>] _mutex_lock+0xe/0x10
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752851] [<ffffffff814d5aec>] i915_driver_postclose+0x2c/0x60
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752855] [<ffffffff814ae83d>] drm_release+0x27d/0x3b0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752860] [<ffffffff811b40c7>] __fput+0xb7/0x200
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752863] [<ffffffff811b427e>] ____fput+0xe/0x10
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752867] [<ffffffff81095bd8>] task_work_run+0x88/0xc0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752871] [<ffffffff8107a51a>] do_exit+0x2ba/0xb20
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752874] [<ffffffff813a7aa7>] ? debug_smp_processor_id+0x17/0x20
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752877] [<ffffffff81078d9b>] ? pin_current_cpu+0x7b/0x1d0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752880] [<ffffffff8107be25>] do_group_exit+0x45/0xd0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752883] [<ffffffff810877d7>] get_signal+0x297/0x6e0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752887] [<ffffffff81019d89>] do_signal+0x29/0x650
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752889] [<ffffffff81078f06>] ? unpin_current_cpu+0x16/0x70
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752893] [<ffffffff8109eb82>] ? migrate_enable+0x82/0x150
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752895] [<ffffffff81917877>] ? rt_spin_unlock+0x27/0x40
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752897] [<ffffffff81088618>] ? do_sigtimedwait+0x208/0x260
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752901] [<ffffffff810ffbaa>] ? compat_SyS_rt_sigtimedwait+0x9a/0xe0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752904] [<ffffffff81917a6e>] ? rt_read_unlock+0x1e/0x20
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752929] [<ffffffffa0084819>] ? pst_common_post_handler_child+0x1d9/0x360 [scdrv]
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752933] [<ffffffff810018be>] exit_to_usermode_loop+0xce/0xf0
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752936] [<ffffffff810021a2>] do_int80_syscall_32+0x162/0x170
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752939] [<ffffffff8191907a>] entry_INT80_compat+0x4a/0x70
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.753021]
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.753023] Restarting tasks ... done.
Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.898466] video LNXVIDEO:00: Restoring backlight state
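To help correlate occurrences across the full attached logs, the two events of interest in the snapshot above (the i915 "GPU HANG" report and the "Freezing of tasks failed" message) can be pulled out with a small script. This is only a minimal sketch: it assumes the syslog line format shown above, and the names `scan_kernel_log`, `HANG_RE`, `FREEZE_RE` and `SAMPLE` are hypothetical helpers made up for illustration, not part of any Intel or Wind River tool.

```python
import re

# Pattern for the i915 hang report as it appears in the snapshot above, e.g.
# "[drm] GPU HANG: ecode 9:0:0x86dffffd, in Xorg [3705], reason: hang on rcs0, action: reset"
HANG_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)

# Pattern for the suspend-path failure, e.g.
# "Freezing of tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):"
FREEZE_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+Freezing of tasks failed after "
    r"(?P<seconds>\d+\.\d+) seconds"
)


def scan_kernel_log(lines):
    """Return (hangs, freeze_failures), each a list of dicts of captured fields."""
    hangs, freeze_failures = [], []
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            hangs.append(match.groupdict())
            continue
        match = FREEZE_RE.search(line)
        if match:
            freeze_failures.append(match.groupdict())
    return hangs, freeze_failures


# Two lines copied from the snapshot above, used as sample input.
SAMPLE = [
    "Apr 28 07:31:20 XRX9C934EA8E229 kernel: [146602.933987] [drm] GPU HANG: "
    "ecode 9:0:0x86dffffd, in Xorg [3705], reason: hang on rcs0, action: reset",
    "Apr 28 07:32:36 XRX9C934EA8E229 kernel: [146678.752274] Freezing of tasks "
    "failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):",
]

if __name__ == "__main__":
    hangs, freeze_failures = scan_kernel_log(SAMPLE)
    for hang in hangs:
        print("GPU hang:", hang)
    for failure in freeze_failures:
        print("Freeze failure:", failure)
```

To run it against a complete log, pass an open file instead of `SAMPLE`, for example `scan_kernel_log(open(path))`, where `path` is wherever the kernel log is stored on the system under test.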

CarlosAM_INTEL
Moderator

Hello, @Arun24:

Thank you for contacting the Intel Embedded Community.

We sent an email to the address associated with this account with suggestions that may help you.

Best regards,

@CarlosAM_INTEL
