Hi,
I recently got an i5-11500, and Linux is crashing about once a day. The motherboard is an ASUS Prime H570-PLUS with BIOS version 0820, and the CPU microcode is at 0x3c. Details are below. Has anybody else seen anything like this? Any idea what the BERT errors mean?
During early boot, I often see dmesg lines like:
[ 4.455507] BERT: Error records from previous boot:
[ 4.460399] [Hardware Error]: event severity: fatal
[ 4.465280] [Hardware Error]: Error 0, type: fatal
[ 4.470160] [Hardware Error]: section_type: Firmware Error Record Reference
[ 4.477296] [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
[ 4.485645] [Hardware Error]: Revision: 2
[ 4.489825] [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d
[ 4.497831] [Hardware Error]: 00000000: 1100c101 00000080 00000000 fe013d78 ............x=..
[ 4.506527] [Hardware Error]: 00000010: 00000000 3cfbf3f9 00000111 0500020e .......<........
[ 4.515225] [Hardware Error]: 00000020: 3e013335 0000c000 00000000 07000000 53.>............
[ 4.523922] [Hardware Error]: 00000030: 0079f3f9 00000211 0500020e 3e013641 ..y.........A6.>
[ 4.532622] [Hardware Error]: 00000040: 0000c000 00000000 06000000 3c5bf3f1 ..............[<
[ 4.541317] [Hardware Error]: 00000050: 00000311 0500020e 3e01394a 000000e0 ........J9.>....
[ 4.550015] [Hardware Error]: 00000060: 01002009 3e0141b6 0000200c 3e01877f . ...A.>. .....>
[ 4.558711] [Hardware Error]: 00000070: 0000200d 3e0650e6 00002012 3e072e3a . ...P.>. ..:..>
[ 4.567410] [Hardware Error]: 00000080: 00002606 3e07595c 0000260e 3e075a6b .&..\Y.>.&..kZ.>
[ 4.576106] [Hardware Error]: 00000090: 0000260f 3e075b57 0000260e 3e075c57 .&..W[.>.&..W\.>
[ 4.584804] [Hardware Error]: 000000a0: 0000260f 3e075d43 00002610 3e075e53 .&..C].>.&..S^.>
[ 4.593500] [Hardware Error]: 000000b0: 0000260e 3e075ff4 0000260f 3e0760e0 .&..._.>.&...`.>
[ 4.602198] [Hardware Error]: 000000c0: 0000260e 3e076257 0000260f 3e076343 .&..Wb.>.&..Cc.>
[ 4.610896] [Hardware Error]: 000000d0: 0000260e 3e076443 0000260f 3e07652f .&..Cd.>.&../e.>
[ 4.619593] [Hardware Error]: 000000e0: 00002610 3e07663f 0000260e 3e0767e0 .&..?f.>.&...g.>
[ 4.628289] [Hardware Error]: 000000f0: 0000260f 3e0768cc 0000260e 3e076a43 .&...h.>.&..Cj.>
[ 4.636986] [Hardware Error]: 00000100: 0000260f 3e076b2f 0000260e 3e076c2f .&../k.>.&../l.>
[ 4.645683] [Hardware Error]: 00000110: 0000260f 3e076d1b 00002610 3e076e2b .&...m.>.&..+n.>
[ 4.654379] [Hardware Error]: 00000120: 0000260e 3e076fcc 0000260f 3e0770b8 .&...o.>.&...p.>
[ 4.663075] [Hardware Error]: 00000130: 0000260e 3e07722f 0000260f 3e07731b .&../r.>.&...s.>
[ 4.671772] [Hardware Error]: 00000140: 0000260e 3e07741b 0000260f 3e077507 .&...t.>.&...u.>
[ 4.680469] [Hardware Error]: 00000150: 00002610 3e077617 0000260e 3e0777b8 .&...v.>.&...w.>
[ 4.689168] [Hardware Error]: 00000160: 0000260f 3e0778a4 00002607 3e0779ff .&...x.>.&...y.>
[ 4.697861] [Hardware Error]: 00000170: 00002013 3e077c69 0000030d 0500020e . ..i|.>........
[ 4.706559] [Hardware Error]: 00000180: 3c91aa05 0000200b 3c91b30f 00002706 ...<. .....<.'..
[ 4.715255] [Hardware Error]: 00000190: 3c91bd10 00002707 3c91be25 00002010 ...<.'..%..<. ..
[ 4.723952] [Hardware Error]: 000001a0: 3c936550 00002011 3c936642 0000c000 Pe.<. ..Bf.<....
[ 4.732648] [Hardware Error]: 000001b0: 00000000 07000000 3c5bf039 00000011 ........9.[<....
[ 4.741344] [Hardware Error]: 000001c0: 0500020e 3c9503f0 00002106 3c9517ae .......<.!.....<
[ 4.750045] [Hardware Error]: 000001d0: 00002107 3c9518b3 00000800 01002203 .!.....<....."..
[ 4.758740] [Hardware Error]: 000001e0: 3d637481 0000200e 3dfe2301 0000200f .tc=. ...#.=. ..
[ 4.767437] [Hardware Error]: 000001f0: 3dfe243b 0000c000 00000000 07c40800 ;$.=............
[ 4.776133] [Hardware Error]: Error 1, type: fatal
[ 4.781015] [Hardware Error]: section_type: Firmware Error Record Reference
[ 4.788150] [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
[ 4.796492] [Hardware Error]: Revision: 2
[ 4.800682] [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d
[ 4.808685] [Hardware Error]: 00000000: 03028001 00030004 00d3c72d 0000043c ........-...<...
[ 4.817384] [Hardware Error]: 00000010: 000301ff 0000003c 17dac638 27000040 ....<...8...@..'
[ 4.826079] [Hardware Error]: 00000020: 0001004c 00012f5a 000ba1ea 000ba23d L...Z/......=...
[ 4.834778] [Hardware Error]: 00000030: 000ba2ab 000ba22d 000ba22d 0000e5ae ....-...-.......
[ 4.843475] [Hardware Error]: 00000040: 0007053a 00000000 00000000 00000000 :...............
[ 4.852171] [Hardware Error]: 00000050: 00000000 00000000 00000000 00000000 ................
[ 4.860869] [Hardware Error]: 00000060: 00000004 00000000 07edb6c0 0000262e .............&..
[ 4.869563] [Hardware Error]: 00000070: 07edb6c0 0000262e 07edb6c0 0000262e .....&.......&..
[ 4.878260] [Hardware Error]: 00000080: 00000000 00000000 00000000 00000000 ................
[ 4.886957] [Hardware Error]: 00000090: 00000000 170107f4 130267f4 00000000 .........g......
[ 4.895654] [Hardware Error]: 000000a0: 00000000 f20004a1 f20004a1 f20004a1 ................
[ 4.904351] [Hardware Error]: 000000b0: f200734e 150347f4 deadbeef 130107f4 Ns...G..........
[ 4.913046] [Hardware Error]: 000000c0: deadbeef 00000000 00000000 2c180400 ...............,
[ 4.921744] [Hardware Error]: 000000d0: 2c180400 2c180400 2c180400 150347f4 ...,...,...,.G..
[ 4.930442] [Hardware Error]: 000000e0: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.939137] [Hardware Error]: 000000f0: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 4.947836] [Hardware Error]: 00000100: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 4.956531] [Hardware Error]: 00000110: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.965220] [Hardware Error]: 00000120: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 4.973919] [Hardware Error]: 00000130: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 4.982616] [Hardware Error]: 00000140: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.991314] [Hardware Error]: 00000150: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 5.000010] [Hardware Error]: 00000160: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 5.008705] [Hardware Error]: 00000170: 423c2801 0000661f 000000c0 000000c0 .(<B.f..........
[ 5.017404] [Hardware Error]: 00000180: 000000c0 00000090 76eafbe0 00006697 ...........v.f..
[ 5.026103] [Hardware Error]: 00000190: 0000001c 0000001c 0000001c 000003e0 ................
[ 5.034799] [Hardware Error]: 000001a0: 16459f01 00006883 28000000 28000000 ..E..h.....(...(
[ 5.043498] [Hardware Error]: 000001b0: 28000000 28280000 7deb1b93 000068c4 ...(..((...}.h..
[ 5.052193] [Hardware Error]: 000001c0: 7f0f4100 7f0f4100 7f0f4100 7f0f4100 .A...A...A...A..
[ 5.060893] [Hardware Error]: 000001d0: 38bb72e5 00006893 c06075bc c060803c .r.8.h...u`.<.`.
[ 5.069587] [Hardware Error]: 000001e0: c06075bc c060803c 9a7f7cf5 000068da .u`.<.`..|...h..
[ 5.078284] [Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000 ................
[ 5.086981] [Hardware Error]: 00000200: 00000000 000a0671 80000000 00000000 ....q...........
[ 5.095680] [Hardware Error]: 00000210: 50000703 50000703 50000703 50000703 ...P...P...P...P
[ 5.104375] [Hardware Error]: 00000220: 10800303 1100ff07 1100ff07 10800303 ................
[ 5.113071] [Hardware Error]: 00000230: 00000000 10000000 00000000 00000000 ................
[ 5.121770] [Hardware Error]: 00000240: 00000000 00000000 00000000 00000000 ................
[ 5.130464] [Hardware Error]: 00000250: 00000000 00000000 00000000 883a0000 ..............:.
[ 5.139161] [Hardware Error]: 00000260: 00000000 00000000 00000000 00000000 ................
[ 5.147860] [Hardware Error]: 00000270: 00000000 00000000 00000000 00000000 ................
[ 5.156556] [Hardware Error]: 00000280: 00000000 00000000 00000000 00000000 ................
[ 5.165251] [Hardware Error]: 00000290: 00000000 00000000 00000000 00040000 ................
[ 5.173949] [Hardware Error]: 000002a0: f1811b00 08080838 00000000 000000c0 ....8...........
[ 5.182645] [Hardware Error]: 000002b0: 883b0000 883a0000 88410000 883f0000 ..;...:...A...?.
[ 5.191343] [Hardware Error]: 000002c0: 88400000 88410000 88400000 08000000 ..@...A...@.....
[ 5.200038] [Hardware Error]: 000002d0: 08000000 c0030000 c0030000 00030703 ................
[ 5.208734] [Hardware Error]: 000002e0: 40030000 00030703 00030703 00030703 ...@............
[ 5.217431] [Hardware Error]: 000002f0: 00030703 00030703 00030703 00030703 ................
[ 5.226130] [Hardware Error]: 00000300: 00030703 00030303 0003ff07 0003ff07 ................
[ 5.234824] [Hardware Error]: 00000310: 0003ff07 0003ff07 0003ff07 07000000 ................
[ 5.243524] [Hardware Error]: 00000320: deadbeef deadbeef 00000000 deadbeef ................
[ 5.252222] [Hardware Error]: 00000330: deadbeef 00a01330 024ac812 00a51ee9 ....0.....J.....
[ 5.260919] [Hardware Error]: 00000340: 0c006003 00000000 00000000 00000000 .`..............
[ 5.269617] [Hardware Error]: 00000350: 00000000 01f8af92 00000000 00000000 ................
[ 5.278313] [Hardware Error]: 00000360: 00000000 fe0001dc 00000000 00002011 ............. ..
[ 5.287011] [Hardware Error]: 00000370: 0000fbff 00000000 deadbeef deadbeef ................
[ 5.295707] [Hardware Error]: 00000380: c0000034 00000000 00000100 00000000 4...............
[ 5.304405] [Hardware Error]: 00000390: 60000000 00000000 00000000 00000011 ...`............
[ 5.313100] [Hardware Error]: 000003a0: 0000ffff 03030000 00000000 00000000 ................
[ 5.321797] [Hardware Error]: 000003b0: 00000000 03030303 00000104 03000003 ................
[ 5.330494] [Hardware Error]: 000003c0: 00000000 00000000 00000000 03030303 ................
[ 5.339192] [Hardware Error]: 000003d0: 00000105 00000000 00046172 00000000 ........ra......
[ 5.347886] [Hardware Error]: 000003e0: 00000000 00000000 00000000 00000000 ................
[ 5.356582] [Hardware Error]: 000003f0: 00000000 00000000 00000000 00000000 ................
[ 5.365279] [Hardware Error]: 00000400: 00000000 00000000 00000000 00000000 ................
[ 5.373970] [Hardware Error]: 00000410: 00000000 00000000 00000000 00000000 ................
[ 5.382666] [Hardware Error]: 00000420: 00000000 00000000 00000000 00000000 ................
[ 5.391360] [Hardware Error]: 00000430: 00000000 00000001 00000000 40000001 ...............@
[ 5.400058] [Hardware Error]: 00000440: 00000000 3c000000 00000000 00000000 .......<........
[ 5.408753] [Hardware Error]: 00000450: 80000086 00000038 fef00040 00000000 ....8...@.......
[ 5.417449] [Hardware Error]: 00000460: 40000001 00000000 3c000000 00000000 ...@.......<....
[ 5.426148] [Hardware Error]: 00000470: 00000000 80000086 00000078 fef00000 ........x.......
[ 5.434843] [Hardware Error]: 00000480: 00000000 40000001 00000000 3c000000 .......@.......<
[ 5.443540] [Hardware Error]: 00000490: 00000000 00000000 80000086 00000038 ............8...
[ 5.452237] [Hardware Error]: 000004a0: fef00240 00000000 40000001 00000000 @..........@....
[ 5.460932] [Hardware Error]: 000004b0: 3c000000 00000000 00000000 80000086 ...<............
[ 5.469630] [Hardware Error]: 000004c0: 00000078 fef00200 00000000 40000001 x..............@
[ 5.478326] [Hardware Error]: 000004d0: 00000000 3c000000 00000000 00000000 .......<........
[ 5.487023] [Hardware Error]: 000004e0: 00000000 00000000 00000000 00000000 ................
[ 5.495719] [Hardware Error]: 000004f0: 40000001 00000000 3c000000 00000000 ...@.......<....
[ 5.504413] [Hardware Error]: 00000500: 00000000 00000000 00000000 00000000 ................
[ 5.513113] [Hardware Error]: 00000510: 00000000 00000000 00000000 00000000 ................
[ 5.521810] [Hardware Error]: 00000520: 00000000 00200000 00000000 00000000 ...... .........
[ 5.530504] [Hardware Error]: 00000530: 00000000 00000000 00000000 00000000 ................
[ 5.539202] [Hardware Error]: 00000540: 00000000 00000000 00200000 00000000 .......... .....
[ 5.547899] [Hardware Error]: 00000550: 00000000 00000000 00000000 deadbeef ................
[ 5.556595] [Hardware Error]: 00000560: deadbeef 03000f43 deadbeef 00000000 ....C...........
[ 5.565292] [Hardware Error]: 00000570: 00000000 03000f43 deadbeef deadbeef ....C...........
[ 5.573990] [Hardware Error]: 00000580: 21000625 deadbeef 00000100 00000100 %..!............
[ 5.582686] [Hardware Error]: 00000590: 03000f43 deadbeef deadbeef 00000000 C...............
[ 5.591384] [Hardware Error]: 000005a0: deadbeef 00000000 00000000 03000f43 ............C...
[ 5.600080] [Hardware Error]: 000005b0: deadbeef deadbeef 03000f43 deadbeef ........C.......
[ 5.608778] [Hardware Error]: 000005c0: 00000000 00000000 03000f43 deadbeef ........C.......
[ 5.617474] [Hardware Error]: 000005d0: deadbeef 21000625 deadbeef 00001f00 ....%..!........
[ 5.626172] [Hardware Error]: 000005e0: 00000000 03000f43 deadbeef deadbeef ....C...........
[ 5.634867] [Hardware Error]: 000005f0: 00000000 deadbeef 00000000 00000000 ................
[ 5.643566] [Hardware Error]: 00000600: 03000f43 deadbeef deadbeef 03000f43 C...........C...
[ 5.652262] [Hardware Error]: 00000610: deadbeef 00000000 00000000 03000f43 ............C...
[ 5.660959] [Hardware Error]: 00000620: deadbeef deadbeef 21000605 deadbeef ...........!....
[ 5.669658] [Hardware Error]: 00000630: 00000000 00000000 03000f43 deadbeef ........C.......
[ 5.678352] [Hardware Error]: 00000640: deadbeef 00000000 deadbeef 00000000 ................
[ 5.687050] [Hardware Error]: 00000650: 00000000 21000615 deadbeef deadbeef .......!........
[ 5.695746] [Hardware Error]: 00000660: 03000f43 deadbeef 00000000 00000000 C...............
[ 5.704439] [Hardware Error]: 00000670: 00000000 deadbeef deadbeef 21000625 ............%..!
[ 5.713136] [Hardware Error]: 00000680: deadbeef 00000000 00000172 21000625 ........r...%..!
[ 5.721833] [Hardware Error]: 00000690: deadbeef deadbeef 00000000 deadbeef ................
[ 5.730529] [Hardware Error]: 000006a0: 00100000 00900000 00000000 deadbeef ................
[ 5.739224] [Hardware Error]: 000006b0: deadbeef deadbeef 00000006 00000061 ............a...
[ 5.747921] [Hardware Error]: 000006c0: 21000645 deadbeef deadbeef deadbeef E..!............
[ 5.756618] [Hardware Error]: 000006d0: 00000000 00000000 00000000 deadbeef ................
[ 5.765315] [Hardware Error]: 000006e0: deadbeef deadbeef 00000000 20000000 ...............
[ 5.774011] [Hardware Error]: 000006f0: 21000615 deadbeef deadbeef deadbeef ...!............
[ 5.782708] [Hardware Error]: 00000700: 00000000 00000000 00000000 deadbeef ................
[ 5.791403] [Hardware Error]: 00000710: deadbeef deadbeef 00000000 21000615 ...............!
[ 5.800098] [Hardware Error]: 00000720: deadbeef deadbeef deadbeef 00000000 ................
[ 5.808795] [Hardware Error]: 00000730: deadbeef deadbeef 21000615 deadbeef ...........!....
[ 5.817493] [Hardware Error]: 00000740: deadbeef 00000000 deadbeef deadbeef ................
[ 5.826189] [Hardware Error]: 00000750: 21000635 deadbeef deadbeef 00000000 5..!............
[ 5.834886] [Hardware Error]: 00000760: deadbeef deadbeef 21000605 deadbeef ...........!....
[ 5.843579] [Hardware Error]: 00000770: deadbeef 00000000 deadbeef deadbeef ................
[ 5.852276] [Hardware Error]: 00000780: deadbeef deadbeef deadbeef deadbeef ................
[ 5.860974] [Hardware Error]: 00000790: deadbeef deadbeef deadbeef deadbeef ................
[ 5.869670] [Hardware Error]: 000007a0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.878366] [Hardware Error]: 000007b0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.887063] [Hardware Error]: 000007c0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.895760] [Hardware Error]: 000007d0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.904456] [Hardware Error]: 000007e0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.913154] [Hardware Error]: 000007f0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.921852] [Hardware Error]: 00000800: deadbeef deadbeef 00200172 deadbeef ........r. .....
[ 5.930550] [Hardware Error]: 00000810: 17980800 17980800 deadbeef 17980800 ................
[ 5.939247] [Hardware Error]: 00000820: 17980800 deadbeef 17980800 17980800 ................
[ 5.947943] [Hardware Error]: 00000830: deadbeef 17980800 17980800 deadbeef ................
[ 5.956639] [Hardware Error]: 00000840: 17980800 deadbeef 17980800 deadbeef ................
[ 5.965339] [Hardware Error]: 00000850: 17980800 deadbeef 17980800 deadbeef ................
[ 5.974035] [Hardware Error]: 00000860: deadbeef deadbeef deadbeef deadbeef ................
[ 5.982730] [Hardware Error]: 00000870: deadbeef deadbeef deadbeef 00000000 ................
[ 5.991428] [Hardware Error]: 00000880: 00000000 00000000 00000000 00000000 ................
[ 6.000125] [Hardware Error]: 00000890: 00000000 00000000 00000000 00000000 ................
[ 6.008822] [Hardware Error]: 000008a0: 00000000 00000000 00000000 0401e003 ................
[ 6.017519] [Hardware Error]: 000008b0: 00000198 2a2f0322 000072d6 0000003c ...."./*.r..<...
[ 6.026214] [Hardware Error]: 000008c0: 00000003 00000003 00000003 80050033 ............3...
[ 6.034911] [Hardware Error]: 000008d0: 00770ee0 979ab000 00007f3a bf7d6005 ..w.....:....`}.
[ 6.043608] [Hardware Error]: 000008e0: 00000002 00000d01 002b0033 00000000 ........3.+.....
[ 6.052306] [Hardware Error]: 000008f0: 00000000 00000202 00000000 b2801313 ................
[ 6.061000] [Hardware Error]: 00000900: 00007f3a 00000000 00000000 00000000 :...............
[ 6.069700] [Hardware Error]: 00000910: 00000000 00000000 00000000 00000000 ................
[ 6.078397] [Hardware Error]: 00000920: 00000000 00000000 00000000 00000000 ................
[ 6.087092] [Hardware Error]: 00000930: 00000000 00000000 00000000 00000000 ................
[ 6.095791] [Hardware Error]: 00000940: 00000000 00000000 00000000 00000000 ................
[ 6.104489] [Hardware Error]: 00000950: 00000000 00000000 00000000 00000000 ................
[ 6.113186] [Hardware Error]: 00000960: 00000000 00000000 00000000 00000000 ................
[ 6.121882] [Hardware Error]: 00000970: 00000000 00000000 00000000 00000000 ................
[ 6.130578] [Hardware Error]: 00000980: 00000000 00000000 00000000 00000000 ................
[ 6.139279] [Hardware Error]: 00000990: 00000000 00000000 00000000 00000000 ................
[ 6.147971] [Hardware Error]: 000009a0: 00000000 00000000 00000000 00000000 ................
[ 6.156668] [Hardware Error]: 000009b0: 00000000 00000000 00000000 00000000 ................
[ 6.165366] [Hardware Error]: 000009c0: 00000000 00000fff 00000000 00000000 ................
[ 6.174066] [Hardware Error]: 000009d0: 00000000 00000000 00000000 00000000 ................
[ 6.182761] [Hardware Error]: 000009e0: 00000000 00000001 00000000 00000000 ................
[ 6.191458] [Hardware Error]: 000009f0: 00000000 00000000 00000000 00000000 ................
[ 6.200155] [Hardware Error]: 00000a00: 00000000 00000007 00000000 00000000 ................
[ 6.208852] [Hardware Error]: 00000a10: 00000000 00000000 00000000 00000000 ................
[ 6.217546] [Hardware Error]: 00000a20: 00000000 0000003f 00000000 00800400 ....?...........
[ 6.226244] [Hardware Error]: 00000a30: be000000 b2801313 00007f3a b2801313 ........:.......
[ 6.234941] [Hardware Error]: 00000a40: 00007f3a 0401e003 02000198 2a65e5d2 :.............e*
[ 6.243637] [Hardware Error]: 00000a50: 000072d6 0000003c 00000005 00000002 .r..<...........
[ 6.252334] [Hardware Error]: 00000a60: 0000000b 80050033 00770ee0 599a8000 ....3.....w....Y
[ 6.261029] [Hardware Error]: 00000a70: 00007efc 1360a004 00000004 00000d01 .~....`.........
[ 6.269725] [Hardware Error]: 00000a80: 00180010 00000000 00000000 00000046 ............F...
[ 6.278421] [Hardware Error]: 00000a90: 00000000 a6b4782b ffffffff 00000000 ....+x..........
[ 6.287120] [Hardware Error]: 00000aa0: 00000000 00000000 00000000 00000000 ................
[ 6.295816] [Hardware Error]: 00000ab0: 00000000 00000000 00000000 00000000 ................
[ 6.304514] [Hardware Error]: 00000ac0: 00000000 00000000 00000000 00000000 ................
[ 6.313210] [Hardware Error]: 00000ad0: 00000000 00000000 00000000 00000000 ................
[ 6.321905] [Hardware Error]: 00000ae0: 00000000 00000000 00000000 00000000 ................
[ 6.330604] [Hardware Error]: 00000af0: 00000000 00000000 00000000 00000000 ................
[ 6.339297] [Hardware Error]: 00000b00: 00000000 00000000 00000000 00000000 ................
[ 6.347996] [Hardware Error]: 00000b10: 00000000 00000000 00000000 00000000 ................
[ 6.356690] [Hardware Error]: 00000b20: 00000000 00000000 00000000 00000000 ................
[ 6.365388] [Hardware Error]: 00000b30: 00000000 00000000 00000000 00000000 ................
[ 6.374084] [Hardware Error]: 00000b40: 00000000 00000000 00000000 00000000 ................
[ 6.382779] [Hardware Error]: 00000b50: 00000000 00000000 00000000 00000fff ................
[ 6.391473] [Hardware Error]: 00000b60: 00000000 00000000 00000000 00000000 ................
[ 6.400171] [Hardware Error]: 00000b70: 00000000 00000000 00000000 00000001 ................
[ 6.408867] [Hardware Error]: 00000b80: 00000000 00000000 00000000 00000000 ................
[ 6.417562] [Hardware Error]: 00000b90: 00000000 00000000 00000000 00000007 ................
[ 6.426258] [Hardware Error]: 00000ba0: 00000000 00000000 00000000 00000000 ................
[ 6.434955] [Hardware Error]: 00000bb0: 00000000 00000000 00000000 0000003f ............?...
[ 6.443650] [Hardware Error]: 00000bc0: 00000000 00800400 be000000 b2801313 ................
[ 6.452348] [Hardware Error]: 00000bd0: 00007f3a b2801313 00007f3a 00001400 :.......:.......
[ 6.461041] [Hardware Error]: 00000be0: 00000094 00003180 00002b80 00003180 .....1...+...1..
[ 6.469741] [Hardware Error]: 00000bf0: 00002b88 00003180 00002b8c 00003180 .+...1...+...1..
[ 6.478436] [Hardware Error]: 00000c00: 00002b84 00003180 00002b8a 00002fcb .+...1...+.../..
[ 6.487134] [Hardware Error]: 00000c10: 00001786 00003180 00002b94 00003180 .....1...+...1..
[ 6.495829] [Hardware Error]: 00000c20: 00002b96 000043a7 00000790 00003180 .+...C.......1..
[ 6.504525] [Hardware Error]: 00000c30: 00002b98 00002a1c 000007f6 00002d4b .+...*......K-..
[ 6.513223] [Hardware Error]: 00000c40: 00002810 00002d4b 0000280a 00002d4b .(..K-...(..K-..
[ 6.521919] [Hardware Error]: 00000c50: 00002812 000029e7 000001c6 00002616 .(...).......&..
[ 6.530615] [Hardware Error]: 00000c60: 00003bc1 00003180 00002b83 000043a7 .;...1...+...C..
[ 6.539318] [Hardware Error]: 00000c70: 00000789 00003180 00002b8b 00003180 .....1...+...1..
[ 6.548008] [Hardware Error]: 00000c80: 00002b85 00003180 00002b93 00003180 .+...1...+...1..
[ 6.556706] [Hardware Error]: 00000c90: 00002b8d 00003180 00002b95 00003e49 .+...1...+..I>..
[ 6.565402] [Hardware Error]: 00000ca0: 0000038f 00003180 00002b97 00002a1c .....1...+...*..
[ 6.574100] [Hardware Error]: 00000cb0: 000007f7 0000337d 00003233 0000337d ....}3..32..}3..
[ 6.582797] [Hardware Error]: 00000cc0: 00003235 00002eb8 00003857 00003187 52......W8...1..
[ 6.591491] [Hardware Error]: 00000cd0: 00001683 0000293a 00003e85 01400080 ....:)...>....@.
[ 6.600189] [Hardware Error]: 00000ce0: 01400280 018c5150 018c4440 0188095c ..@.PQ..@D..\...
[ 6.608885] [Hardware Error]: 00000cf0: 01880468 00020090 00020d40 00820150 h.......@...P...
[ 6.617583] [Hardware Error]: 00000d00: 00820d40 00020090 00020d40 00020150 @.......@...P...
[ 6.626279] [Hardware Error]: 00000d10: 00020840 00020018 00020d40 00020018 @.......@.......
[ 6.634975] [Hardware Error]: 00000d20: 00020d40 00020150 00020840 00020018 @...P...@.......
[ 6.643672] [Hardware Error]: 00000d30: 00020d40 00020018 00020d40 00020018 @.......@.......
[ 6.652366] [Hardware Error]: 00000d40: 00020d40 00020150 00020d40 00020018 @...P...@.......
[ 6.661065] [Hardware Error]: 00000d50: 00020d40 010c3890 010c0460 00036010 @....8..`....`..
[ 6.669760] [Hardware Error]: 00000d60: 00034540 0188195c 01880468 00020090 @E..\...h.......
[ 6.678457] [Hardware Error]: 00000d70: 00020840 00020150 00020d40 00820150 @...P...@...P...
[ 6.687155] [Hardware Error]: 00000d80: 00820d40 00020018 00020d40 00020150 @.......@...P...
[ 6.695850] [Hardware Error]: 00000d90: 00020d40 00020018 00020d40 00020150 @.......@...P...
[ 6.704546] [Hardware Error]: 00000da0: 00020840 00020018 00020d40 00020018 @.......@.......
[ 6.713245] [Hardware Error]: 00000db0: 00020d40 00020018 00020d40 00020018 @.......@.......
[ 6.721940] [Hardware Error]: 00000dc0: 00020d40 00020018 00020d40 00020090 @.......@.......
[ 6.730636] [Hardware Error]: 00000dd0: 00020840 00020090 00020d40 00000000 @.......@.......
[ 6.739332] [Hardware Error]: 00000de0: 00000000 00000000 00000000 00000000 ................
[ 6.748029] [Hardware Error]: 00000df0: 00000000 00000000 00000000 00000000 ................
[ 6.756725] [Hardware Error]: 00000e00: 00000000 00000000 00000000 00000000 ................
[ 6.765424] [Hardware Error]: 00000e10: 00000000 00000000 00000000 00000000 ................
[ 6.774120] [Hardware Error]: 00000e20: 00000000 00000000 00000000 00000000 ................
[ 6.782815] [Hardware Error]: 00000e30: 00000000 00000000 00000000 00000000 ................
[ 6.791513] [Hardware Error]: 00000e40: 00000000 00000000 00000000 00000000 ................
[ 6.800210] [Hardware Error]: 00000e50: 00000000 00000000 00000000 00000000 ................
[ 6.808910] [Hardware Error]: 00000e60: 00000000 00000000 00000000 00000000 ................
[ 6.817604] [Hardware Error]: 00000e70: 00000000 00000000 00000000 00000000 ................
[ 6.826301] [Hardware Error]: 00000e80: 00000000 00000000 00000000 00000000 ................
[ 6.834996] [Hardware Error]: 00000e90: 00000000 00000000 00000000 00000000 ................
[ 6.843693] [Hardware Error]: 00000ea0: 00000000 00000000 00000000 00000000 ................
[ 6.852389] [Hardware Error]: 00000eb0: 00000000 00000000 00000000 00000000 ................
[ 6.861083] [Hardware Error]: 00000ec0: 00000000 00000000 00000000 00000000 ................
[ 6.869778] [Hardware Error]: 00000ed0: 00000000 00000000 00000000 00000000 ................
[ 6.878473] [Hardware Error]: 00000ee0: 00000000 00000000 00000000 00000000 ................
[ 6.887170] [Hardware Error]: 00000ef0: 00000000 00000000 00000000 00000000 ................
[ 6.895866] [Hardware Error]: 00000f00: 00000000 00000000 00000000 00000000 ................
[ 6.904562] [Hardware Error]: 00000f10: 00000000 00000000 00000000 00000000 ................
[ 6.913256] [Hardware Error]: 00000f20: 00000000 00000000 00000000 00000000 ................
[ 6.921951] [Hardware Error]: 00000f30: 00000000 00000000 00000000 00000000 ................
[ 6.930648] [Hardware Error]: 00000f40: 00000000 00000000 00000000 00000000 ................
[ 6.939343] [Hardware Error]: 00000f50: 00000000 00000000 00000000 00000000 ................
[ 6.948038] [Hardware Error]: 00000f60: 00000000 00000000 00000000 00000000 ................
[ 6.956735] [Hardware Error]: 00000f70: 00000000 00000000 00000000 00000000 ................
[ 6.965430] [Hardware Error]: 00000f80: 00000000 00000000 00000000 00000000 ................
[ 6.974128] [Hardware Error]: 00000f90: 00000000 00000000 00000000 00000000 ................
[ 6.982824] [Hardware Error]: 00000fa0: 00000000 00000000 00000000 00000000 ................
[ 6.991521] [Hardware Error]: 00000fb0: 00000000 00000000 00000000 00000000 ................
[ 7.000216] [Hardware Error]: 00000fc0: 00000000 00000000 00000000 00000000 ................
[ 7.008912] [Hardware Error]: 00000fd0: 00000000 00000000 00000000 00000000 ................
[ 7.017608] [Hardware Error]: 00000fe0: 00000000 00000000 00000000 00000000 ................
[ 7.026306] [Hardware Error]: 00000ff0: 00000000 00000000 00000000 00000000 ................
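In case anyone wants to look at the raw payload rather than the dmesg text, here is a minimal Python sketch that reassembles the hex-dump rows above into bytes, one blob per record. It assumes each row holds four 32-bit words printed in little-endian order, which seems consistent with the ASCII column (for example `fe013d78` shows up as `x=..`). The function name `dump_to_records` is made up for this illustration, and the sketch does not try to interpret the record contents, whose layout I don't know.

```python
import re
import struct

# One hex-dump row: "[Hardware Error]: <offset>: <w0> <w1> <w2> <w3> <ascii>"
ROW = re.compile(
    r"\[Hardware Error\]:\s+([0-9a-f]{8}):\s+((?:[0-9a-f]{8}\s+){3}[0-9a-f]{8})"
)


def dump_to_records(lines):
    """Return a list of bytes objects, one per dumped error record.

    A new record starts whenever a row with offset 0 is seen.
    Assumption: words are 32-bit little-endian values.
    """
    records = []
    for line in lines:
        m = ROW.search(line)
        if not m:
            continue  # header lines such as "Revision: 2" are skipped
        offset = int(m.group(1), 16)
        if offset == 0:
            records.append(bytearray())
        if not records or offset != len(records[-1]):
            raise ValueError(f"unexpected row offset {offset:#x}")
        words = [int(w, 16) for w in m.group(2).split()]
        records[-1] += struct.pack("<4I", *words)
    return [bytes(r) for r in records]
```

Saving the boot log to a file (say `dmesg.txt`, a hypothetical name) and calling `dump_to_records(open("dmesg.txt"))` should give two blobs for the output above, one for Error 0 and one for Error 1, which can then be written out and inspected with a hex editor.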
And I've seen three different types of crashes so far:
[ 1845.677713] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1845.683869] rcu: 7-...0: (14 ticks this GP) idle=dfa/1/0x4000000000000000 softirq=62174/62174 fqs=1288
[ 1845.693669] (detected by 3, t=5252 jiffies, g=239605, q=1208)
[ 1845.699733] Sending NMI from CPU 3 to CPUs 7:
[ 1848.352177] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[ 1848.352177] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[ 1849.386424] Shutting down cpus with NMI
[ 1849.386425] Kernel Offset: 0x36200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[46536.264594] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[46536.264596] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[46537.298843] Shutting down cpus with NMI
[46537.298843] Kernel Offset: 0x25600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 5936.397108] NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
[ 5936.397110] Modules linked in: ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E) jfs(E) xfs(E) overlay(E) bridge(E) stp(E) intel_rapl_msr(E) llc(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_sof_pci_intel_tgl(E) ghash_clmulni_intel(E) binfmt_misc(E) snd_sof_intel_hda_common(E) soundwire_intel(E) soundwire_generic_allocation(E) soundwire_cadence(E) snd_sof_intel_hda(E) snd_sof_pci(E) snd_sof_xtensa_dsp(E) snd_sof(E) snd_soc_hdac_hda(E) snd_hda_ext_core(E) aesni_intel(E) snd_hda_codec_hdmi(E) snd_soc_acpi_intel_match(E) libaes(E) snd_soc_acpi(E) nft_masq(E) crypto_simd(E) snd_soc_core(E) cryptd(E) snd_hda_codec_realtek(E) snd_compress(E) intel_cstate(E) intel_uncore(E) soundwire_bus(E) snd_hda_codec_generic(E) eeepc_wmi(E) ledtrig_audio(E) pcspkr(E) uvcvideo(E) asus_wmi(E) snd_hda_intel(E) battery(E) snd_intel_dspcfg(E) videobuf2_vmalloc(E) nft_chain_nat(E) snd_intel_sdw_acpi(E) sparse_keymap(E) videobuf2_memops(E) efi_pstore(E) rfkill(E) nf_nat(E)
[ 5936.397127] snd_usb_audio(E) wmi_bmof(E) iTCO_wdt(E) snd_hda_codec(E) videobuf2_v4l2(E) intel_pmc_bxt(E) snd_usbmidi_lib(E) videobuf2_common(E) snd_hda_core(E) snd_rawmidi(E) iTCO_vendor_support(E) nft_ct(E) ee1004(E) watchdog(E) nls_ascii(E) snd_seq_device(E) snd_hwdep(E) nls_cp437(E) snd_pcm(E) nf_conntrack(E) videodev(E) snd_timer(E) vfat(E) nf_defrag_ipv6(E) fat(E) nf_defrag_ipv4(E) mc(E) joydev(E) snd(E) soundcore(E) mei_me(E) sg(E) mei(E) evdev(E) intel_pmc_core(E) acpi_tad(E) acpi_pad(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) nf_tables(E) nfnetlink(E) fuse(E) configfs(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) sd_mod(E) cdrom(E) dm_mod(E) ahci(E) i915(E) libahci(E)
[ 5936.397149] nvme(E) i2c_algo_bit(E) xhci_pci(E) nvme_core(E) e1000e(E) drm_kms_helper(E) t10_pi(E) crc_t10dif(E) crc32_pclmul(E) ptp(E) xhci_hcd(E) crct10dif_generic(E) intel_lpss_pci(E) crc32c_intel(E) cec(E) i2c_i801(E) libata(E) crct10dif_pclmul(E) pps_core(E) intel_lpss(E) i2c_smbus(E) scsi_mod(E) usbcore(E) idma64(E) crct10dif_common(E) drm(E) fan(E) video(E) wmi(E) button(E)
[ 5936.397157] CPU: 2 PID: 5604 Comm: WRRende~ckend#1 Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5936.397158] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5936.397158] RIP: 0010:native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5936.397159] Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 0f 85 11 01 00 00 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 ba 00 02 00 00
[ 5936.397159] RSP: 0000:ffffacf78177fd38 EFLAGS: 00000002
[ 5936.397160] RAX: 00000000002c0101 RBX: ffffacf78177fd98 RCX: 0000000000000000
[ 5936.397160] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9951495f7450
[ 5936.397161] RBP: fffff5ae13112240 R08: ffff99605f0a8900 R09: 0000000000000000
[ 5936.397161] R10: 0000000000000011 R11: 0000000000000100 R12: 0000000000000246
[ 5936.397162] R13: 0000000000000000 R14: fff0000000000fff R15: fffff5ae131e02c0
[ 5936.397162] FS: 00007f301fafd700(0000) GS:ffff99605f080000(0000) knlGS:0000000000000000
[ 5936.397163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5936.397163] CR2: 00007f300a1bc000 CR3: 0000000257cfc003 CR4: 0000000000770ee0
[ 5936.397163] PKRU: 55555554
[ 5936.397164] Call Trace:
[ 5936.397164] _raw_spin_lock_irqsave+0x32/0x40
[ 5936.397164] lock_page_lruvec_irqsave+0x52/0x80
[ 5936.397165] __pagevec_lru_add+0x1db/0x3e0
[ 5936.397165] ? mem_cgroup_charge_statistics.constprop.0+0x21/0x50
[ 5936.397165] lru_cache_add+0x5c/0x70
[ 5936.397166] __handle_mm_fault+0xd00/0x17b0
[ 5936.397166] handle_mm_fault+0xd5/0x2b0
[ 5936.397166] do_user_addr_fault+0x1ba/0x670
[ 5936.397167] exc_page_fault+0x7b/0x160
[ 5936.397167] ? asm_exc_page_fault+0x8/0x30
[ 5936.397167] asm_exc_page_fault+0x1e/0x30
[ 5936.397168] RIP: 0033:0x7f304668aa22
[ 5936.397168] Code: fe 70 01 00 00 c7 04 3a 00 00 00 00 89 5c 3a 04 44 89 64 3a 08 0f 28 84 24 f0 02 00 00 0f 28 8c 24 00 03 00 00 0f 11 44 3a 18 <0f> 11 4c 3a 28 48 8b ac 24 10 03 00 00 48 89 6c 3a 38 48 c7 44 3a
[ 5936.397169] RSP: 002b:00007f301faf4fb0 EFLAGS: 00010202
[ 5936.397169] RAX: 00007f301ffc6b90 RBX: 000000000000002c RCX: 0000000000000000
[ 5936.397170] RDX: 00007f300a1bb000 RSI: 000000000000000b RDI: 0000000000000fd0
[ 5936.397170] RBP: 00007f301faf5570 R08: 000000003f800000 R09: 00000000000003ff
[ 5936.397171] R10: 000000000000000b R11: 00007f2fa6803688 R12: 0000000000000019
[ 5936.397171] R13: 00007f301ffc69b0 R14: 0000000000000000 R15: 0000000000000001
[ 5936.397172] Kernel panic - not syncing: Hard LOCKUP
[ 5937.439044] Shutting down cpus with NMI
[ 5937.439045] Kernel Offset: 0x38a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 5937.439046] CPU: 2 PID: 5604 Comm: WRRende~ckend#1 Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5937.439046] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5937.439047] Call Trace:
[ 5937.439047] <NMI>
[ 5937.439047] dump_stack+0x76/0x94
[ 5937.439047] panic+0x13b/0x2d7
[ 5937.439048] nmi_panic.cold+0xc/0xc
[ 5937.439048] watchdog_overflow_callback.cold+0x7c/0x7e
[ 5937.439048] __perf_event_overflow+0x83/0x1c0
[ 5937.439049] handle_pmi_common+0x205/0x2e0
[ 5937.439049] intel_pmu_handle_irq+0xec/0x310
[ 5937.439049] perf_event_nmi_handler+0x28/0x50
[ 5937.439050] nmi_handle+0x58/0x100
[ 5937.439050] default_do_nmi+0x42/0x130
[ 5937.439050] exc_nmi+0x12f/0x150
[ 5937.439051] end_repeat_nmi+0x16/0x55
[ 5937.439051] RIP: 0010:native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439052] Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 0f 85 11 01 00 00 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 ba 00 02 00 00
[ 5937.439052] RSP: 0000:ffffacf78177fd38 EFLAGS: 00000002
[ 5937.439053] RAX: 00000000002c0101 RBX: ffffacf78177fd98 RCX: 0000000000000000
[ 5937.439053] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9951495f7450
[ 5937.439054] RBP: fffff5ae13112240 R08: ffff99605f0a8900 R09: 0000000000000000
[ 5937.439054] R10: 0000000000000011 R11: 0000000000000100 R12: 0000000000000246
[ 5937.439055] R13: 0000000000000000 R14: fff0000000000fff R15: fffff5ae131e02c0
[ 5937.439055] ? native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439056] ? native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439056] </NMI>
[ 5937.439056] _raw_spin_lock_irqsave+0x32/0x40
[ 5937.439057] lock_page_lruvec_irqsave+0x52/0x80
[ 5937.439057] __pagevec_lru_add+0x1db/0x3e0
[ 5937.439057] ? mem_cgroup_charge_statistics.constprop.0+0x21/0x50
[ 5937.439058] lru_cache_add+0x5c/0x70
[ 5937.439058] __handle_mm_fault+0xd00/0x17b0
[ 5937.439058] handle_mm_fault+0xd5/0x2b0
[ 5937.439059] do_user_addr_fault+0x1ba/0x670
[ 5937.439059] exc_page_fault+0x7b/0x160
[ 5937.439059] ? asm_exc_page_fault+0x8/0x30
[ 5937.439060] asm_exc_page_fault+0x1e/0x30
[ 5937.439060] RIP: 0033:0x7f304668aa22
[ 5937.439060] Code: fe 70 01 00 00 c7 04 3a 00 00 00 00 89 5c 3a 04 44 89 64 3a 08 0f 28 84 24 f0 02 00 00 0f 28 8c 24 00 03 00 00 0f 11 44 3a 18 <0f> 11 4c 3a 28 48 8b ac 24 10 03 00 00 48 89 6c 3a 38 48 c7 44 3a
[ 5937.439061] RSP: 002b:00007f301faf4fb0 EFLAGS: 00010202
[ 5937.439062] RAX: 00007f301ffc6b90 RBX: 000000000000002c RCX: 0000000000000000
[ 5937.439062] RDX: 00007f300a1bb000 RSI: 000000000000000b RDI: 0000000000000fd0
[ 5937.439063] RBP: 00007f301faf5570 R08: 000000003f800000 R09: 00000000000003ff
[ 5937.439063] R10: 000000000000000b R11: 00007f2fa6803688 R12: 0000000000000019
[ 5937.439064] R13: 00007f301ffc69b0 R14: 0000000000000000 R15: 0000000000000001
[ 5937.439064] NMI watchdog: Watchdog detected hard LOCKUP on cpu 8
[ 5937.439065] Modules linked in: ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E) jfs(E) xfs(E) overlay(E) bridge(E) stp(E) intel_rapl_msr(E) llc(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_sof_pci_intel_tgl(E) ghash_clmulni_intel(E) binfmt_misc(E) snd_sof_intel_hda_common(E) soundwire_intel(E) soundwire_generic_allocation(E) soundwire_cadence(E) snd_sof_intel_hda(E) snd_sof_pci(E) snd_sof_xtensa_dsp(E) snd_sof(E) snd_soc_hdac_hda(E) snd_hda_ext_core(E) aesni_intel(E) snd_hda_codec_hdmi(E) snd_soc_acpi_intel_match(E) libaes(E) snd_soc_acpi(E) nft_masq(E) crypto_simd(E) snd_soc_core(E) cryptd(E) snd_hda_codec_realtek(E) snd_compress(E) intel_cstate(E) intel_uncore(E) soundwire_bus(E) snd_hda_codec_generic(E) eeepc_wmi(E) ledtrig_audio(E) pcspkr(E) uvcvideo(E) asus_wmi(E) snd_hda_intel(E) battery(E) snd_intel_dspcfg(E) videobuf2_vmalloc(E) nft_chain_nat(E) snd_intel_sdw_acpi(E) sparse_keymap(E) videobuf2_memops(E) efi_pstore(E) rfkill(E) nf_nat(E)
[ 5937.439082] snd_usb_audio(E) wmi_bmof(E) iTCO_wdt(E) snd_hda_codec(E) videobuf2_v4l2(E) intel_pmc_bxt(E) snd_usbmidi_lib(E) videobuf2_common(E) snd_hda_core(E) snd_rawmidi(E) iTCO_vendor_support(E) nft_ct(E) ee1004(E) watchdog(E) nls_ascii(E) snd_seq_device(E) snd_hwdep(E) nls_cp437(E) snd_pcm(E) nf_conntrack(E) videodev(E) snd_timer(E) vfat(E) nf_defrag_ipv6(E) fat(E) nf_defrag_ipv4(E) mc(E) joydev(E) snd(E) soundcore(E) mei_me(E) sg(E) mei(E) evdev(E) intel_pmc_core(E) acpi_tad(E) acpi_pad(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) nf_tables(E) nfnetlink(E) fuse(E) configfs(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) sd_mod(E) cdrom(E) dm_mod(E) ahci(E) i915(E) libahci(E)
[ 5937.439103] nvme(E) i2c_algo_bit(E) xhci_pci(E) nvme_core(E) e1000e(E) drm_kms_helper(E) t10_pi(E) crc_t10dif(E) crc32_pclmul(E) ptp(E) xhci_hcd(E) crct10dif_generic(E) intel_lpss_pci(E) crc32c_intel(E) cec(E) i2c_i801(E) libata(E) crct10dif_pclmul(E) pps_core(E) intel_lpss(E) i2c_smbus(E) scsi_mod(E) usbcore(E) idma64(E) crct10dif_common(E) drm(E) fan(E) video(E) wmi(E) button(E)
[ 5937.439111] CPU: 8 PID: 7339 Comm: flacparse23:sin Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5937.439112] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5937.439112] RIP: 0010:native_queued_spin_lock_slowpath+0x19f/0x1d0
[ 5937.439113] Code: c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 40 cd 02 00 48 03 04 f5 00 39 b8 ba 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 32 48 85 f6 0f 84 67 ff ff ff 0f 0d 0e
[ 5937.439114] RSP: 0018:ffffacf780a17b68 EFLAGS: 00000046
[ 5937.439114] RAX: 0000000000000000 RBX: ffffacf780a17bc8 RCX: 0000000000240000
[ 5937.439115] RDX: ffff99605f22cd40 RSI: 0000000000000005 RDI: ffff9951495f7450
[ 5937.439115] RBP: fffff5ae0442a7c0 R08: 0000000000240000 R09: 0000000000000000
[ 5937.439116] R10: 0000000000000008 R11: 0000000000000100 R12: 0000000000000246
[ 5937.439116] R13: 0000000000000000 R14: ffff9953c77ded18 R15: fffff5ae0dd1e9c0
[ 5937.439117] FS: 00007f3c3d2e7700(0000) GS:ffff99605f200000(0000) knlGS:0000000000000000
[ 5937.439117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5937.439117] CR2: 00007f2fa7d29000 CR3: 00000001f7f28005 CR4: 0000000000770ee0
[ 5937.439118] PKRU: 55555554
[ 5937.439118] Call Trace:
[ 5937.439118] _raw_spin_lock_irqsave+0x32/0x40
[ 5937.439119] lock_page_lruvec_irqsave+0x52/0x80
[ 5937.439119] __pagevec_lru_add+0x1db/0x3e0
[ 5937.439119] ? __add_to_page_cache_locked+0x19e/0x3b0
[ 5937.439120] lru_cache_add+0x5c/0x70
[ 5937.439120] add_to_page_cache_lru+0x72/0xc0
[ 5937.439120] page_cache_ra_unbounded+0x14d/0x230
[ 5937.439121] ? xas_load+0x5/0x70
[ 5937.439121] filemap_get_pages+0x209/0x5d0
[ 5937.439121] filemap_read+0xa7/0x350
[ 5937.439122] new_sync_read+0x115/0x1a0
[ 5937.439122] vfs_read+0xf4/0x180
[ 5937.439122] ksys_read+0x5f/0xe0
[ 5937.439122] do_syscall_64+0x33/0x80
[ 5937.439123] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 5937.439123] RIP: 0033:0x7f3c86ed108c
[ 5937.439124] Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 89 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 bf fc ff ff 48
[ 5937.439124] RSP: 002b:00007f3c3d2e65e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 5937.439125] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3c86ed108c
[ 5937.439126] RDX: 0000000000010000 RSI: 00007f3c2c0ea7e0 RDI: 0000000000000018
[ 5937.439126] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f3c83d919b0
[ 5937.439126] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
[ 5937.439127] R13: 00007f3c3d2e7680 R14: 0000000000010000 R15: 0000000003080ed0
Hello dseomn
Thank you for posting on the Intel® communities.
Please provide more information for this request:
- Which troubleshooting steps did you try already?
- What RAM do you use on this system? (Part number)
- Does this happen with a different operating system?
- Do you have the chance to test the processor on another system?
Regards,
David G
Intel Customer Support Technician
I ran MemTest86 version 9.0 for 4 full passes (about 48 hours), with no errors. I then ran it a second time, but I think I cancelled it after 1-2 passes (still no errors). I tried to get it to crash predictably by running stress-ng, but that didn't cause a crash. I tried a bunch of combinations of kernel parameters, but so far none of them have been stable for more than about 2 days. I also wanted to try out https://www.intel.com/content/www/us/en/support/articles/000005567/processors.html but it didn't list support for my processor.
It has 4 sticks (2 packs) of "G.SKILL Ripjaws V Series 32GB (2 x 16GB) 288-Pin DDR4 SDRAM DDR4 2133 (PC4 17000) Intel Z170 Platform / Intel X99 Platform Desktop Memory Model F4-2133C15D-32GVR". Note that the RAM is DDR4-2133, but https://ark.intel.com/content/www/us/en/ark/products/212277/intel-core-i5-11500-processor-12m-cache-up-to-4-60-ghz.html says the processor supports DDR4-3200. Is it possible that using slower RAM than the processor supports would cause a problem? I thought I ruled that out with MemTest86, but I'm not sure.
I don't have a Windows license, and I'd rather not pay for one just for this. Some people mentioned that it might be possible to get a shell in the Windows installer without a license though. Is that worth trying?
I don't have any other motherboards that are compatible with this processor.
Also, I have never overclocked any of these components.
You don't have an old Windows 7, Windows 8 or Windows 8.1 license? You do know that you can still use them to install Windows 10 from scratch for free, right?
Just saying,
...S
Nope, newest Windows license I have is for XP, but I'm using that for a virtual machine, and I'd also be a bit surprised if it were even possible to install Windows XP on 2021 hardware. I might be able to dig up a Windows ME or 98 license, or DOS 5, but those are even less likely to work or be useful.
It's been stable for three and a half days now with some power saving features disabled by the "processor.max_cstate=0 intel_idle.max_cstate=0" flags:
dseomn@solaria:~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.12.0-rc8-dseomn root=/dev/mapper/solaria--vg--ssd-root ro i915.force_probe=4c8a log_buf_len=1M nmi_watchdog=panic,1 sysctl.kernel.sysrq=1 console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
dseomn@solaria:~$ uptime
12:44:07 up 3 days, 13:47, 2 users, load average: 0,51, 0,82, 1,00
For a 65W processor, that's really more of a workaround than a fix. And it looks like some other Intel processors have had similar issues in the past: https://en.wikipedia.org/wiki/Silvermont#Erratum. Is there any way to tell if this is a kernel bug, a microcode bug, a bad processor, or something else?
Shoot, it just crashed again, after about 32 hours, with the kernel command line and log output below.
BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel intel_idle.states_off=0xfffffffc splash
[114080.948951] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[114080.955184] rcu: 7-...0: (1 GPs behind) idle=91e/1/0x4000000000000000 softirq=2730890/2730891 fqs=1660
[114080.965050] (detected by 6, t=5256 jiffies, g=7246105, q=609)
[114080.971171] Sending NMI from CPU 6 to CPUs 7:
[114083.984914] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[114085.017995] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[114085.017995] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[114085.017995] Shutting down cpus with NMI
[114085.017996] Kernel Offset: 0xc200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
From reading https://www.kernel.org/doc/html/latest/admin-guide/pm/cpuidle.html and https://www.kernel.org/doc/html/latest/admin-guide/pm/intel_idle.html it looks like intel_idle.states_off is recommended over the two max_cstate parameters, and I think a value of 0xfffffffc disables the same c-states as the previous parameters. (I was able to confirm from /sys/devices/system/cpu/cpu0/cpuidle/state0/default_status and similar files that only state0 and state1 were enabled with either "intel_idle.states_off=0xfffffffc" or "processor.max_cstate=0 intel_idle.max_cstate=0".) Since I also changed to the latest drm-tip kernel (because the i915 driver doesn't work well on any stable kernel yet) and removed a few other parameters, I'll try just changing the c-state parameters this time.
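As a rough illustration of the bitmask reasoning above: according to the intel_idle documentation, bit i of intel_idle.states_off being set means idle state i is disabled by default. The sketch below decodes a mask and, separately, reads the default_status files mentioned above. The function names and the state count of 4 in the usage note are made up for illustration; the real number of states depends on the processor and driver.

```python
from pathlib import Path


def disabled_states(states_off, num_states):
    """Indices of idle states whose bit is set in the states_off mask."""
    return [i for i in range(num_states) if (states_off >> i) & 1]


def enabled_states(states_off, num_states):
    """Indices of idle states whose bit is clear in the states_off mask."""
    return [i for i in range(num_states) if not (states_off >> i) & 1]


def read_default_status(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Map each stateN directory name to the contents of its default_status file.

    Returns an empty dict if the directory or the files are not present.
    """
    status = {}
    for state_dir in sorted(Path(cpuidle_dir).glob("state[0-9]*")):
        status_file = state_dir / "default_status"
        if status_file.is_file():
            status[state_dir.name] = status_file.read_text().strip()
    return status
```

For example, enabled_states(0xfffffffc, 4) returns [0, 1], which matches only state0 and state1 being left enabled. Calling read_default_status() on the running system is a way to cross-check that against what the kernel actually reports.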
In case it helps anybody else looking into something similar, I found "Supporting 11th Generation Intel® Core™ Processor Families for Desktop Platform, formerly known as Rocket Lake" volume 1 section 4.2.2 "Low-Power Idle States" from https://www.intel.com/content/www/us/en/products/docs/processors/core/core-technical-resources.html really useful for understanding what this specific processor is actually doing at the various c-states.
Also, I've noticed that logical CPUs 1 and 7 seem to be mentioned in most of the crashes. https://www.kernel.org/doc/Documentation/x86/topology.txt says "Many BIOSes enumerate all threads 0 first and then all threads 1" which would indicate that logical CPUs 1 and 7 (on a 1-socket 6-core 12-thread system) are in the same physical core. Is there any reason some things would tend to run on those logical CPUs more than others, or is that indicative of a defective core?
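To put a number on "mentioned in most of the crashes", something like the following could tally the CPUs named in the "mce: CPUs not responding to MCE broadcast" lines across saved console logs. This is only a sketch: the regular expression is written against the message format seen in the logs in this thread, and the function name is made up. Note that it counts every occurrence, so a message repeated within one crash is counted more than once.

```python
import re
from collections import Counter

# Matches e.g. "... mce: CPUs not responding to MCE broadcast (may include false positives): 1,7"
MCE_RE = re.compile(r"mce: CPUs not responding to MCE broadcast.*?:\s*([\d,]+)\s*$")


def tally_unresponsive_cpus(log_text):
    """Count how often each logical CPU appears in the MCE broadcast timeout lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MCE_RE.search(line)
        if match:
            counts.update(int(cpu) for cpu in match.group(1).split(",") if cpu)
    return counts
```

Feeding it the concatenated serial console captures from several crashes would show whether CPUs 1 and 7 really dominate, or whether that impression comes from a small sample.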
Yup, looks like logical CPUs 1 and 7 are on the same physical core:
dseomn@solaria:~$ cat /sys/devices/system/cpu/cpu1/topology/core_id
1
dseomn@solaria:~$ cat /sys/devices/system/cpu/cpu7/topology/core_id
1
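The same check can be done for every logical CPU at once. A minimal sketch, assuming a single-socket system (core_id is only unique within one package) and using hypothetical helper names:

```python
from collections import defaultdict
from pathlib import Path


def read_core_ids(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return {logical_cpu: core_id} for every cpuN directory that exposes topology/core_id."""
    core_ids = {}
    for cpu_dir in Path(sysfs_cpu_dir).glob("cpu[0-9]*"):
        core_id_file = cpu_dir / "topology" / "core_id"
        if core_id_file.is_file():  # offline CPUs may not expose topology
            core_ids[int(cpu_dir.name[3:])] = int(core_id_file.read_text())
    return core_ids


def group_by_core(core_ids):
    """Turn {logical_cpu: core_id} into {core_id: [logical_cpus...]}, sorted."""
    groups = defaultdict(list)
    for cpu, core in core_ids.items():
        groups[core].append(cpu)
    return {core: sorted(cpus) for core, cpus in sorted(groups.items())}
```

Running group_by_core(read_core_ids()) on this machine should list CPUs 1 and 7 together under core 1, consistent with the two cat commands above.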
Somebody suggested that I look at interrupt affinity as another possible explanation for why logical CPUs 1 and 7 seem to be implicated in almost all the crashes. I have no idea if interrupt affinity is stable across reboots, but the current interrupt info is below. It looks like CPU 1 is getting acpi, mei_me, and ahci interrupts more than most other CPUs, and CPU 7 is getting ttyS0 (serial console) and eno1 (ethernet port) more than most others. No idea if any of those are relevant.
dseomn@solaria:~$ uptime
 23:06:24 up 21:48, 3 users, load average: 2,77, 3,00, 3,13
dseomn@solaria:~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
dseomn@solaria:~$ cat /proc/interrupts
         CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7     CPU8     CPU9    CPU10    CPU11
  0:       42        0        0        0        0        0        0        0        0        0        0        0  IR-IO-APIC 2-edge timer
  4:        0        0      262        0        0        0        0     2134        0        0        0        0  IR-IO-APIC 4-edge ttyS0
  8:        0        0        0        0        0        0        1        0        0        0        0        0  IR-IO-APIC 8-edge rtc0
  9:        9        7        0        0        0        0        0        0        0        0        0        0  IR-IO-APIC 9-fasteoi acpi
 14:        0        0        0        0        0        0        0        0        0        0        0        0  IR-IO-APIC 14-fasteoi INT34C6:00
 16:        0        0        0        0        0        0        0        0        8        0        0        0  IR-IO-APIC 16-fasteoi i801_smbus
 27:        0        0        0        0        0        0        0        0        0        0        0        0  IR-IO-APIC 27-fasteoi idma64.0, i2c_designware.0
120:        0        0        0        0        0        0        0        0        0        0        0        0  DMAR-MSI 0-edge dmar0
121:        0        0        0        0        0        0        0        0        0        0        0        0  DMAR-MSI 1-edge dmar1
122:        0        0        0        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 98304-edge aerdrv, pcie-dpc
125:        0       43        0        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 360448-edge mei_me
126:  3441094        0        0        0        0        0        0        0        0        0        0      129  IR-PCI-MSI 327680-edge xhci_hcd
127:    46561    71271        0        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 376832-edge ahci[0000:00:17.0]
128:        0        0        0    16331        0        0        0        0        0        0        0        0  IR-PCI-MSI 514048-edge snd_hda_intel:card0
129:        0        0        0        0        0        0       72        0        0        0        0        0  IR-PCI-MSI 524288-edge nvme0q0
130:    67083        0        0        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 524289-edge nvme0q1
131:        0    57170        0        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 524290-edge nvme0q2
132:        0        0    67648        0        0        0        0        0        0        0        0        0  IR-PCI-MSI 524291-edge nvme0q3
133:        0        0        0    58026        0        0        0        0        0        0        0        0  IR-PCI-MSI 524292-edge nvme0q4
134:        0        0        0        0    68120        0        0        0        0        0        0        0  IR-PCI-MSI 524293-edge nvme0q5
135:        0        0        0        0        0    57513        0        0        0        0        0        0  IR-PCI-MSI 524294-edge nvme0q6
136:        0        0        0        0        0        0    51393        0        0        0        0        0  IR-PCI-MSI 524295-edge nvme0q7
137:        0        0        0        0        0        0        0    54215        0        0        0        0  IR-PCI-MSI 524296-edge nvme0q8
138:        0        0        0        0        0        0        0        0    58442        0        0        0  IR-PCI-MSI 524297-edge nvme0q9
139:        0        0        0        0        0        0        0        0        0    51665        0        0  IR-PCI-MSI 524298-edge nvme0q10
140:        0        0        0        0        0        0        0        0        0        0    56664        0  IR-PCI-MSI 524299-edge nvme0q11
141:  4039954        0        0     2985        0        0        0        0        0        0        0        0  IR-PCI-MSI 32768-edge i915
142:        0        0        0        0        0        0        0        0        0        0        0    57640  IR-PCI-MSI 524300-edge nvme0q12
143:        0        0        0        0        0        0        0  1234705        0        0        0        0  IR-PCI-MSI 520192-edge eno1
NMI:        6      377      394      382      384      372      399      340      351      355      351      333  Non-maskable interrupts
LOC:  4818648  3542369  3281426  3160353  3089317  2956860  2984196  4211558  3131966  2857497  2852176  2815124  Local timer interrupts
SPU:        0        0        0        0        0        0        0        0        0        0        0        0  Spurious interrupts
PMI:        6      377      394      382      384      372      399      340      351      355      351      333  Performance monitoring interrupts
IWI:  1988693    18746    18117    17887    17481    14464    15398     9936    11229    10753    10456    10016  IRQ work interrupts
RTR:        0        0        0        0        0        0        0        0        0        0        0        0  APIC ICR read retries
RES:   315954   320882   121135   113525   104175   104204   104778   157055   115602    97898   107991    96309  Rescheduling interrupts
CAL:  4848398  4956536  4709393  4778520  4837508  4771240  4914706  4439478  4504739  4622628  4577628  4500259  Function call interrupts
TLB:  4136148  4424857  4370445  4490762  4563282  4514497  4649105  4190002  4245703  4359975  4319764  4248345  TLB shootdowns
TRM:        0        0        0        0        0        0        0        0        0        0        0        0  Thermal event interrupts
THR:        0        0        0        0        0        0        0        0        0        0        0        0  Threshold APIC interrupts
DFR:        0        0        0        0        0        0        0        0        0        0        0        0  Deferred Error APIC interrupts
MCE:        0        0        0        0        0        0        0        0        0        0        0        0  Machine check exceptions
MCP:      253      253      253      253      253      253      253      253      253      253      253      253  Machine check polls
ERR:        0
MIS:        0
PIN:        0        0        0        0        0        0        0        0        0        0        0        0  Posted-interrupt notification event
NPI:        0        0        0        0        0        0        0        0        0        0        0        0  Nested posted-interrupt event
PIW:        0        0        0        0        0        0        0        0        0        0        0        0  Posted-interrupt wakeup event
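Picking out which interrupts land mostly on one CPU is easier with a small parser than by eye. The following is a sketch only: the function names and the 50% threshold are arbitrary choices for illustration, and rows without a full set of per-CPU counters (such as ERR and MIS) are skipped.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (cpu_names, {label: (counts, description)})."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:len(cpus)]
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue  # rows like ERR/MIS carry a single total, not per-CPU counts
        rows[label.strip()] = ([int(c) for c in counts], " ".join(fields[len(cpus):]))
    return cpus, rows


def dominant_irqs(text, cpu, min_share=0.5):
    """List (label, description, count_on_cpu, total) for rows where `cpu` handled
    more than `min_share` of all occurrences."""
    cpus, rows = parse_interrupts(text)
    column = cpus.index(f"CPU{cpu}")
    hits = []
    for label, (counts, description) in rows.items():
        total = sum(counts)
        if total and counts[column] / total > min_share:
            hits.append((label, description, counts[column], total))
    return hits
```

Calling dominant_irqs(open("/proc/interrupts").read(), 7) after each boot would also show whether the affinity pattern is stable across reboots.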
It crashed again after 45 hours, with the max_cstate parameters that previously worked for three and a half days. I now suspect that the earlier stable run was just random chance, and that it would have crashed eventually if I had let it run longer. I think my next step is to disable logical CPUs 1 and 7.
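For reference, one way to take logical CPUs offline at runtime is the standard CPU hotplug interface in sysfs: writing 0 to /sys/devices/system/cpu/cpuN/online. A minimal sketch, assuming the kernel was built with CPU hotplug support and the script runs as root; the helper names are made up:

```python
from pathlib import Path

SYSFS_CPU_DIR = "/sys/devices/system/cpu"


def online_control_path(cpu, sysfs_cpu_dir=SYSFS_CPU_DIR):
    """Path of the hotplug control file for one logical CPU."""
    return f"{sysfs_cpu_dir}/cpu{cpu}/online"


def set_cpu_online(cpu, online, sysfs_cpu_dir=SYSFS_CPU_DIR):
    """Write 1 (online) or 0 (offline) to the CPU's hotplug control file.

    Needs root; raises an OSError subclass if the file is missing or not writable.
    """
    Path(online_control_path(cpu, sysfs_cpu_dir)).write_text("1" if online else "0")
```

Calling set_cpu_online(1, False) and set_cpu_online(7, False) would take both hyperthreads of core 1 offline until the next reboot, so it would have to be repeated at every boot to keep testing that configuration.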
BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
[161632.882221] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[161632.888423] rcu: 7-...0: (0 ticks this GP) idle=c56/1/0x4000000000000000 softirq=4144529/4144529 fqs=2617
[161632.898554] (detected by 4, t=5256 jiffies, g=8381009, q=766)
[161632.904653] Sending NMI from CPU 4 to CPUs 7:
[161635.955567] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[161635.955568] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[161636.988584] Shutting down cpus with NMI
[161636.988584] Kernel Offset: 0x1ba00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
We noticed that you have contacted Intel
Best regards,
David G.
Intel Customer Support Technician
