
Frequent crashes on i5-11500

dseomn

Hi,

I recently got an i5-11500, and Linux is crashing about once a day. The motherboard is an ASUS Prime H570-PLUS with BIOS version 0820, and the CPU microcode is at 0x3c. Details are below. Has anybody else seen anything like this? Any idea what the BERT errors mean?

During early boot, I often see dmesg lines like:

[ 4.455507] BERT: Error records from previous boot:
[ 4.460399] [Hardware Error]: event severity: fatal
[ 4.465280] [Hardware Error]: Error 0, type: fatal
[ 4.470160] [Hardware Error]: section_type: Firmware Error Record Reference
[ 4.477296] [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
[ 4.485645] [Hardware Error]: Revision: 2
[ 4.489825] [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d
[ 4.497831] [Hardware Error]: 00000000: 1100c101 00000080 00000000 fe013d78 ............x=..
[ 4.506527] [Hardware Error]: 00000010: 00000000 3cfbf3f9 00000111 0500020e .......<........
[ 4.515225] [Hardware Error]: 00000020: 3e013335 0000c000 00000000 07000000 53.>............
[ 4.523922] [Hardware Error]: 00000030: 0079f3f9 00000211 0500020e 3e013641 ..y.........A6.>
[ 4.532622] [Hardware Error]: 00000040: 0000c000 00000000 06000000 3c5bf3f1 ..............[<
[ 4.541317] [Hardware Error]: 00000050: 00000311 0500020e 3e01394a 000000e0 ........J9.>....
[ 4.550015] [Hardware Error]: 00000060: 01002009 3e0141b6 0000200c 3e01877f . ...A.>. .....>
[ 4.558711] [Hardware Error]: 00000070: 0000200d 3e0650e6 00002012 3e072e3a . ...P.>. ..:..>
[ 4.567410] [Hardware Error]: 00000080: 00002606 3e07595c 0000260e 3e075a6b .&..\Y.>.&..kZ.>
[ 4.576106] [Hardware Error]: 00000090: 0000260f 3e075b57 0000260e 3e075c57 .&..W[.>.&..W\.>
[ 4.584804] [Hardware Error]: 000000a0: 0000260f 3e075d43 00002610 3e075e53 .&..C].>.&..S^.>
[ 4.593500] [Hardware Error]: 000000b0: 0000260e 3e075ff4 0000260f 3e0760e0 .&..._.>.&...`.>
[ 4.602198] [Hardware Error]: 000000c0: 0000260e 3e076257 0000260f 3e076343 .&..Wb.>.&..Cc.>
[ 4.610896] [Hardware Error]: 000000d0: 0000260e 3e076443 0000260f 3e07652f .&..Cd.>.&../e.>
[ 4.619593] [Hardware Error]: 000000e0: 00002610 3e07663f 0000260e 3e0767e0 .&..?f.>.&...g.>
[ 4.628289] [Hardware Error]: 000000f0: 0000260f 3e0768cc 0000260e 3e076a43 .&...h.>.&..Cj.>
[ 4.636986] [Hardware Error]: 00000100: 0000260f 3e076b2f 0000260e 3e076c2f .&../k.>.&../l.>
[ 4.645683] [Hardware Error]: 00000110: 0000260f 3e076d1b 00002610 3e076e2b .&...m.>.&..+n.>
[ 4.654379] [Hardware Error]: 00000120: 0000260e 3e076fcc 0000260f 3e0770b8 .&...o.>.&...p.>
[ 4.663075] [Hardware Error]: 00000130: 0000260e 3e07722f 0000260f 3e07731b .&../r.>.&...s.>
[ 4.671772] [Hardware Error]: 00000140: 0000260e 3e07741b 0000260f 3e077507 .&...t.>.&...u.>
[ 4.680469] [Hardware Error]: 00000150: 00002610 3e077617 0000260e 3e0777b8 .&...v.>.&...w.>
[ 4.689168] [Hardware Error]: 00000160: 0000260f 3e0778a4 00002607 3e0779ff .&...x.>.&...y.>
[ 4.697861] [Hardware Error]: 00000170: 00002013 3e077c69 0000030d 0500020e . ..i|.>........
[ 4.706559] [Hardware Error]: 00000180: 3c91aa05 0000200b 3c91b30f 00002706 ...<. .....<.'..
[ 4.715255] [Hardware Error]: 00000190: 3c91bd10 00002707 3c91be25 00002010 ...<.'..%..<. ..
[ 4.723952] [Hardware Error]: 000001a0: 3c936550 00002011 3c936642 0000c000 Pe.<. ..Bf.<....
[ 4.732648] [Hardware Error]: 000001b0: 00000000 07000000 3c5bf039 00000011 ........9.[<....
[ 4.741344] [Hardware Error]: 000001c0: 0500020e 3c9503f0 00002106 3c9517ae .......<.!.....<
[ 4.750045] [Hardware Error]: 000001d0: 00002107 3c9518b3 00000800 01002203 .!.....<....."..
[ 4.758740] [Hardware Error]: 000001e0: 3d637481 0000200e 3dfe2301 0000200f .tc=. ...#.=. ..
[ 4.767437] [Hardware Error]: 000001f0: 3dfe243b 0000c000 00000000 07c40800 ;$.=............
[ 4.776133] [Hardware Error]: Error 1, type: fatal
[ 4.781015] [Hardware Error]: section_type: Firmware Error Record Reference
[ 4.788150] [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
[ 4.796492] [Hardware Error]: Revision: 2
[ 4.800682] [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d
[ 4.808685] [Hardware Error]: 00000000: 03028001 00030004 00d3c72d 0000043c ........-...<...
[ 4.817384] [Hardware Error]: 00000010: 000301ff 0000003c 17dac638 27000040 ....<...8...@..'
[ 4.826079] [Hardware Error]: 00000020: 0001004c 00012f5a 000ba1ea 000ba23d L...Z/......=...
[ 4.834778] [Hardware Error]: 00000030: 000ba2ab 000ba22d 000ba22d 0000e5ae ....-...-.......
[ 4.843475] [Hardware Error]: 00000040: 0007053a 00000000 00000000 00000000 :...............
[ 4.852171] [Hardware Error]: 00000050: 00000000 00000000 00000000 00000000 ................
[ 4.860869] [Hardware Error]: 00000060: 00000004 00000000 07edb6c0 0000262e .............&..
[ 4.869563] [Hardware Error]: 00000070: 07edb6c0 0000262e 07edb6c0 0000262e .....&.......&..
[ 4.878260] [Hardware Error]: 00000080: 00000000 00000000 00000000 00000000 ................
[ 4.886957] [Hardware Error]: 00000090: 00000000 170107f4 130267f4 00000000 .........g......
[ 4.895654] [Hardware Error]: 000000a0: 00000000 f20004a1 f20004a1 f20004a1 ................
[ 4.904351] [Hardware Error]: 000000b0: f200734e 150347f4 deadbeef 130107f4 Ns...G..........
[ 4.913046] [Hardware Error]: 000000c0: deadbeef 00000000 00000000 2c180400 ...............,
[ 4.921744] [Hardware Error]: 000000d0: 2c180400 2c180400 2c180400 150347f4 ...,...,...,.G..
[ 4.930442] [Hardware Error]: 000000e0: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.939137] [Hardware Error]: 000000f0: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 4.947836] [Hardware Error]: 00000100: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 4.956531] [Hardware Error]: 00000110: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.965220] [Hardware Error]: 00000120: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 4.973919] [Hardware Error]: 00000130: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 4.982616] [Hardware Error]: 00000140: 00000000 00000000 f20004a1 f20004a1 ................
[ 4.991314] [Hardware Error]: 00000150: f20004a1 f200734e 00000000 00000000 ....Ns..........
[ 5.000010] [Hardware Error]: 00000160: 2c180400 2c180400 2c180400 2c180400 ...,...,...,...,
[ 5.008705] [Hardware Error]: 00000170: 423c2801 0000661f 000000c0 000000c0 .(<B.f..........
[ 5.017404] [Hardware Error]: 00000180: 000000c0 00000090 76eafbe0 00006697 ...........v.f..
[ 5.026103] [Hardware Error]: 00000190: 0000001c 0000001c 0000001c 000003e0 ................
[ 5.034799] [Hardware Error]: 000001a0: 16459f01 00006883 28000000 28000000 ..E..h.....(...(
[ 5.043498] [Hardware Error]: 000001b0: 28000000 28280000 7deb1b93 000068c4 ...(..((...}.h..
[ 5.052193] [Hardware Error]: 000001c0: 7f0f4100 7f0f4100 7f0f4100 7f0f4100 .A...A...A...A..
[ 5.060893] [Hardware Error]: 000001d0: 38bb72e5 00006893 c06075bc c060803c .r.8.h...u`.<.`.
[ 5.069587] [Hardware Error]: 000001e0: c06075bc c060803c 9a7f7cf5 000068da .u`.<.`..|...h..
[ 5.078284] [Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000 ................
[ 5.086981] [Hardware Error]: 00000200: 00000000 000a0671 80000000 00000000 ....q...........
[ 5.095680] [Hardware Error]: 00000210: 50000703 50000703 50000703 50000703 ...P...P...P...P
[ 5.104375] [Hardware Error]: 00000220: 10800303 1100ff07 1100ff07 10800303 ................
[ 5.113071] [Hardware Error]: 00000230: 00000000 10000000 00000000 00000000 ................
[ 5.121770] [Hardware Error]: 00000240: 00000000 00000000 00000000 00000000 ................
[ 5.130464] [Hardware Error]: 00000250: 00000000 00000000 00000000 883a0000 ..............:.
[ 5.139161] [Hardware Error]: 00000260: 00000000 00000000 00000000 00000000 ................
[ 5.147860] [Hardware Error]: 00000270: 00000000 00000000 00000000 00000000 ................
[ 5.156556] [Hardware Error]: 00000280: 00000000 00000000 00000000 00000000 ................
[ 5.165251] [Hardware Error]: 00000290: 00000000 00000000 00000000 00040000 ................
[ 5.173949] [Hardware Error]: 000002a0: f1811b00 08080838 00000000 000000c0 ....8...........
[ 5.182645] [Hardware Error]: 000002b0: 883b0000 883a0000 88410000 883f0000 ..;...:...A...?.
[ 5.191343] [Hardware Error]: 000002c0: 88400000 88410000 88400000 08000000 ..@...A...@.....
[ 5.200038] [Hardware Error]: 000002d0: 08000000 c0030000 c0030000 00030703 ................
[ 5.208734] [Hardware Error]: 000002e0: 40030000 00030703 00030703 00030703 ...@............
[ 5.217431] [Hardware Error]: 000002f0: 00030703 00030703 00030703 00030703 ................
[ 5.226130] [Hardware Error]: 00000300: 00030703 00030303 0003ff07 0003ff07 ................
[ 5.234824] [Hardware Error]: 00000310: 0003ff07 0003ff07 0003ff07 07000000 ................
[ 5.243524] [Hardware Error]: 00000320: deadbeef deadbeef 00000000 deadbeef ................
[ 5.252222] [Hardware Error]: 00000330: deadbeef 00a01330 024ac812 00a51ee9 ....0.....J.....
[ 5.260919] [Hardware Error]: 00000340: 0c006003 00000000 00000000 00000000 .`..............
[ 5.269617] [Hardware Error]: 00000350: 00000000 01f8af92 00000000 00000000 ................
[ 5.278313] [Hardware Error]: 00000360: 00000000 fe0001dc 00000000 00002011 ............. ..
[ 5.287011] [Hardware Error]: 00000370: 0000fbff 00000000 deadbeef deadbeef ................
[ 5.295707] [Hardware Error]: 00000380: c0000034 00000000 00000100 00000000 4...............
[ 5.304405] [Hardware Error]: 00000390: 60000000 00000000 00000000 00000011 ...`............
[ 5.313100] [Hardware Error]: 000003a0: 0000ffff 03030000 00000000 00000000 ................
[ 5.321797] [Hardware Error]: 000003b0: 00000000 03030303 00000104 03000003 ................
[ 5.330494] [Hardware Error]: 000003c0: 00000000 00000000 00000000 03030303 ................
[ 5.339192] [Hardware Error]: 000003d0: 00000105 00000000 00046172 00000000 ........ra......
[ 5.347886] [Hardware Error]: 000003e0: 00000000 00000000 00000000 00000000 ................
[ 5.356582] [Hardware Error]: 000003f0: 00000000 00000000 00000000 00000000 ................
[ 5.365279] [Hardware Error]: 00000400: 00000000 00000000 00000000 00000000 ................
[ 5.373970] [Hardware Error]: 00000410: 00000000 00000000 00000000 00000000 ................
[ 5.382666] [Hardware Error]: 00000420: 00000000 00000000 00000000 00000000 ................
[ 5.391360] [Hardware Error]: 00000430: 00000000 00000001 00000000 40000001 ...............@
[ 5.400058] [Hardware Error]: 00000440: 00000000 3c000000 00000000 00000000 .......<........
[ 5.408753] [Hardware Error]: 00000450: 80000086 00000038 fef00040 00000000 ....8...@.......
[ 5.417449] [Hardware Error]: 00000460: 40000001 00000000 3c000000 00000000 ...@.......<....
[ 5.426148] [Hardware Error]: 00000470: 00000000 80000086 00000078 fef00000 ........x.......
[ 5.434843] [Hardware Error]: 00000480: 00000000 40000001 00000000 3c000000 .......@.......<
[ 5.443540] [Hardware Error]: 00000490: 00000000 00000000 80000086 00000038 ............8...
[ 5.452237] [Hardware Error]: 000004a0: fef00240 00000000 40000001 00000000 @..........@....
[ 5.460932] [Hardware Error]: 000004b0: 3c000000 00000000 00000000 80000086 ...<............
[ 5.469630] [Hardware Error]: 000004c0: 00000078 fef00200 00000000 40000001 x..............@
[ 5.478326] [Hardware Error]: 000004d0: 00000000 3c000000 00000000 00000000 .......<........
[ 5.487023] [Hardware Error]: 000004e0: 00000000 00000000 00000000 00000000 ................
[ 5.495719] [Hardware Error]: 000004f0: 40000001 00000000 3c000000 00000000 ...@.......<....
[ 5.504413] [Hardware Error]: 00000500: 00000000 00000000 00000000 00000000 ................
[ 5.513113] [Hardware Error]: 00000510: 00000000 00000000 00000000 00000000 ................
[ 5.521810] [Hardware Error]: 00000520: 00000000 00200000 00000000 00000000 ...... .........
[ 5.530504] [Hardware Error]: 00000530: 00000000 00000000 00000000 00000000 ................
[ 5.539202] [Hardware Error]: 00000540: 00000000 00000000 00200000 00000000 .......... .....
[ 5.547899] [Hardware Error]: 00000550: 00000000 00000000 00000000 deadbeef ................
[ 5.556595] [Hardware Error]: 00000560: deadbeef 03000f43 deadbeef 00000000 ....C...........
[ 5.565292] [Hardware Error]: 00000570: 00000000 03000f43 deadbeef deadbeef ....C...........
[ 5.573990] [Hardware Error]: 00000580: 21000625 deadbeef 00000100 00000100 %..!............
[ 5.582686] [Hardware Error]: 00000590: 03000f43 deadbeef deadbeef 00000000 C...............
[ 5.591384] [Hardware Error]: 000005a0: deadbeef 00000000 00000000 03000f43 ............C...
[ 5.600080] [Hardware Error]: 000005b0: deadbeef deadbeef 03000f43 deadbeef ........C.......
[ 5.608778] [Hardware Error]: 000005c0: 00000000 00000000 03000f43 deadbeef ........C.......
[ 5.617474] [Hardware Error]: 000005d0: deadbeef 21000625 deadbeef 00001f00 ....%..!........
[ 5.626172] [Hardware Error]: 000005e0: 00000000 03000f43 deadbeef deadbeef ....C...........
[ 5.634867] [Hardware Error]: 000005f0: 00000000 deadbeef 00000000 00000000 ................
[ 5.643566] [Hardware Error]: 00000600: 03000f43 deadbeef deadbeef 03000f43 C...........C...
[ 5.652262] [Hardware Error]: 00000610: deadbeef 00000000 00000000 03000f43 ............C...
[ 5.660959] [Hardware Error]: 00000620: deadbeef deadbeef 21000605 deadbeef ...........!....
[ 5.669658] [Hardware Error]: 00000630: 00000000 00000000 03000f43 deadbeef ........C.......
[ 5.678352] [Hardware Error]: 00000640: deadbeef 00000000 deadbeef 00000000 ................
[ 5.687050] [Hardware Error]: 00000650: 00000000 21000615 deadbeef deadbeef .......!........
[ 5.695746] [Hardware Error]: 00000660: 03000f43 deadbeef 00000000 00000000 C...............
[ 5.704439] [Hardware Error]: 00000670: 00000000 deadbeef deadbeef 21000625 ............%..!
[ 5.713136] [Hardware Error]: 00000680: deadbeef 00000000 00000172 21000625 ........r...%..!
[ 5.721833] [Hardware Error]: 00000690: deadbeef deadbeef 00000000 deadbeef ................
[ 5.730529] [Hardware Error]: 000006a0: 00100000 00900000 00000000 deadbeef ................
[ 5.739224] [Hardware Error]: 000006b0: deadbeef deadbeef 00000006 00000061 ............a...
[ 5.747921] [Hardware Error]: 000006c0: 21000645 deadbeef deadbeef deadbeef E..!............
[ 5.756618] [Hardware Error]: 000006d0: 00000000 00000000 00000000 deadbeef ................
[ 5.765315] [Hardware Error]: 000006e0: deadbeef deadbeef 00000000 20000000 ...............
[ 5.774011] [Hardware Error]: 000006f0: 21000615 deadbeef deadbeef deadbeef ...!............
[ 5.782708] [Hardware Error]: 00000700: 00000000 00000000 00000000 deadbeef ................
[ 5.791403] [Hardware Error]: 00000710: deadbeef deadbeef 00000000 21000615 ...............!
[ 5.800098] [Hardware Error]: 00000720: deadbeef deadbeef deadbeef 00000000 ................
[ 5.808795] [Hardware Error]: 00000730: deadbeef deadbeef 21000615 deadbeef ...........!....
[ 5.817493] [Hardware Error]: 00000740: deadbeef 00000000 deadbeef deadbeef ................
[ 5.826189] [Hardware Error]: 00000750: 21000635 deadbeef deadbeef 00000000 5..!............
[ 5.834886] [Hardware Error]: 00000760: deadbeef deadbeef 21000605 deadbeef ...........!....
[ 5.843579] [Hardware Error]: 00000770: deadbeef 00000000 deadbeef deadbeef ................
[ 5.852276] [Hardware Error]: 00000780: deadbeef deadbeef deadbeef deadbeef ................
[ 5.860974] [Hardware Error]: 00000790: deadbeef deadbeef deadbeef deadbeef ................
[ 5.869670] [Hardware Error]: 000007a0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.878366] [Hardware Error]: 000007b0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.887063] [Hardware Error]: 000007c0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.895760] [Hardware Error]: 000007d0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.904456] [Hardware Error]: 000007e0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.913154] [Hardware Error]: 000007f0: deadbeef deadbeef deadbeef deadbeef ................
[ 5.921852] [Hardware Error]: 00000800: deadbeef deadbeef 00200172 deadbeef ........r. .....
[ 5.930550] [Hardware Error]: 00000810: 17980800 17980800 deadbeef 17980800 ................
[ 5.939247] [Hardware Error]: 00000820: 17980800 deadbeef 17980800 17980800 ................
[ 5.947943] [Hardware Error]: 00000830: deadbeef 17980800 17980800 deadbeef ................
[ 5.956639] [Hardware Error]: 00000840: 17980800 deadbeef 17980800 deadbeef ................
[ 5.965339] [Hardware Error]: 00000850: 17980800 deadbeef 17980800 deadbeef ................
[ 5.974035] [Hardware Error]: 00000860: deadbeef deadbeef deadbeef deadbeef ................
[ 5.982730] [Hardware Error]: 00000870: deadbeef deadbeef deadbeef 00000000 ................
[ 5.991428] [Hardware Error]: 00000880: 00000000 00000000 00000000 00000000 ................
[ 6.000125] [Hardware Error]: 00000890: 00000000 00000000 00000000 00000000 ................
[ 6.008822] [Hardware Error]: 000008a0: 00000000 00000000 00000000 0401e003 ................
[ 6.017519] [Hardware Error]: 000008b0: 00000198 2a2f0322 000072d6 0000003c ...."./*.r..<...
[ 6.026214] [Hardware Error]: 000008c0: 00000003 00000003 00000003 80050033 ............3...
[ 6.034911] [Hardware Error]: 000008d0: 00770ee0 979ab000 00007f3a bf7d6005 ..w.....:....`}.
[ 6.043608] [Hardware Error]: 000008e0: 00000002 00000d01 002b0033 00000000 ........3.+.....
[ 6.052306] [Hardware Error]: 000008f0: 00000000 00000202 00000000 b2801313 ................
[ 6.061000] [Hardware Error]: 00000900: 00007f3a 00000000 00000000 00000000 :...............
[ 6.069700] [Hardware Error]: 00000910: 00000000 00000000 00000000 00000000 ................
[ 6.078397] [Hardware Error]: 00000920: 00000000 00000000 00000000 00000000 ................
[ 6.087092] [Hardware Error]: 00000930: 00000000 00000000 00000000 00000000 ................
[ 6.095791] [Hardware Error]: 00000940: 00000000 00000000 00000000 00000000 ................
[ 6.104489] [Hardware Error]: 00000950: 00000000 00000000 00000000 00000000 ................
[ 6.113186] [Hardware Error]: 00000960: 00000000 00000000 00000000 00000000 ................
[ 6.121882] [Hardware Error]: 00000970: 00000000 00000000 00000000 00000000 ................
[ 6.130578] [Hardware Error]: 00000980: 00000000 00000000 00000000 00000000 ................
[ 6.139279] [Hardware Error]: 00000990: 00000000 00000000 00000000 00000000 ................
[ 6.147971] [Hardware Error]: 000009a0: 00000000 00000000 00000000 00000000 ................
[ 6.156668] [Hardware Error]: 000009b0: 00000000 00000000 00000000 00000000 ................
[ 6.165366] [Hardware Error]: 000009c0: 00000000 00000fff 00000000 00000000 ................
[ 6.174066] [Hardware Error]: 000009d0: 00000000 00000000 00000000 00000000 ................
[ 6.182761] [Hardware Error]: 000009e0: 00000000 00000001 00000000 00000000 ................
[ 6.191458] [Hardware Error]: 000009f0: 00000000 00000000 00000000 00000000 ................
[ 6.200155] [Hardware Error]: 00000a00: 00000000 00000007 00000000 00000000 ................
[ 6.208852] [Hardware Error]: 00000a10: 00000000 00000000 00000000 00000000 ................
[ 6.217546] [Hardware Error]: 00000a20: 00000000 0000003f 00000000 00800400 ....?...........
[ 6.226244] [Hardware Error]: 00000a30: be000000 b2801313 00007f3a b2801313 ........:.......
[ 6.234941] [Hardware Error]: 00000a40: 00007f3a 0401e003 02000198 2a65e5d2 :.............e*
[ 6.243637] [Hardware Error]: 00000a50: 000072d6 0000003c 00000005 00000002 .r..<...........
[ 6.252334] [Hardware Error]: 00000a60: 0000000b 80050033 00770ee0 599a8000 ....3.....w....Y
[ 6.261029] [Hardware Error]: 00000a70: 00007efc 1360a004 00000004 00000d01 .~....`.........
[ 6.269725] [Hardware Error]: 00000a80: 00180010 00000000 00000000 00000046 ............F...
[ 6.278421] [Hardware Error]: 00000a90: 00000000 a6b4782b ffffffff 00000000 ....+x..........
[ 6.287120] [Hardware Error]: 00000aa0: 00000000 00000000 00000000 00000000 ................
[ 6.295816] [Hardware Error]: 00000ab0: 00000000 00000000 00000000 00000000 ................
[ 6.304514] [Hardware Error]: 00000ac0: 00000000 00000000 00000000 00000000 ................
[ 6.313210] [Hardware Error]: 00000ad0: 00000000 00000000 00000000 00000000 ................
[ 6.321905] [Hardware Error]: 00000ae0: 00000000 00000000 00000000 00000000 ................
[ 6.330604] [Hardware Error]: 00000af0: 00000000 00000000 00000000 00000000 ................
[ 6.339297] [Hardware Error]: 00000b00: 00000000 00000000 00000000 00000000 ................
[ 6.347996] [Hardware Error]: 00000b10: 00000000 00000000 00000000 00000000 ................
[ 6.356690] [Hardware Error]: 00000b20: 00000000 00000000 00000000 00000000 ................
[ 6.365388] [Hardware Error]: 00000b30: 00000000 00000000 00000000 00000000 ................
[ 6.374084] [Hardware Error]: 00000b40: 00000000 00000000 00000000 00000000 ................
[ 6.382779] [Hardware Error]: 00000b50: 00000000 00000000 00000000 00000fff ................
[ 6.391473] [Hardware Error]: 00000b60: 00000000 00000000 00000000 00000000 ................
[ 6.400171] [Hardware Error]: 00000b70: 00000000 00000000 00000000 00000001 ................
[ 6.408867] [Hardware Error]: 00000b80: 00000000 00000000 00000000 00000000 ................
[ 6.417562] [Hardware Error]: 00000b90: 00000000 00000000 00000000 00000007 ................
[ 6.426258] [Hardware Error]: 00000ba0: 00000000 00000000 00000000 00000000 ................
[ 6.434955] [Hardware Error]: 00000bb0: 00000000 00000000 00000000 0000003f ............?...
[ 6.443650] [Hardware Error]: 00000bc0: 00000000 00800400 be000000 b2801313 ................
[ 6.452348] [Hardware Error]: 00000bd0: 00007f3a b2801313 00007f3a 00001400 :.......:.......
[ 6.461041] [Hardware Error]: 00000be0: 00000094 00003180 00002b80 00003180 .....1...+...1..
[ 6.469741] [Hardware Error]: 00000bf0: 00002b88 00003180 00002b8c 00003180 .+...1...+...1..
[ 6.478436] [Hardware Error]: 00000c00: 00002b84 00003180 00002b8a 00002fcb .+...1...+.../..
[ 6.487134] [Hardware Error]: 00000c10: 00001786 00003180 00002b94 00003180 .....1...+...1..
[ 6.495829] [Hardware Error]: 00000c20: 00002b96 000043a7 00000790 00003180 .+...C.......1..
[ 6.504525] [Hardware Error]: 00000c30: 00002b98 00002a1c 000007f6 00002d4b .+...*......K-..
[ 6.513223] [Hardware Error]: 00000c40: 00002810 00002d4b 0000280a 00002d4b .(..K-...(..K-..
[ 6.521919] [Hardware Error]: 00000c50: 00002812 000029e7 000001c6 00002616 .(...).......&..
[ 6.530615] [Hardware Error]: 00000c60: 00003bc1 00003180 00002b83 000043a7 .;...1...+...C..
[ 6.539318] [Hardware Error]: 00000c70: 00000789 00003180 00002b8b 00003180 .....1...+...1..
[ 6.548008] [Hardware Error]: 00000c80: 00002b85 00003180 00002b93 00003180 .+...1...+...1..
[ 6.556706] [Hardware Error]: 00000c90: 00002b8d 00003180 00002b95 00003e49 .+...1...+..I>..
[ 6.565402] [Hardware Error]: 00000ca0: 0000038f 00003180 00002b97 00002a1c .....1...+...*..
[ 6.574100] [Hardware Error]: 00000cb0: 000007f7 0000337d 00003233 0000337d ....}3..32..}3..
[ 6.582797] [Hardware Error]: 00000cc0: 00003235 00002eb8 00003857 00003187 52......W8...1..
[ 6.591491] [Hardware Error]: 00000cd0: 00001683 0000293a 00003e85 01400080 ....:)...>....@.
[ 6.600189] [Hardware Error]: 00000ce0: 01400280 018c5150 018c4440 0188095c ..@.PQ..@D..\...
[ 6.608885] [Hardware Error]: 00000cf0: 01880468 00020090 00020d40 00820150 h.......@...P...
[ 6.617583] [Hardware Error]: 00000d00: 00820d40 00020090 00020d40 00020150 @.......@...P...
[ 6.626279] [Hardware Error]: 00000d10: 00020840 00020018 00020d40 00020018 @.......@.......
[ 6.634975] [Hardware Error]: 00000d20: 00020d40 00020150 00020840 00020018 @...P...@.......
[ 6.643672] [Hardware Error]: 00000d30: 00020d40 00020018 00020d40 00020018 @.......@.......
[ 6.652366] [Hardware Error]: 00000d40: 00020d40 00020150 00020d40 00020018 @...P...@.......
[ 6.661065] [Hardware Error]: 00000d50: 00020d40 010c3890 010c0460 00036010 @....8..`....`..
[ 6.669760] [Hardware Error]: 00000d60: 00034540 0188195c 01880468 00020090 @E..\...h.......
[ 6.678457] [Hardware Error]: 00000d70: 00020840 00020150 00020d40 00820150 @...P...@...P...
[ 6.687155] [Hardware Error]: 00000d80: 00820d40 00020018 00020d40 00020150 @.......@...P...
[ 6.695850] [Hardware Error]: 00000d90: 00020d40 00020018 00020d40 00020150 @.......@...P...
[ 6.704546] [Hardware Error]: 00000da0: 00020840 00020018 00020d40 00020018 @.......@.......
[ 6.713245] [Hardware Error]: 00000db0: 00020d40 00020018 00020d40 00020018 @.......@.......
[ 6.721940] [Hardware Error]: 00000dc0: 00020d40 00020018 00020d40 00020090 @.......@.......
[ 6.730636] [Hardware Error]: 00000dd0: 00020840 00020090 00020d40 00000000 @.......@.......
[ 6.739332] [Hardware Error]: 00000de0: 00000000 00000000 00000000 00000000 ................
[ 6.748029] [Hardware Error]: 00000df0: 00000000 00000000 00000000 00000000 ................
[ 6.756725] [Hardware Error]: 00000e00: 00000000 00000000 00000000 00000000 ................
[ 6.765424] [Hardware Error]: 00000e10: 00000000 00000000 00000000 00000000 ................
[ 6.774120] [Hardware Error]: 00000e20: 00000000 00000000 00000000 00000000 ................
[ 6.782815] [Hardware Error]: 00000e30: 00000000 00000000 00000000 00000000 ................
[ 6.791513] [Hardware Error]: 00000e40: 00000000 00000000 00000000 00000000 ................
[ 6.800210] [Hardware Error]: 00000e50: 00000000 00000000 00000000 00000000 ................
[ 6.808910] [Hardware Error]: 00000e60: 00000000 00000000 00000000 00000000 ................
[ 6.817604] [Hardware Error]: 00000e70: 00000000 00000000 00000000 00000000 ................
[ 6.826301] [Hardware Error]: 00000e80: 00000000 00000000 00000000 00000000 ................
[ 6.834996] [Hardware Error]: 00000e90: 00000000 00000000 00000000 00000000 ................
[ 6.843693] [Hardware Error]: 00000ea0: 00000000 00000000 00000000 00000000 ................
[ 6.852389] [Hardware Error]: 00000eb0: 00000000 00000000 00000000 00000000 ................
[ 6.861083] [Hardware Error]: 00000ec0: 00000000 00000000 00000000 00000000 ................
[ 6.869778] [Hardware Error]: 00000ed0: 00000000 00000000 00000000 00000000 ................
[ 6.878473] [Hardware Error]: 00000ee0: 00000000 00000000 00000000 00000000 ................
[ 6.887170] [Hardware Error]: 00000ef0: 00000000 00000000 00000000 00000000 ................
[ 6.895866] [Hardware Error]: 00000f00: 00000000 00000000 00000000 00000000 ................
[ 6.904562] [Hardware Error]: 00000f10: 00000000 00000000 00000000 00000000 ................
[ 6.913256] [Hardware Error]: 00000f20: 00000000 00000000 00000000 00000000 ................
[ 6.921951] [Hardware Error]: 00000f30: 00000000 00000000 00000000 00000000 ................
[ 6.930648] [Hardware Error]: 00000f40: 00000000 00000000 00000000 00000000 ................
[ 6.939343] [Hardware Error]: 00000f50: 00000000 00000000 00000000 00000000 ................
[ 6.948038] [Hardware Error]: 00000f60: 00000000 00000000 00000000 00000000 ................
[ 6.956735] [Hardware Error]: 00000f70: 00000000 00000000 00000000 00000000 ................
[ 6.965430] [Hardware Error]: 00000f80: 00000000 00000000 00000000 00000000 ................
[ 6.974128] [Hardware Error]: 00000f90: 00000000 00000000 00000000 00000000 ................
[ 6.982824] [Hardware Error]: 00000fa0: 00000000 00000000 00000000 00000000 ................
[ 6.991521] [Hardware Error]: 00000fb0: 00000000 00000000 00000000 00000000 ................
[ 7.000216] [Hardware Error]: 00000fc0: 00000000 00000000 00000000 00000000 ................
[ 7.008912] [Hardware Error]: 00000fd0: 00000000 00000000 00000000 00000000 ................
[ 7.017608] [Hardware Error]: 00000fe0: 00000000 00000000 00000000 00000000 ................
[ 7.026306] [Hardware Error]: 00000ff0: 00000000 00000000 00000000 00000000 ................
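
In case anyone wants to look at the raw bytes rather than the printed words, here is a minimal Python sketch that rebuilds each record from hex-dump rows like the ones above. The function name `parse_bert_dump` is made up for illustration. It assumes the 32-bit words are printed in little-endian host order, which the ASCII column seems to confirm (the word `fe013d78` shows up as `x=..`). It does not try to interpret the record contents, since the layout of this firmware record is not something I know.

```python
import re

# One hex-dump row as it appears in the dmesg output, for example:
# "[Hardware Error]: 00000000: 1100c101 00000080 00000000 fe013d78 ..."
ROW = re.compile(
    r"\[Hardware Error\]:\s+([0-9a-f]{8}):((?:\s+[0-9a-f]{8}){1,4})"
)


def parse_bert_dump(lines):
    """Return a list of byte strings, one per dumped record.

    A new record is started whenever a row with offset 00000000 is seen.
    Assumption: each 8-digit group is a little-endian 32-bit word.
    """
    records = []
    for line in lines:
        match = ROW.search(line)
        if match is None:
            continue  # not a hex-dump row (header, identifier, etc.)
        offset = int(match.group(1), 16)
        if offset == 0:
            records.append(bytearray())
        if not records:
            continue  # row seen before any offset-0 row; skip it
        for word in match.group(2).split():
            records[-1] += int(word, 16).to_bytes(4, "little")
    return [bytes(record) for record in records]
```

Feeding it the lines of saved `dmesg` output should give one byte string per "Error N" section, which can then be written to a file and inspected with a hex editor.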

And I've seen three different types of crashes so far:

[ 1845.677713] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1845.683869] rcu: 7-...0: (14 ticks this GP) idle=dfa/1/0x4000000000000000 softirq=62174/62174 fqs=1288
[ 1845.693669] (detected by 3, t=5252 jiffies, g=239605, q=1208)
[ 1845.699733] Sending NMI from CPU 3 to CPUs 7:
[ 1848.352177] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[ 1848.352177] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[ 1849.386424] Shutting down cpus with NMI
[ 1849.386425] Kernel Offset: 0x36200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

[46536.264594] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[46536.264596] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[46537.298843] Shutting down cpus with NMI
[46537.298843] Kernel Offset: 0x25600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

[ 5936.397108] NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
[ 5936.397110] Modules linked in: ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E) jfs(E) xfs(E) overlay(E) bridge(E) stp(E) intel_rapl_msr(E) llc(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_sof_pci_intel_tgl(E) ghash_clmulni_intel(E) binfmt_misc(E) snd_sof_intel_hda_common(E) soundwire_intel(E) soundwire_generic_allocation(E) soundwire_cadence(E) snd_sof_intel_hda(E) snd_sof_pci(E) snd_sof_xtensa_dsp(E) snd_sof(E) snd_soc_hdac_hda(E) snd_hda_ext_core(E) aesni_intel(E) snd_hda_codec_hdmi(E) snd_soc_acpi_intel_match(E) libaes(E) snd_soc_acpi(E) nft_masq(E) crypto_simd(E) snd_soc_core(E) cryptd(E) snd_hda_codec_realtek(E) snd_compress(E) intel_cstate(E) intel_uncore(E) soundwire_bus(E) snd_hda_codec_generic(E) eeepc_wmi(E) ledtrig_audio(E) pcspkr(E) uvcvideo(E) asus_wmi(E) snd_hda_intel(E) battery(E) snd_intel_dspcfg(E) videobuf2_vmalloc(E) nft_chain_nat(E) snd_intel_sdw_acpi(E) sparse_keymap(E) videobuf2_memops(E) efi_pstore(E) rfkill(E) nf_nat(E)
[ 5936.397127] snd_usb_audio(E) wmi_bmof(E) iTCO_wdt(E) snd_hda_codec(E) videobuf2_v4l2(E) intel_pmc_bxt(E) snd_usbmidi_lib(E) videobuf2_common(E) snd_hda_core(E) snd_rawmidi(E) iTCO_vendor_support(E) nft_ct(E) ee1004(E) watchdog(E) nls_ascii(E) snd_seq_device(E) snd_hwdep(E) nls_cp437(E) snd_pcm(E) nf_conntrack(E) videodev(E) snd_timer(E) vfat(E) nf_defrag_ipv6(E) fat(E) nf_defrag_ipv4(E) mc(E) joydev(E) snd(E) soundcore(E) mei_me(E) sg(E) mei(E) evdev(E) intel_pmc_core(E) acpi_tad(E) acpi_pad(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) nf_tables(E) nfnetlink(E) fuse(E) configfs(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) sd_mod(E) cdrom(E) dm_mod(E) ahci(E) i915(E) libahci(E)
[ 5936.397149] nvme(E) i2c_algo_bit(E) xhci_pci(E) nvme_core(E) e1000e(E) drm_kms_helper(E) t10_pi(E) crc_t10dif(E) crc32_pclmul(E) ptp(E) xhci_hcd(E) crct10dif_generic(E) intel_lpss_pci(E) crc32c_intel(E) cec(E) i2c_i801(E) libata(E) crct10dif_pclmul(E) pps_core(E) intel_lpss(E) i2c_smbus(E) scsi_mod(E) usbcore(E) idma64(E) crct10dif_common(E) drm(E) fan(E) video(E) wmi(E) button(E)
[ 5936.397157] CPU: 2 PID: 5604 Comm: WRRende~ckend#1 Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5936.397158] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5936.397158] RIP: 0010:native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5936.397159] Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 0f 85 11 01 00 00 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 ba 00 02 00 00
[ 5936.397159] RSP: 0000:ffffacf78177fd38 EFLAGS: 00000002
[ 5936.397160] RAX: 00000000002c0101 RBX: ffffacf78177fd98 RCX: 0000000000000000
[ 5936.397160] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9951495f7450
[ 5936.397161] RBP: fffff5ae13112240 R08: ffff99605f0a8900 R09: 0000000000000000
[ 5936.397161] R10: 0000000000000011 R11: 0000000000000100 R12: 0000000000000246
[ 5936.397162] R13: 0000000000000000 R14: fff0000000000fff R15: fffff5ae131e02c0
[ 5936.397162] FS: 00007f301fafd700(0000) GS:ffff99605f080000(0000) knlGS:0000000000000000
[ 5936.397163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5936.397163] CR2: 00007f300a1bc000 CR3: 0000000257cfc003 CR4: 0000000000770ee0
[ 5936.397163] PKRU: 55555554
[ 5936.397164] Call Trace:
[ 5936.397164] _raw_spin_lock_irqsave+0x32/0x40
[ 5936.397164] lock_page_lruvec_irqsave+0x52/0x80
[ 5936.397165] __pagevec_lru_add+0x1db/0x3e0
[ 5936.397165] ? mem_cgroup_charge_statistics.constprop.0+0x21/0x50
[ 5936.397165] lru_cache_add+0x5c/0x70
[ 5936.397166] __handle_mm_fault+0xd00/0x17b0
[ 5936.397166] handle_mm_fault+0xd5/0x2b0
[ 5936.397166] do_user_addr_fault+0x1ba/0x670
[ 5936.397167] exc_page_fault+0x7b/0x160
[ 5936.397167] ? asm_exc_page_fault+0x8/0x30
[ 5936.397167] asm_exc_page_fault+0x1e/0x30
[ 5936.397168] RIP: 0033:0x7f304668aa22
[ 5936.397168] Code: fe 70 01 00 00 c7 04 3a 00 00 00 00 89 5c 3a 04 44 89 64 3a 08 0f 28 84 24 f0 02 00 00 0f 28 8c 24 00 03 00 00 0f 11 44 3a 18 <0f> 11 4c 3a 28 48 8b ac 24 10 03 00 00 48 89 6c 3a 38 48 c7 44 3a
[ 5936.397169] RSP: 002b:00007f301faf4fb0 EFLAGS: 00010202
[ 5936.397169] RAX: 00007f301ffc6b90 RBX: 000000000000002c RCX: 0000000000000000
[ 5936.397170] RDX: 00007f300a1bb000 RSI: 000000000000000b RDI: 0000000000000fd0
[ 5936.397170] RBP: 00007f301faf5570 R08: 000000003f800000 R09: 00000000000003ff
[ 5936.397171] R10: 000000000000000b R11: 00007f2fa6803688 R12: 0000000000000019
[ 5936.397171] R13: 00007f301ffc69b0 R14: 0000000000000000 R15: 0000000000000001
[ 5936.397172] Kernel panic - not syncing: Hard LOCKUP
[ 5937.439044] Shutting down cpus with NMI
[ 5937.439045] Kernel Offset: 0x38a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 5937.439046] CPU: 2 PID: 5604 Comm: WRRende~ckend#1 Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5937.439046] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5937.439047] Call Trace:
[ 5937.439047] <NMI>
[ 5937.439047] dump_stack+0x76/0x94
[ 5937.439047] panic+0x13b/0x2d7
[ 5937.439048] nmi_panic.cold+0xc/0xc
[ 5937.439048] watchdog_overflow_callback.cold+0x7c/0x7e
[ 5937.439048] __perf_event_overflow+0x83/0x1c0
[ 5937.439049] handle_pmi_common+0x205/0x2e0
[ 5937.439049] intel_pmu_handle_irq+0xec/0x310
[ 5937.439049] perf_event_nmi_handler+0x28/0x50
[ 5937.439050] nmi_handle+0x58/0x100
[ 5937.439050] default_do_nmi+0x42/0x130
[ 5937.439050] exc_nmi+0x12f/0x150
[ 5937.439051] end_repeat_nmi+0x16/0x55
[ 5937.439051] RIP: 0010:native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439052] Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 0f 85 11 01 00 00 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 ba 00 02 00 00
[ 5937.439052] RSP: 0000:ffffacf78177fd38 EFLAGS: 00000002
[ 5937.439053] RAX: 00000000002c0101 RBX: ffffacf78177fd98 RCX: 0000000000000000
[ 5937.439053] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9951495f7450
[ 5937.439054] RBP: fffff5ae13112240 R08: ffff99605f0a8900 R09: 0000000000000000
[ 5937.439054] R10: 0000000000000011 R11: 0000000000000100 R12: 0000000000000246
[ 5937.439055] R13: 0000000000000000 R14: fff0000000000fff R15: fffff5ae131e02c0
[ 5937.439055] ? native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439056] ? native_queued_spin_lock_slowpath+0x5e/0x1d0
[ 5937.439056] </NMI>
[ 5937.439056] _raw_spin_lock_irqsave+0x32/0x40
[ 5937.439057] lock_page_lruvec_irqsave+0x52/0x80
[ 5937.439057] __pagevec_lru_add+0x1db/0x3e0
[ 5937.439057] ? mem_cgroup_charge_statistics.constprop.0+0x21/0x50
[ 5937.439058] lru_cache_add+0x5c/0x70
[ 5937.439058] __handle_mm_fault+0xd00/0x17b0
[ 5937.439058] handle_mm_fault+0xd5/0x2b0
[ 5937.439059] do_user_addr_fault+0x1ba/0x670
[ 5937.439059] exc_page_fault+0x7b/0x160
[ 5937.439059] ? asm_exc_page_fault+0x8/0x30
[ 5937.439060] asm_exc_page_fault+0x1e/0x30
[ 5937.439060] RIP: 0033:0x7f304668aa22
[ 5937.439060] Code: fe 70 01 00 00 c7 04 3a 00 00 00 00 89 5c 3a 04 44 89 64 3a 08 0f 28 84 24 f0 02 00 00 0f 28 8c 24 00 03 00 00 0f 11 44 3a 18 <0f> 11 4c 3a 28 48 8b ac 24 10 03 00 00 48 89 6c 3a 38 48 c7 44 3a
[ 5937.439061] RSP: 002b:00007f301faf4fb0 EFLAGS: 00010202
[ 5937.439062] RAX: 00007f301ffc6b90 RBX: 000000000000002c RCX: 0000000000000000
[ 5937.439062] RDX: 00007f300a1bb000 RSI: 000000000000000b RDI: 0000000000000fd0
[ 5937.439063] RBP: 00007f301faf5570 R08: 000000003f800000 R09: 00000000000003ff
[ 5937.439063] R10: 000000000000000b R11: 00007f2fa6803688 R12: 0000000000000019
[ 5937.439064] R13: 00007f301ffc69b0 R14: 0000000000000000 R15: 0000000000000001
[ 5937.439064] NMI watchdog: Watchdog detected hard LOCKUP on cpu 8
[ 5937.439065] Modules linked in: ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E) jfs(E) xfs(E) overlay(E) bridge(E) stp(E) intel_rapl_msr(E) llc(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_sof_pci_intel_tgl(E) ghash_clmulni_intel(E) binfmt_misc(E) snd_sof_intel_hda_common(E) soundwire_intel(E) soundwire_generic_allocation(E) soundwire_cadence(E) snd_sof_intel_hda(E) snd_sof_pci(E) snd_sof_xtensa_dsp(E) snd_sof(E) snd_soc_hdac_hda(E) snd_hda_ext_core(E) aesni_intel(E) snd_hda_codec_hdmi(E) snd_soc_acpi_intel_match(E) libaes(E) snd_soc_acpi(E) nft_masq(E) crypto_simd(E) snd_soc_core(E) cryptd(E) snd_hda_codec_realtek(E) snd_compress(E) intel_cstate(E) intel_uncore(E) soundwire_bus(E) snd_hda_codec_generic(E) eeepc_wmi(E) ledtrig_audio(E) pcspkr(E) uvcvideo(E) asus_wmi(E) snd_hda_intel(E) battery(E) snd_intel_dspcfg(E) videobuf2_vmalloc(E) nft_chain_nat(E) snd_intel_sdw_acpi(E) sparse_keymap(E) videobuf2_memops(E) efi_pstore(E) rfkill(E) nf_nat(E)
[ 5937.439082] snd_usb_audio(E) wmi_bmof(E) iTCO_wdt(E) snd_hda_codec(E) videobuf2_v4l2(E) intel_pmc_bxt(E) snd_usbmidi_lib(E) videobuf2_common(E) snd_hda_core(E) snd_rawmidi(E) iTCO_vendor_support(E) nft_ct(E) ee1004(E) watchdog(E) nls_ascii(E) snd_seq_device(E) snd_hwdep(E) nls_cp437(E) snd_pcm(E) nf_conntrack(E) videodev(E) snd_timer(E) vfat(E) nf_defrag_ipv6(E) fat(E) nf_defrag_ipv4(E) mc(E) joydev(E) snd(E) soundcore(E) mei_me(E) sg(E) mei(E) evdev(E) intel_pmc_core(E) acpi_tad(E) acpi_pad(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) nf_tables(E) nfnetlink(E) fuse(E) configfs(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) sd_mod(E) cdrom(E) dm_mod(E) ahci(E) i915(E) libahci(E)
[ 5937.439103] nvme(E) i2c_algo_bit(E) xhci_pci(E) nvme_core(E) e1000e(E) drm_kms_helper(E) t10_pi(E) crc_t10dif(E) crc32_pclmul(E) ptp(E) xhci_hcd(E) crct10dif_generic(E) intel_lpss_pci(E) crc32c_intel(E) cec(E) i2c_i801(E) libata(E) crct10dif_pclmul(E) pps_core(E) intel_lpss(E) i2c_smbus(E) scsi_mod(E) usbcore(E) idma64(E) crct10dif_common(E) drm(E) fan(E) video(E) wmi(E) button(E)
[ 5937.439111] CPU: 8 PID: 7339 Comm: flacparse23:sin Tainted: G U E 5.12.0-rc8-dseomn #1
[ 5937.439112] Hardware name: ASUS System Product Name/PRIME H570-PLUS, BIOS 0820 04/27/2021
[ 5937.439112] RIP: 0010:native_queued_spin_lock_slowpath+0x19f/0x1d0
[ 5937.439113] Code: c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 40 cd 02 00 48 03 04 f5 00 39 b8 ba 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 32 48 85 f6 0f 84 67 ff ff ff 0f 0d 0e
[ 5937.439114] RSP: 0018:ffffacf780a17b68 EFLAGS: 00000046
[ 5937.439114] RAX: 0000000000000000 RBX: ffffacf780a17bc8 RCX: 0000000000240000
[ 5937.439115] RDX: ffff99605f22cd40 RSI: 0000000000000005 RDI: ffff9951495f7450
[ 5937.439115] RBP: fffff5ae0442a7c0 R08: 0000000000240000 R09: 0000000000000000
[ 5937.439116] R10: 0000000000000008 R11: 0000000000000100 R12: 0000000000000246
[ 5937.439116] R13: 0000000000000000 R14: ffff9953c77ded18 R15: fffff5ae0dd1e9c0
[ 5937.439117] FS: 00007f3c3d2e7700(0000) GS:ffff99605f200000(0000) knlGS:0000000000000000
[ 5937.439117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5937.439117] CR2: 00007f2fa7d29000 CR3: 00000001f7f28005 CR4: 0000000000770ee0
[ 5937.439118] PKRU: 55555554
[ 5937.439118] Call Trace:
[ 5937.439118] _raw_spin_lock_irqsave+0x32/0x40
[ 5937.439119] lock_page_lruvec_irqsave+0x52/0x80
[ 5937.439119] __pagevec_lru_add+0x1db/0x3e0
[ 5937.439119] ? __add_to_page_cache_locked+0x19e/0x3b0
[ 5937.439120] lru_cache_add+0x5c/0x70
[ 5937.439120] add_to_page_cache_lru+0x72/0xc0
[ 5937.439120] page_cache_ra_unbounded+0x14d/0x230
[ 5937.439121] ? xas_load+0x5/0x70
[ 5937.439121] filemap_get_pages+0x209/0x5d0
[ 5937.439121] filemap_read+0xa7/0x350
[ 5937.439122] new_sync_read+0x115/0x1a0
[ 5937.439122] vfs_read+0xf4/0x180
[ 5937.439122] ksys_read+0x5f/0xe0
[ 5937.439122] do_syscall_64+0x33/0x80
[ 5937.439123] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 5937.439123] RIP: 0033:0x7f3c86ed108c
[ 5937.439124] Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 89 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 bf fc ff ff 48
[ 5937.439124] RSP: 002b:00007f3c3d2e65e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 5937.439125] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3c86ed108c
[ 5937.439126] RDX: 0000000000010000 RSI: 00007f3c2c0ea7e0 RDI: 0000000000000018
[ 5937.439126] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f3c83d919b0
[ 5937.439126] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
[ 5937.439127] R13: 00007f3c3d2e7680 R14: 0000000000010000 R15: 0000000003080ed0

 

David_G_Intel
Moderator

Hello dseomn

Thank you for posting on the Intel® communities.

Please provide more information for this request:

  • Which troubleshooting steps did you try already?
  • What RAM do you use on this system? (Part number)
  • Does this happen with a different operating system?
  • Do you have the chance to test the processor on another system?

 

Regards, 

David G 

Intel Customer Support Technician 


dseomn
Beginner

I ran MemTest86 version 9.0 for 4 full passes (about 48 hours) with no errors. I then ran it a second time, but I think I cancelled that run after 1-2 passes (still no errors). I tried to trigger the crash predictably by running stress-ng, but that didn't cause one. I also tried a bunch of combinations of kernel parameters, but so far none of them has been stable for more than about 2 days. I also wanted to try out https://www.intel.com/content/www/us/en/support/articles/000005567/processors.html but it doesn't list support for my processor.

It has 4 sticks (2 packs) of "G.SKILL Ripjaws V Series 32GB (2 x 16GB) 288-Pin DDR4 SDRAM DDR4 2133 (PC4 17000) Intel Z170 Platform / Intel X99 Platform Desktop Memory Model F4-2133C15D-32GVR". Note that the RAM is DDR4-2133, but https://ark.intel.com/content/www/us/en/ark/products/212277/intel-core-i5-11500-processor-12m-cache-up-to-4-60-ghz.html says the processor supports DDR4-3200. Could running slower RAM than the processor supports cause a problem? I thought I had ruled out memory problems with MemTest86, but I'm not sure.

I don't have a Windows license, and I'd rather not pay for one just for this. Some people mentioned that it might be possible to get a shell in the Windows installer without a license though. Is that worth trying?

I don't have any other motherboards that are compatible with this processor.

Also, I have never overclocked any of these components.

n_scott_pearson
Super User

You don't have an old Windows 7, Windows 8 or Windows 8.1 license? You do know that you can still use them to install Windows 10 from scratch for free, right?

Just saying,

...S

dseomn
Beginner

Nope, newest Windows license I have is for XP, but I'm using that for a virtual machine, and I'd also be a bit surprised if it were even possible to install Windows XP on 2021 hardware. I might be able to dig up a Windows ME or 98 license, or DOS 5, but those are even less likely to work or be useful.

dseomn
Beginner

It's been stable for three and a half days now with some power saving features disabled by the "processor.max_cstate=0 intel_idle.max_cstate=0" flags:

dseomn@solaria:~$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-5.12.0-rc8-dseomn root=/dev/mapper/solaria--vg--ssd-root ro i915.force_probe=4c8a log_buf_len=1M nmi_watchdog=panic,1 sysctl.kernel.sysrq=1 console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
dseomn@solaria:~$ uptime
12:44:07 up 3 days, 13:47, 2 users, load average: 0,51, 0,82, 1,00 

For a 65 W processor, that's really more of a workaround than a fix. It also looks like some other Intel processors have had similar issues in the past (https://en.wikipedia.org/wiki/Silvermont#Erratum). Is there any way to tell whether this is a kernel bug, a microcode bug, a bad processor, or something else?

dseomn
Beginner

Shoot, it just crashed again, after about 32 hours, with the kernel command line and log output below.

BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel intel_idle.states_off=0xfffffffc splash 
[114080.948951] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[114080.955184] rcu: 7-...0: (1 GPs behind) idle=91e/1/0x4000000000000000 softirq=2730890/2730891 fqs=1660
[114080.965050] (detected by 6, t=5256 jiffies, g=7246105, q=609)
[114080.971171] Sending NMI from CPU 6 to CPUs 7:
[114083.984914] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[114085.017995] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[114085.017995] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[114085.017995] Shutting down cpus with NMI
[114085.017996] Kernel Offset: 0xc200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

From reading https://www.kernel.org/doc/html/latest/admin-guide/pm/cpuidle.html and https://www.kernel.org/doc/html/latest/admin-guide/pm/intel_idle.html it looks like intel_idle.states_off is recommended over the two max_cstate parameters, and I think a value of 0xfffffffc disables the same C-states as the previous parameters. (I was able to confirm from /sys/devices/system/cpu/cpu0/cpuidle/state0/default_status and similar files that only state0 and state1 were enabled with either "intel_idle.states_off=0xfffffffc" or "processor.max_cstate=0 intel_idle.max_cstate=0".) However, I also switched to the latest drm-tip kernel at the same time (because the i915 driver doesn't work well on any stable kernel yet) and removed a few other parameters, so for the next boot I'll change only the C-state parameters.
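To make the bitmask reasoning above concrete, here is a minimal Python sketch that lists which idle state indices a given `intel_idle.states_off` value leaves enabled. It assumes, as the intel_idle documentation linked above describes, that bit i of the mask corresponds to the sysfs directory `state<i>`. The function name `enabled_states` and the `num_states=4` value are made up for illustration; the real number of states depends on the processor and should be read from the `cpuidle` directory in sysfs.

```python
def enabled_states(states_off: int, num_states: int) -> list[int]:
    """Return the idle state indices that the mask leaves enabled by default.

    Assumption: bit i set in states_off means state<i> is disabled by default.
    """
    return [i for i in range(num_states) if not (states_off >> i) & 1]


if __name__ == "__main__":
    # num_states=4 is an illustrative assumption, not a value from this system.
    print(enabled_states(0xFFFFFFFC, num_states=4))  # prints [0, 1]
```

With 0xfffffffc only bits 0 and 1 are clear, which matches the observation that only state0 and state1 were enabled.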

In case it helps anybody else looking into something similar, I found "Supporting 11th Generation Intel® Core™ Processor Families for Desktop Platform, formerly known as Rocket Lake" volume 1 section 4.2.2 "Low-Power Idle States" from https://www.intel.com/content/www/us/en/products/docs/processors/core/core-technical-resources.html really useful for understanding what this specific processor is actually doing at the various c-states.

Also, I've noticed that logical CPUs 1 and 7 seem to be mentioned in most of the crashes. https://www.kernel.org/doc/Documentation/x86/topology.txt says "Many BIOSes enumerate all threads 0 first and then all threads 1" which would indicate that logical CPUs 1 and 7 (on a 1-socket 6-core 12-thread system) are in the same physical core. Is there any reason some things would tend to run on those logical CPUs more than others, or is that indicative of a defective core?
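One way to check the impression that CPUs 1 and 7 keep showing up is to tally the CPU lists in the "CPUs not responding to MCE broadcast" lines across saved crash logs. The following Python sketch does that; the function names are hypothetical, and it assumes the kernel prints the list as comma-separated numbers or ranges (for example `1,7` or `1-3,7`).

```python
import re
from collections import Counter

# Matches the mce line and captures the trailing CPU list, e.g. "1,7".
MCE_RE = re.compile(
    r"mce: CPUs not responding to MCE broadcast.*?:\s*([\d,\-]+)\s*$"
)


def parse_cpu_list(text):
    """Expand a CPU list such as '1,7' or '1-3,7' into a list of ints."""
    cpus = []
    for part in text.split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def tally_unresponsive_cpus(log_lines):
    """Count how often each logical CPU appears in MCE broadcast timeouts."""
    counts = Counter()
    for line in log_lines:
        match = MCE_RE.search(line)
        if match:
            counts.update(parse_cpu_list(match.group(1)))
    return counts


if __name__ == "__main__":
    log = [
        "[114083.984914] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7",
        "[114085.017995] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7",
    ]
    print(tally_unresponsive_cpus(log))
```

A tally like this only shows which CPUs failed to respond; it does not by itself distinguish a defective core from a workload or interrupt pattern that favors those CPUs.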

dseomn
Beginner

Yup, looks like logical CPUs 1 and 7 are on the same physical core:

dseomn@solaria:~$ cat /sys/devices/system/cpu/cpu1/topology/core_id 
1
dseomn@solaria:~$ cat /sys/devices/system/cpu/cpu7/topology/core_id
1
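The same check can be done for every logical CPU at once. This is a rough Python sketch that reads each `core_id` file and groups logical CPUs by core; the function names are made up for illustration, and it assumes a single-socket system, since `core_id` values are only unique within one package.

```python
from collections import defaultdict
from pathlib import Path


def read_core_ids(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return {logical_cpu: core_id} read from the sysfs topology files."""
    core_ids = {}
    for path in Path(sysfs_cpu_dir).glob("cpu[0-9]*/topology/core_id"):
        cpu = int(path.parent.parent.name[len("cpu"):])
        core_ids[cpu] = int(path.read_text())
    return core_ids


def group_by_core(core_ids):
    """Group logical CPUs that share a core_id (single-socket assumption)."""
    groups = defaultdict(list)
    for cpu, core in core_ids.items():
        groups[core].append(cpu)
    return {core: sorted(cpus) for core, cpus in groups.items()}


if __name__ == "__main__":
    for core, cpus in sorted(group_by_core(read_core_ids()).items()):
        print(f"core {core}: logical CPUs {cpus}")
```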
dseomn
Beginner

Somebody suggested that I look at interrupt affinity as another possible explanation for why logical CPUs 1 and 7 seem to be implicated in almost all the crashes. I have no idea if interrupt affinity is stable across reboots, but the current interrupt info is below. It looks like CPU 1 is getting acpi, mei_me, and ahci interrupts more than most other CPUs, and CPU 7 is getting ttyS0 (serial console) and eno1 (ethernet port) more than most others. No idea if any of those are relevant.

dseomn@solaria:~$ uptime
 23:06:24 up 21:48,  3 users,  load average: 2,77, 3,00, 3,13
dseomn@solaria:~$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
dseomn@solaria:~$ cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      
   0:         42          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    2-edge      timer
   4:          0          0        262          0          0          0          0       2134          0          0          0          0  IR-IO-APIC    4-edge      ttyS0
   8:          0          0          0          0          0          0          1          0          0          0          0          0  IR-IO-APIC    8-edge      rtc0
   9:          9          7          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    9-fasteoi   acpi
  14:          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC   14-fasteoi   INT34C6:00
  16:          0          0          0          0          0          0          0          0          8          0          0          0  IR-IO-APIC   16-fasteoi   i801_smbus
  27:          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC   27-fasteoi   idma64.0, i2c_designware.0
 120:          0          0          0          0          0          0          0          0          0          0          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0          0          0          0          0          0          0          0          0          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 98304-edge      aerdrv, pcie-dpc
 125:          0         43          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 360448-edge      mei_me
 126:    3441094          0          0          0          0          0          0          0          0          0          0        129  IR-PCI-MSI 327680-edge      xhci_hcd
 127:      46561      71271          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 376832-edge      ahci[0000:00:17.0]
 128:          0          0          0      16331          0          0          0          0          0          0          0          0  IR-PCI-MSI 514048-edge      snd_hda_intel:card0
 129:          0          0          0          0          0          0         72          0          0          0          0          0  IR-PCI-MSI 524288-edge      nvme0q0
 130:      67083          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 524289-edge      nvme0q1
 131:          0      57170          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 524290-edge      nvme0q2
 132:          0          0      67648          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 524291-edge      nvme0q3
 133:          0          0          0      58026          0          0          0          0          0          0          0          0  IR-PCI-MSI 524292-edge      nvme0q4
 134:          0          0          0          0      68120          0          0          0          0          0          0          0  IR-PCI-MSI 524293-edge      nvme0q5
 135:          0          0          0          0          0      57513          0          0          0          0          0          0  IR-PCI-MSI 524294-edge      nvme0q6
 136:          0          0          0          0          0          0      51393          0          0          0          0          0  IR-PCI-MSI 524295-edge      nvme0q7
 137:          0          0          0          0          0          0          0      54215          0          0          0          0  IR-PCI-MSI 524296-edge      nvme0q8
 138:          0          0          0          0          0          0          0          0      58442          0          0          0  IR-PCI-MSI 524297-edge      nvme0q9
 139:          0          0          0          0          0          0          0          0          0      51665          0          0  IR-PCI-MSI 524298-edge      nvme0q10
 140:          0          0          0          0          0          0          0          0          0          0      56664          0  IR-PCI-MSI 524299-edge      nvme0q11
 141:    4039954          0          0       2985          0          0          0          0          0          0          0          0  IR-PCI-MSI 32768-edge      i915
 142:          0          0          0          0          0          0          0          0          0          0          0      57640  IR-PCI-MSI 524300-edge      nvme0q12
 143:          0          0          0          0          0          0          0    1234705          0          0          0          0  IR-PCI-MSI 520192-edge      eno1
 NMI:          6        377        394        382        384        372        399        340        351        355        351        333   Non-maskable interrupts
 LOC:    4818648    3542369    3281426    3160353    3089317    2956860    2984196    4211558    3131966    2857497    2852176    2815124   Local timer interrupts
 SPU:          0          0          0          0          0          0          0          0          0          0          0          0   Spurious interrupts
 PMI:          6        377        394        382        384        372        399        340        351        355        351        333   Performance monitoring interrupts
 IWI:    1988693      18746      18117      17887      17481      14464      15398       9936      11229      10753      10456      10016   IRQ work interrupts
 RTR:          0          0          0          0          0          0          0          0          0          0          0          0   APIC ICR read retries
 RES:     315954     320882     121135     113525     104175     104204     104778     157055     115602      97898     107991      96309   Rescheduling interrupts
 CAL:    4848398    4956536    4709393    4778520    4837508    4771240    4914706    4439478    4504739    4622628    4577628    4500259   Function call interrupts
 TLB:    4136148    4424857    4370445    4490762    4563282    4514497    4649105    4190002    4245703    4359975    4319764    4248345   TLB shootdowns
 TRM:          0          0          0          0          0          0          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0          0          0          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0          0          0          0          0          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0          0          0          0          0          0          0          0          0   Machine check exceptions
 MCP:        253        253        253        253        253        253        253        253        253        253        253        253   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0          0          0          0          0          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0          0          0          0          0          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0          0          0          0          0          0          0          0          0   Posted-interrupt wakeup event
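Reading that table by eye is error-prone, so here is a small Python sketch that parses `/proc/interrupts` and reports, for each interrupt source, which CPU handled the most interrupts and what share of the total that was. The function names are hypothetical, and rows that carry only a single total (such as ERR and MIS) are skipped.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (cpu_names, {label: (counts, desc)})."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()
    ncpu = len(cpus)
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:ncpu]
        # Rows like ERR and MIS have one total instead of per-CPU counts.
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue
        table[label.strip()] = ([int(c) for c in counts], " ".join(fields[ncpu:]))
    return cpus, table


def top_cpu(counts):
    """Return (cpu_index, share_of_total) for the busiest CPU, or None."""
    total = sum(counts)
    if total == 0:
        return None
    cpu = max(range(len(counts)), key=counts.__getitem__)
    return cpu, counts[cpu] / total


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        cpus, table = parse_interrupts(f.read())
    for label, (counts, desc) in table.items():
        top = top_cpu(counts)
        if top is not None:
            print(f"{label:>4} {desc}: {cpus[top[0]]} ({top[1]:.0%})")
```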

 

dseomn
Beginner

It crashed again after 45 hours, with the max_cstate parameters that previously worked for three and a half days. I now suspect that the earlier run was just random chance and that it would have crashed eventually if I had let it run longer. I think my next step is to disable logical CPUs 1 and 7.

BOOT_IMAGE=/vmlinuz-5.13.0-rc1-dseomn-drm-tip-2021-05-16-7d383e16a8e1 root=/dev/mapper/solaria--vg--ssd-root ro console=ttyS0,115200n8 ignore_loglevel processor.max_cstate=0 intel_idle.max_cstate=0 splash
[161632.882221] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[161632.888423] rcu:    7-...0: (0 ticks this GP) idle=c56/1/0x4000000000000000 softirq=4144529/4144529 fqs=2617 
[161632.898554]         (detected by 4, t=5256 jiffies, g=8381009, q=766)
[161632.904653] Sending NMI from CPU 4 to CPUs 7:
[161635.955567] mce: CPUs not responding to MCE broadcast (may include false positives): 1,7
[161635.955568] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[161636.988584] Shutting down cpus with NMI
[161636.988584] Kernel Offset: 0x1ba00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
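For the planned next step of disabling logical CPUs 1 and 7, one option is the CPU hotplug interface in sysfs, where writing `0` to a CPU's `online` file takes it offline at runtime. The Python sketch below assumes that approach (the post does not say which method is intended), needs root to run, and uses hypothetical helper names.

```python
from pathlib import Path


def online_path(cpu, sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return the sysfs hotplug control file for one logical CPU."""
    return Path(sysfs_cpu_dir) / f"cpu{cpu}" / "online"


def set_cpu_online(cpu, online, sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Write 1 (online) or 0 (offline) to the CPU's control file; needs root."""
    online_path(cpu, sysfs_cpu_dir).write_text("1" if online else "0")


if __name__ == "__main__":
    # CPUs 1 and 7 are the two threads of the suspect core in this thread.
    for cpu in (1, 7):
        set_cpu_online(cpu, False)
        print(f"cpu{cpu}: {online_path(cpu).read_text().strip()}")
```

This setting does not persist across reboots, so it would have to be reapplied on every boot for a multi-day stability test.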
David_G_Intel
Moderator

We noticed that you have also contacted Intel Customer Support directly about the issue described here, and that you have an internal support case open. To avoid duplication of effort, we will close this thread, and support will continue through the internal case.


Best regards,

David G.

Intel Customer Support Technician

