July 2024 Update on Instability Reports on Intel Core 13th and 14th Gen Desktop Processors

Thomas_Hannaford
Employee
402,908 Views

*Update 7/29 regarding customer support process guidance (included below as well): https://community.intel.com/t5/Processors/Clarification-Update-on-Intel-Core-13th-14th-Gen-Desktop/m-p/1618462

 

Based on extensive analysis of Intel Core 13th/14th Gen desktop processors returned to us due to instability issues, we have determined that elevated operating voltage is causing instability issues in some 13th/14th Gen desktop processors. Our analysis of returned processors confirms that the elevated operating voltage is stemming from a microcode algorithm resulting in incorrect voltage requests to the processor.

Intel is delivering a microcode patch which addresses the root cause of exposure to elevated voltages. We are continuing validation to ensure that scenarios of instability reported to Intel regarding its Core 13th/14th Gen desktop processors are addressed. Intel is currently targeting mid-August for patch release to partners following full validation.

Intel is committed to making sure all customers who have or are currently experiencing instability symptoms on their 13th and/or 14th Gen desktop processors are supported in the exchange process.

To help streamline the support process, Intel's guidance is as follows:

  • For users who purchased 13th/14th Gen-powered desktop systems from OEM/System Integrator - please reach out to your system vendor's customer support team for further assistance.
  • For users who purchased boxed/tray 13th/14th Gen desktop processors - please reach out to Intel Customer Support for further assistance.

 

252 Replies
ChuongHoang
Beginner
1,965 Views

Sorry for my limited English. I am a user and I pay for the experience. Will the product be damaged in the future, or will it still be usable? But the point here is that customers may have been deceived once the problem was discovered. No one can accept spending a small amount of money to receive something unpleasant and frightening.

0syris
Novice
1,109 Views

This has been an issue since the release of 13th Gen in October 2022. That means my i7-13700K has been degrading for almost 2 years now. What do you plan to do to compensate customers who have had this problem for two years and now have a degraded CPU that is negatively affecting performance and the expected/advertised product lifespan? Do you plan on replacing the degraded products?

Mistyshade
Beginner
1,256 Views

Intel officially announced 2 more years of warranty for every Raptor Lake desktop processor (bulk AND boxed)

Kal1979
Beginner
1,307 Views

So?
IF they accept your RMA (if not, try again, that’s what they’ve already said; and it is still not clear whether the extension covers only boxed CPUs, because they issued 3 different statements saying 3 different things), they will give you a new defective CPU in exchange for your old defective CPU until the end of the warranty, and then you’re on your own (normally a CPU’s life is 10+ years)… so what’s the point?
And that’s without counting that the microcode update will probably slow your processor down enough to stretch its life to the end of the warranty period; who cares if you bought the CPU based on its advertised performance.
I don’t want an underperforming sh_it in place of another degrading sh_it: I want my money back.

Mr_FranCk
Novice
1,471 Views

1- When you get a replacement through RMA, the warranty period starts over. So from shipment, another 5 years.

2- I currently own 3 CPUs on the affected list, but none of them show subpar behavior or any sign of degradation. They always score the same on Cinebench, among others. Depending on your settings, your CPU may have no issue at all.

3- What the microcode is going to do is in fact limit the core voltage TO THE SPECIFIED FACTORY SETTINGS. That should not hurt efficiency. Well, it should affect it, but in a good way: the current settings often cause thermal throttling, and the new microcode should fix that.

4- 10 years? So you went from a 2nd gen CPU to a 14th gen?

Kal1979
Beginner
1,505 Views

1) I went from a 9600K, but I still have a working Core 2 Quad. So?
My 14700K, undervolted, already doesn’t perform like it did 3 months ago, but that’s not even the (only) point: it is about trust. Knowing HALF of what I know now, I’d never buy that CPU again: this is called a SCAM.


2) It’s really SO CUTE how passionately you defend a multi-billion-dollar company that HID a manufacturing defect for 2 years, and still has not clarified when it was solved or which batches of CPUs were affected… amazing.

3) Warranty policies are issued according to local laws, so they’ll be different in the USA, EU, South America, Asia, etc. In every country they’ll exploit any legal loophole the different laws offer, IF no authority holds them accountable (somewhere it will happen, somewhere else not): it will be a HUGE MESS (for consumers). The warranty thing is just a ‘cosmetic’ patch to limit the damage to their image…

4) There are people out there, A LOT OF PEOPLE, who just buy a computer and use it. They’re not used to updating the BIOS, changing voltages, changing mobo settings, etc… many of them are probably not even aware of the situation: they’ll have no clue what to do and they’ll end up screwed.

5) Do you pay for a CPU working as advertised, or for an RMA every 4, 6 or 12 months, between one BSOD and the next?

6) Do you really think the microcode update will solve the problem? LMAO
Cute!

7) Do you really believe the microcode update will not touch performance?!?
LMFAO!!!
Cutest!

8) Go to those companies that have thousands of Intel CPUs in their server farms, that have already lost hundreds of thousands of dollars due to crashes and service interruptions, and that are switching all their platforms from Intel to AMD to avoid losing more customers, and ask them what they do with the warranty extension that you trumpet so much…
Or do you think that Intel lost 30% of its stock market value in a week because some kid couldn't play Remnant II….?

Mr_FranCk
Novice
1,372 Views
Now you’re the cute one. Thousands of 13th and 14th gen CPUs in server farms? Companies that use thousands of CPUs in server farms do not use that kind of processor.

You’re pissed, I get it, but if you were a bit more knowledgeable, you would know that the voltage on those units has always been critical. Back in 2022, it was already well known in the OC communities that the 13700K & 13900K required undervolting to perform above expectations.


I am in no way giving them a free pass, but since I do know a thing or two about computing, electronics and electricity, I have little to no doubt that everything I said about the microcode is achievable. That’s if the issue they’re pointing at is the culprit. Again, I am not giving them a free pass, but lying about that could be the final nail in the coffin and could lead to a hostile takeover or bankruptcy. I doubt they hire people that dumb. Call me what you want, I really couldn’t care less.
As for “not everyone knows how to update a bios”, there are literally utilities from each board manufacturer to do so. At some point, you’ve got to RTFM. I won’t say any more since it’s pointless.
Kal1979
Beginner
1,327 Views

At the moment it’s not Intel who is pissing me off, it’s you. LOL
The first thing I did after installing my 13700K in June 2023 was undervolt it (for thermal and power consumption reasons); then I sold it and went for a 14700K because I found a good offer. First thing I did: undervolted it too and set the PL1 and PL2 limits to a reasonable value (250 W), because I was well aware that a lot of mobos had crazy, basically unlimited values in those settings, even though Intel still HAD NOT issued its ‘guidance’ and ‘default baseline’ to those mobo vendors; it was April 2024. You're not talking to a fool. I probably know more than you about Intel motherboard settings; my first overclock was on a Pentium III 800 MHz... where were you at that time?
I’m not even that pissed right now. My 14700K is still solid (3 months, we’ll see in the future), but knowing all of this I would not buy it again, and I’ve lost so much trust that I’m even afraid of the upcoming microcode update: it is sooooo blatant that Intel isn’t being clear and truthful on all these questions that, to me, the outcome of whatever they do at this point is totally unpredictable.
From a legal standpoint, Intel SCAMMED me (and everybody else): if I sell you something advertising certain characteristics, then you discover that those characteristics are not sustainable and that trying to sustain them causes issues, you’d probably sue me, and rightfully so.
With the ‘oxidized’ 13th gen CPUs they sold DEFECTIVE CPUs, KNOWING what they were doing. They know which CPUs are involved and they are hiding the info: this is not just incorrect, this is arrogantly and incredibly illegal, and if it doesn't bother you, since you keep minimizing it, my dear, you have a big problem putting unclear behavior, incorrect behavior and illegal behavior into perspective.
On the B2B side of the situation, there are a lot of articles, YouTube videos, news stories, X posts, etc. reporting development studios and online services that are having trouble with 13th & 14th gen CPUs and are switching to different solutions: I’m not inventing anything here. Intel has lost 30+% of its stock market value because it is losing partnerships and contracts for hundreds of thousands of CPUs, not because me or my cousin had a couple of BSODs playing Minecraft or your Grandma panicked and sold her $5000 of Intel stock…
I'll leave you with the pleasure of the last word, because I won't waste any more time replying to someone who denies the facts, downplays illegal practices, and who I suspect has been trolling since the first message.

P.S. Intel will NEVER face a hostile takeover or bankruptcy: it is a key industry for the US and it’s the one and only chip producer left in the West if China decides to take over Taiwan. If Intel at any point faced an existential threat, the US Congress would step in… again: it's so cute that you worry about the possible financial difficulties of a multi-billion-dollar company, but you can sleep soundly...

lucasholt
Novice
1,180 Views

While many companies would buy Xeons or a server chip from their competitor, game companies buy consumer chips for their game servers because of the high frequency. It lowers latency for players and it's cost effective at the same time. This has been widely covered by Level1Techs and others.

There are other people who run consumer chips in servers as well. I am currently using an 11700 in a VM box in my basement to run package builds for my open source OS project. I've also got a used HPE DL360 Gen9 server with dual Xeons now helping out. For the past decade, I've mostly run consumer chips from both companies and not many server parts. It's faster and cheaper to get replacement parts for consumer builds. I had an issue with a Supermicro Xeon build about 10 years ago and went consumer for a time.

A lot of people don't know how to update a bios.  My mother could never do it.  Many people who buy PCs fully assembled can't do it.   If you're a hobbyist who builds your own pc, it's something you need to learn of course. 

CMOrozco
Beginner
1,525 Views

When do we get a microcode patch delivered by Dell/Alienware?

Kal1979
Beginner
1,521 Views
Every Intel CPU will have that: you will need to update your BIOS once your vendor (ASUS, Dell, whatever) has released the BIOS update with the new microcode validated by Intel.
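
For anyone on Linux who wants to confirm that a BIOS update actually delivered the new microcode, here is a minimal sketch. It assumes an x86 Linux system where `/proc/cpuinfo` exposes a `microcode` field; the function names are made up for illustration, and the `0x129` target is simply the revision mentioned in later posts in this thread, not something Intel's announcement specifies. It also assumes that a numerically higher revision is newer for the same CPU model, which may not carry over to other processors.

```python
import re
from pathlib import Path

# Hypothetical target revision, taken from later posts in this thread.
TARGET_REVISION = 0x129


def microcode_revisions(cpuinfo_text: str) -> set[int]:
    """Return the distinct microcode revisions found in /proc/cpuinfo-style text."""
    pattern = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)\s*$", re.MULTILINE)
    return {int(match.group(1), 16) for match in pattern.finditer(cpuinfo_text)}


def has_at_least(cpuinfo_text: str, minimum: int) -> bool:
    """True if every reported revision is >= minimum (False if none is reported)."""
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) >= minimum


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        text = cpuinfo.read_text()
        found = sorted(microcode_revisions(text))
        print("microcode revisions:", [hex(r) for r in found])
        print("at or above target:", has_at_least(text, TARGET_REVISION))
    else:
        print("/proc/cpuinfo not found; this sketch only applies to Linux")
```

On Windows the revision is usually read from the BIOS setup screen or a hardware information tool instead.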
Keean
Novice
802 Views

Here's my take on what's going on, which is mostly informed speculation:

 

The oxidation issue is completely separate from any others. Chips with impurities in the metal layers will experience worse electromigration, so that the chip may fail after some time at voltages that on a 'good' chip may well cause little noticeable electromigration.

 

The other issue may well have the root cause as the eTVB error. This error, as far as I understand it, results in the CPU cores running at too high frequency when the chip is too hot. If we think about this, this will result in the chip crashing, unless the voltage is increased to compensate. What if the eTVB error caused the chips to be assigned too high VID voltages in order to stabilise the chip during testing and binning?

 

When the eTVB algorithm is corrected, the chip will no longer need such high VID voltages to avoid crashing; it will be stable at a lower voltage. The first microcode release, 0x125, fixed the eTVB algorithm but as far as I know did not alter the VID voltages. So the chips will be a lot more stable, but will now have more voltage than they need, which will still result in accelerated electromigration, and they will eventually start crashing again, but due to a different cause. Once a chip starts crashing due to electromigration, no microcode is going to fix it; you would just have to underclock it.

 

This could also be why these chips responded so well to undervolting: they actually were running at too high a voltage already. It's just that undervolting made the crashing due to the eTVB microcode error worse. However, this was only triggered by specific workloads, which appear to be when two hyperthreads are running on the same P-core but the rest are idle. This was made harder to detect because Thread Director technology in the CPU and the Windows scheduler actively try to prevent multiple hyperthreads running on the same P-core until all the P-cores are loaded with a single thread, by which time the chip may be thermally throttling, preventing eTVB from raising the clock speed anyway. So the eTVB error would only cause a crash in transient conditions when lots of threads are starting or stopping at different times.

 

So I don't expect this August microcode to result in a performance loss. If anything it will be a performance gain, as the chip will consume less power and thermally throttle less. If my guess is correct, it will also reduce the stable undervolting potential for the chip, as it will remove the out-of-the-box overvolting the chips have had up until now.

 

Anyway, this is just speculation, maybe one day Intel will explain it all to us.

Kal1979
Beginner
790 Views
So in a nutshell, you basically expect a microcode update to turn a bad product into a great one… Unbelievable… Billions and billions spent on research and manufacturing processes and, in reality, all it took was a ducking microcode update!
If only they had thought of that before…!!!
Keean
Novice
786 Views

There was a genuine error in the eTVB algorithm, that was a mistake, but why think everything else is suddenly bad? Do you think Intel forgot how to design chips? Most of that design intelligence is coded into the tools anyway, things like auto-routing and layout tools... these things don't suddenly go 'bad'. There is a lot of effort to get the design parameters right when moving to a new technology node, but "Intel 7" (10nm) has been used for several generations now, so the design parameters are well known.

Kal1979
Beginner
772 Views

Well… surely they ‘forgot’ how to test an eTVB algorithm: it took TWO ducking YEARS to get that straight (hopefully)… LOL…

Intel’s 10nm ‘new technology’ is OBSOLETE, and CPUs produced on that node are INEFFICIENT… and the proof is that Intel 15th gen will be on TSMC 3nm……..

Keean
Novice
716 Views

The eTVB problem was made harder to detect by the behaviour of Thread Director. The software was designed to optimise performance on big-little architectures, but as a side effect it avoids the very conditions that trigger the eTVB problem. Also the chips "mostly" worked just with raised VID voltages. There was definitely some QA missing around test cases for this error, and around validating the expected design voltage against the actual voltages requested in testing. I expect they validated the test voltages when first deploying the Intel 7 (10nm) fabs, and then, because the technology node was not changing, just assumed they would still be correct when moving from 12th to 13th gen, and again from 13th to 14th gen.

 

I can consistently cause a brand new 14900KS to trigger the bug on Linux by setting the CPU affinity to both hyperthreads of the same P-core and using GCC to compile a large program, but it takes these specific circumstances to trigger it. It occurs so infrequently under "normal" usage that most people would think it's a software error.
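
For anyone who wants to set up the same kind of pinned workload, here is a rough sketch of the affinity part, assuming Linux and Python 3.9+. It is not the exact procedure used above and does not by itself reproduce any bug: it reads the kernel's `thread_siblings_list` for CPU 0 and runs whatever command is given with its affinity restricted to those sibling threads. That CPU 0 is a P-core with two hyperthreads is an assumption to check on your own machine (E-cores report a single thread), and the helper names are made up for illustration.

```python
import os
import subprocess
import sys
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel CPU list such as '0-1' or '0,16' into a set of CPU numbers."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def sibling_threads(cpu: int) -> set[int]:
    """Return the logical CPUs that share a physical core with `cpu` (Linux sysfs)."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())


def run_pinned(command: list[str], cpus: set[int]) -> int:
    """Run `command` restricted to `cpus`; its child processes inherit the mask."""
    def pin() -> None:
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling (child) process
    return subprocess.run(command, preexec_fn=pin).returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: CPU 0 belongs to a P-core with two hyperthreads.
    siblings = sibling_threads(0)
    print("pinning to CPUs", sorted(siblings))
    sys.exit(run_pinned(sys.argv[1:], siblings))
```

Saved under a name of your choice (say `pin_build.py`, a hypothetical filename), it could be used as `python3 pin_build.py make -j2` inside a large source tree, so that the compiler jobs only ever run on the two threads of one core.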

 

However, a 13900KS does not show the same issue. I assume this is because a 13900KS, with 200/300MHz lower clocks, is not as close to the edge. It is stable at 1.44V and 6GHz even with the eTVB bug. There is probably some silicon lottery here, but it does not look like the 13th gen are affected by the eTVB bug as badly as the 14th gen.

 

Intel 7 is "well tested" in a glass half full kind of way. In any case I think Intel do deserve to take some flak for this; there is enough that they really have done wrong here. And yes, it could well be one line of microcode ruining a billion dollar product. For example, consider the Mars Climate Orbiter, which was lost because one piece of software was calculating in metric units and the other in imperial. Simple coding mistakes can cause billions of dollars of damage (also looking at CrowdStrike here). In a way the microcode may well be less validated because it can be changed, whereas errors in the masks used to make the chip can cost millions to put right, so a lot of validation is done.

 

There are some benchmarks now for microcode 0x129 and there are some improvements, and some losses, on the whole there is not much in it. That was comparing to 0x123. I would like to see where 0x125 fits in, is most of the performance difference between 0x123 and 0x125, and only the voltages change from 0x125 to 0x129? That would be interesting.

Mr_FranCk
Novice
484 Views

I have about the same POV and I was called a corporate lover, fanboy, etc.

To be honest, I'm still waiting a few days before updating just to see if anything peculiar pops up, but other than that, and from the benchmarks I've seen, it seems promising.

Mistyshade
Beginner
715 Views

ASUS just published the new BIOS update with microcode 0x129, but it says it only concerns non-K processors.

They also marked it "Beta", which means it may not be safe to update yet, so be careful with these updates.

I didn't check other manufacturers, but the update seems to be slowly rolling out.....

pressed_for_time
New Contributor III
636 Views

ASUS have released a Beta BIOS including the 0x129 microcode for its Z790 boards only. It works on all the processors that fit this board including the 13th and 14th gen K/KF/KS chips.

I would not have any concerns about this BIOS - the Beta status is to allow for feedback from users before releasing a final version. The few times that I have used a Beta BIOS from ASUS I have not had any issues.

The final BIOS versions will be available for all the ASUS motherboards that support Intel 12th, 13th and 14th gen processors. I guess it will only be another week or so before they arrive.

Fede59
Novice
225 Views

No, it says "The new BIOS includes Intel microcode 0x129 and adjusts the factory default settings for the non-K processors, enhancing the stability of Intel Core 13th and 14th gen desktop processors."

 

That "and" can change the meaning of the whole sentence if misread. Poor wording at such a delicate moment? Yes.

drasberry
Beginner
400 Views

BSOD crashes are rarely due to hardware issues with a processor. They can be triggered by any out-of-spec code or perceived security threats from obsolete software or device drivers. Windows 11 is particularly subject to security BSODs due to protections against executing anything perceived as potentially malicious code.

I run an obsolete video editing console built specifically for my NLE in 1991 under a 15+ year old 32 bit driver. It runs via an FTDI RS232 to USB serial adapter. It runs as COM3 in isolated port to port protocol. 

I am reverse engineering a modern replacement for these consoles. I use a software package to capture serial codes by setting up a shared serial port configuration in Windows.  I can passively monitor the polling strings and responses without issue on COM3, but any control code executed on the console immediately triggers a security BSOD due to the obsolete unregistered driver. 

I can monitor the console via the properly registered 64 bit FTDI USB driver without issue. 

Updating BIOS only may not cure some BSOD issues if the OS, motherboard chipset, NIC, GPU, audio, accessory hardware or any other system drivers and software are not up to date too.  I use a third party utility to scan and keep all drivers up to date since many of them I would not know about otherwise. On my complex setup there are more than 30 of them. 

Overclocking systems to maximum performance limits is by definition operating them beyond spec and subjecting components to thermal and voltage stresses that would not exist otherwise. So give Intel a break.

If your car had an engine failure because you over-revved it, the manufacturer would void the warranty, not give you a replacement and extend the warranty.

Intel's native turbo mode overvoltage issue is a code issue that is being addressed. Thermal damage from code-triggered out-of-spec operation, whether caused by Intel errors or user errors, is not automatically an indication of a defect in the CPU silicon.
