What is actually wrong with Raptor Lake?

NubMan55
Novice
23,616 Views
I have a 14900K that keeps getting more unstable. I have posted here about the issues before. High-end Raptor Lake chips seem to be breaking down on the daily; even CPUs used in server environments seem to have a failure rate of ~25%. Intel as a company is not giving us, the consumers, any updates on its investigation. I have been offered a replacement CPU or a refund by you guys before. I do not believe that is a good deal for me, since as of right now it seems that if I replace the CPU, I'll be needing a new one in a few months. If I get a refund, I'll be stuck with an expensive motherboard that only works with your CPUs, which seem to turn to sand in a matter of months. I'd like to receive some form of an update, as I'm getting fairly frustrated.

Please do not copy-paste the same "Intel is aware of the problems experienced in certain workloads and is working with blah blah". I require a proper answer. If the root cause is truly unknown, so be it. It seems Intel as a company has weighed the cost of silence against doing the right thing and has decided to stay silent until the next-gen release, hoping customers forget and suck up the cost.
VonM_Intel
Moderator
22,982 Views

Hi, NubMan55.

Thank you for posting in our Community.

I understand that this situation is incredibly frustrating, especially given the repeated issues and lack of updates from Intel. Intel and its partners are continuing to investigate user reports regarding instability issues on Intel Core 13th and 14th generation (K/KF/KS) desktop processors. We appreciate the Intel community’s patience on the matter and will continue to share updates on the investigation as it works toward a conclusion. In the meantime, we’re sharing an update on confirmed factors leading to the reported instability issues and Intel’s current guidance to users regarding Intel Core 13th and 14th Generation (K/KF/KS) desktop processors.

While I cannot provide a definitive answer from Intel regarding the root cause of the issues, I can ensure that your case is documented and that you are informed of any updates or changes as soon as we receive them. Intel analysis has determined a confirmed contributing factor to the instability reports on Intel Core 13th and 14th Gen (K/KF/KS) desktop processors is elevated voltage input to the processor due to previous BIOS settings which allow the processor to operate at turbo frequencies and voltages even while the processor is at a high temperature.

However, in investigating this instability issue Intel did discover a bug in the Enhanced Thermal Velocity Boost (eTVB) algorithm which can impact operating conditions for Intel Core 13th and 14th Gen (K/KF/KS) desktop processors. We have developed a patch for the eTVB bug and are working with our OEM/ODM motherboard partners to roll out the patch as part of BIOS updates ahead of July 19th, 2024. While this eTVB bug is potentially contributing to instability, it is not the root cause of the instability issue.

 

For reference, our support article collects the relevant information on 13th and 14th Gen processor stability: June 2024 Guidance regarding Intel Core 13th and 14th Gen K/KF/KS instability reports

Thank you again for your patience and for bringing this to our attention. We genuinely want to resolve this issue to your satisfaction.

 

Best regards,

Von M.

Intel Customer Support Technician


NubMan55
Novice
21,841 Views
I understand you're not at liberty to say what the current suspicion on Intel's side is as to the root cause. My following rant is NOT directed at you; I will be expressing my disappointment towards Intel as a company.

Game devs and tech YouTubers have taken datasets, extracted the affected CPUs, and hypothesized different options for the root cause. This data has been collected in less time than Intel's own investigation has taken. We as consumers have received more information about this incident from third parties than we have from Intel.

We, the consumers, were promised a proper statement months ago. I'd assume Intel has a strong suspicion as to what the root cause is; my estimate is that Intel has most likely confirmed it by now and is trying its best to do damage control. If not, it's just straight incompetence. Either way, it's terrible for the consumer. Intel being silent is VERY bad for PR. It leaves room not only for speculation, but also for research by third parties that might not be as thorough as Intel's own investigation. I believe most consumers would find relief if Intel updated us on the results of the investigation weekly, even if the results are inconclusive. There is no way Intel has not hired a very smart marketing team alongside lawyers to estimate the best course of action to save face and reduce damages. If Intel is staying silent, it leads me to believe the root cause would only be fixable via downclocking the processors so much that it would result in a lawsuit or a total recall. Therefore my current estimate is that Intel, in its silence, is lawyering up and dealing with RMAs until it runs out of chips to replace the degraded ones with, hoping the Arrow Lake release will make people forget and buy time to RMA the dead chips until warranties run out.

The answer you gave is a BS answer and you know it as well. If voltage is the issue, using the default profile that you recommended is a highway to hell, as it increases vcore a lot. For me it caused vcore to increase from a maximum of 1.42 V (light loads) to over 1.57 V. I know it's not your fault to any extent as a member of the customer service staff. You have most likely been given strict instructions as to what you can and cannot reveal. Again, I'm just sharing my frustrations with the business practices. I'd like to find out what is going on, as I bought my PC from a third party as a custom-built system, and I want to know whether my best course of action is to try to get the CPU and MOBO swapped for AMD or whether Intel is going to fix this.
CoolBook
New Contributor I
20,150 Views

You seem to know what the problem is and how to fix it. Just do it, don't rely on Intel.

Intel overclocked the top-end K SKUs out of the box as a feature. They used to only unlock K chips, handing over the overclocking responsibility to the customer. I suppose the competition made them desperate. In all fairness, the other company is not that much better in the overclocking regard. I hope they both cool off a bit - pun intended.

On top of that, Intel did a poor job with the actual overclock, probably requiring at least a 420 mm radiator for the i7 models.

Like all unstable overclocks, certain apps would crash at high load. In this case it seems UE5 games with Nvidia GPUs are the trigger. Unfortunately Intel seems to have missed that during the RPL tuning and stability testing.

To summarize: it's not all about an actual hardware issue, as in defective silicon, though the product might become even worse if pushed too hard repeatedly.

I have no idea if Intel is planning to fix the actual unstable calculations/instruction(s) with the microcode update. That combined with lower voltages might actually be a 100% fix for undamaged CPUs. The performance may dip a bit though.

All in all, I don't think the constant nagging at Intel is that constructive. So far I don't see anything wrong with the statements provided.

I own plenty of RPL systems. Not a single issue, but I would never allow them to run over 1.3 V and 75°C.

NubMan55
Novice
20,146 Views

I don't think limiting my CPU to 1.3 V is an option. Its default VID table is asking for over 1.6 V on 2 cores (1.628 V on an E-core and 1.616 V on a P-core), with most asking for 1.59 V or more.

Ender79
Beginner
17,275 Views

I have an Acer Nitro AN17-51 laptop (i7-13700H + RTX 4600). I first noticed instability while playing The First Descendant: constant crashes while compiling shaders, which is a 100% CPU job, and the game kept crashing randomly even in-game.

I reinstalled the Nvidia drivers and it got even worse, because a new driver deletes the shader cache, and for the first time the laptop crashed when it had to compile new shaders for Far Cry 6. It did succeed in compiling the Far Cry 6 shaders after 3 attempts and was OK after that, but The First Descendant keeps crashing randomly.

I found out that this is a problem with Raptor Lake CPUs, but I only own a weak i7-13700H, only 115 W and capped at 5 GHz max turbo. That means voltage/power/temperature is not really the root problem; it is deep in the architecture of the CPU.

I installed ThrottleStop in order to undervolt the CPU, but hey... only HX CPUs support that, not mine.

I only managed to lower PL1/PL2 from 115 W to 95 W and PROCHOT from the default 100 degrees to 95 degrees.

I hate that my CPU now throttles at 95 degrees and is not as fast as it was before, but at least it is 90% stable in The First Descendant: I can play the game for a few hours continuously and maybe get a crash after that.

NO, THE PROBLEM IS WIDESPREAD IN 13TH/14TH GEN RAPTOR LAKE CPUS. EVEN NON-HX LAPTOP CPUS SUFFER FROM THE SAME ISSUE!!!

I BET YOU WILL NOT SEE NEW INTEL CPU GENERATIONS RUNNING AT 6+ GHZ ANY TIME SOON!!!

patrock
Beginner
8,274 Views

Although it is said that laptop CPUs aren't affected, I have similar issues on my laptop (13700H with RTX 4060). For the first months everything was good, then there were occasional BSODs, and now I have plenty of them... I could not find the cause; I replaced the RAM, reinstalled Windows, and updated all drivers.

The BSODs come in all forms, mostly pointing at ntoskrnl.exe; I could not identify any driver issue.

It could be another hardware defect, but it seems strange...

igor-skobeliev
13,203 Views

Hi, @VonM_Intel .

I am planning to build a system based on the Intel® Xeon® E-2468. It is hard to understand: some sources say that all Raptor Lake CPUs are affected by this issue, but you mention only Intel Core 13th and 14th Generation (K/KF/KS) desktop processors.

The issue has been around for a long time, so I believe you have collected enough statistics and cases. Is the E-2468 on the affected list? Is it safe to buy this one?

Best Regards,

igor-skobeliev
8,819 Views

Hi @VonM_Intel,

Any updates here? You guys talk about transparency, but stay silent on questions like this.

Is it safe to buy the Xeon® E-2468 now?

VonM_Intel
Moderator
21,045 Views

Hi, NubMan55.

I understand your frustration completely, and I appreciate you sharing your concerns openly. It's clear that you've invested in Intel's products and expected transparency when issues arise. Intel's handling of this situation has left you feeling disappointed and uncertain about the future. Your insights about third-party information versus official communications highlight a significant gap in transparency that's affecting many consumers.

I wish I could offer more concrete answers or solutions right now. Rest assured, your feedback is valuable, and I'll ensure it's passed along appropriately. I will need to do further research and I'll coordinate with our team regarding this serious matter and post the response on this thread once available.

 

Von M.

Intel Customer Support Technician


VonM_Intel
Moderator
20,224 Views

Hi, NubMan55.

I appreciate your patience with this concern. Intel continues to work with our partners to analyze and determine proper mitigations regarding the reports of instability on Intel Core 13th and 14th Gen unlocked desktop processors in certain workloads. We will share updates on the analysis when they become available.

While I cannot provide specific details about the ongoing investigation, I assure you that we are actively working to identify and resolve the root cause. Your feedback is invaluable, and I will ensure it is communicated to the relevant teams.

Again, I apologize for any inconvenience this has caused and appreciate your understanding. If you have any further questions or need additional assistance, please do not hesitate to reach out.

Best regards,

Von M.

Intel Customer Support Technician

NubMan55
Novice
20,129 Views

Hey!

Thanks for your response. I'm curious about the fix provided by Intel. The Intel Default Profile on my MSI Z790-A Pro Wifi DDR5 motherboard has increased my vcore by approximately 100 mV compared to the motherboard's defaults. If voltage is the issue, would this not make it even worse? The default profile has AC LL = DC LL = 1.1 mΩ. Is it safe to change this to, say, 0.5 and 0.5? Intel's documentation only mentions 1.1.
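For readers unfamiliar with the AC load-line setting asked about above, here is a rough sketch of why a higher value tends to raise the requested voltage. It assumes a simplified model in which the CPU adds (load current × AC LL) to its base VID request. The base VID and load current are hypothetical numbers, not measurements from this system, and the sketch says nothing about which value is safe.

```python
def requested_voltage(base_vid_v: float, current_a: float, ac_ll_mohm: float) -> float:
    """Simplified assumption: request = base VID + current * AC load-line."""
    return base_vid_v + current_a * ac_ll_mohm / 1000.0  # mOhm -> Ohm


BASE_VID_V = 1.30      # hypothetical base VID for some frequency
LOAD_CURRENT_A = 100.0  # hypothetical load current

for ac_ll in (1.1, 0.5):
    volts = requested_voltage(BASE_VID_V, LOAD_CURRENT_A, ac_ll)
    print(f"AC LL {ac_ll} mOhm -> requested {volts:.3f} V")
```

Under these assumptions, moving from 1.1 mΩ to 0.5 mΩ lowers the request by 60 mV at 100 A. Real boards also apply DC LL, LLC and vendor-specific behaviour, so treat this only as an illustration of the direction of the effect.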

I will most likely be RMAing my CPU through the vendor I bought my PC from, but I need to know how to ensure the safety of the new CPU. Prior to troubleshooting this issue I haven't touched a BIOS on any system other than to enable XMP so I am not too familiar with what is safe to manually change in the BIOS.

If the thermal velocity boost is having an issue ATM, locking the core ratio to a set amount would be a temporary fix, right? What should I set the core ratio as? 57x? 56x? Lower?

What about the performance loss experienced with these fixes? I paid i9 money to receive a CPU that is now performing worse than a 14700k was supposed to.

CoolBook
New Contributor I
20,109 Views

There are two sides of the coin when it comes to voltage.

Too high voltage will degrade the CPU, especially when running hot.

Degradation - meaning it would require higher voltage to run stable at a specific frequency.

This would in turn make the CPU unstable at the voltages already set by Intel during manufacturing of the processor. 

1. Increasing voltage is a way of stabilizing the CPUs.

2. Updating the microcode that allowed high boost clocks/voltages, even when the CPU was hot, is a way to prevent accelerated degradation from happening.
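The two sides of the coin described above can be put into a tiny sketch. It assumes a deliberately simple model: a core is stable at a given frequency when the supplied voltage is at least its minimum stable voltage (Vmin), and degradation raises that Vmin. All voltages are hypothetical values chosen only for illustration.

```python
def is_stable(supplied_v: float, vmin_v: float) -> bool:
    """Treat a core as stable when the supplied voltage meets its Vmin."""
    return supplied_v >= vmin_v


# Hypothetical numbers for one core at one fixed frequency.
FACTORY_VOLTAGE_V = 1.35  # voltage programmed at manufacturing
VMIN_NEW_V = 1.30         # Vmin of a fresh chip
VMIN_DEGRADED_V = 1.38    # Vmin after degradation
RAISED_VOLTAGE_V = 1.40   # voltage after a "stabilizing" increase

print(is_stable(FACTORY_VOLTAGE_V, VMIN_NEW_V))       # fresh chip at factory voltage
print(is_stable(FACTORY_VOLTAGE_V, VMIN_DEGRADED_V))  # degraded chip at factory voltage
print(is_stable(RAISED_VOLTAGE_V, VMIN_DEGRADED_V))   # degraded chip at raised voltage
```

The third case is the trade-off in question: raising the voltage restores stability in this model, but higher voltage is also what the post names as a cause of further degradation.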

 

Personally I have never had the proper cooling to use the RPL i9 CPUs. I have the 13700k though.

I don't use my system for rendering or similar tasks. I want good single core performance for gaming, surfing etc.

So because of that I disable Hyper-Threading. This reduces a lot of heat and increases stability.

Then I lock all P-cores at 5.5 GHz and all E-cores at a safe frequency. After that I undervolt as much as possible while keeping everything stable. The result is an amazing system that tops out at about 200 W, which I cool with an LF2 420.

I have used this computer for over a year now, and it never crashed once (except during tuning).

Some people complain about latency or lagging in Win11. Personally I disabled core parking to make the system even snappier.

 

I'm not saying that this is what you should do, and I don't make any claims about longevity and safety. This works well for me, and I recommend people to get some guidance on overclock.net or similar forums.

 

Unfortunately the K SKUs don't really work without proper manual tuning.

NubMan55
Novice
20,093 Views

Hey!

Thanks for your response. I'm afraid to undervolt, as getting any form of acceptable behaviour would require disabling CEP. As CEP is part of the Intel Default Profile, I'm afraid disabling it would give Intel an avenue to claim the CPU was misused and refuse warranty service.

CoolBook
New Contributor I
20,080 Views

Well, then you rely on Intel to fix your problem. I suppose you are looking for a refund or some kind of compensation?

Intel made a mistake selling these products to "ordinary customers". There is actually no perfect solution to that problem.

NubMan55
Novice
20,031 Views

My ideal situation would be to get an offer to move to AMD from the place I bought my system from. If Intel offers a refund, it doesn't really do me any good, as I'll be stuck with a motherboard worth 300 euros that is useless.

bpw
Beginner
6,636 Views

Having to disable CEP for undervolting is old lore. If done right, there is no need to disable CEP or any other of the Intel protections.

However, doing this involves a bit more than just pushing down AC LL and calling it a day. Good results have been achieved with adaptive vcore plus a manual offset. Doing so will never trigger CEP clock-stretching.

If, in addition to that, AC/DC LL is to be lowered, then this has to be done in conjunction with proper LLC tuning to prevent CEP from triggering. It can be done, and many have already proven so, but the highest benefit comes from just the adaptive vcore + offset procedure above.

Definitely worth a try; it worked quite well for me (LL AC/DC = 80/110, LLC default, adaptive vcore with a manual offset of -120 mV for a 13700K).

NubMan55
Novice
20,095 Views

Adding to my last post: how will Intel handle the situation if claims such as "Get up to 6% better Cinebench 2024 multi-core performance on the Intel® Core™ i9 processor 14900K, compared to the AMD Ryzen 9 7950X processor." (as found here https://edc.intel.com/content/www/us/en/products/performance/benchmarks/desktop/) get invalidated for everything but the absolute top-bin CPUs as a result of chasing stability? Will you be hand-picking top bins and sending those to people who RMA?

I'm very frustrated that I paid a lot of money to receive a CPU that has had its performance cut down to a cheaper processor's level, and not only that, apparently my CPU has also degraded due to a microcode/BIOS bug, causing many hours of my rare spare time to be lost to troubleshooting the issue.

CoolBook
New Contributor I
19,608 Views

Hi @NubMan55 

I don't see what cooling solution Intel used for that testing.

That is usually the problem when comparing results.

bpw
Beginner
6,631 Views

That's why you'll never get an honest answer from Intel here.

The fact is that Raptor Lake was pushed to the breaking point in order to go head-to-head with AMD in benchmarks. All manufacturers will use "golden sample" processors and elaborate cooling solutions for their benchmark marketing purposes.

The average user, trying to achieve the advertised performance figures, will have a hard time doing so without risking damage to their CPU.

Intel needs to put its foot down and come up with binding default settings that motherboard manufacturers must implement in their BIOS defaults out of the box, accompanied by honest performance figures that can be expected with such a setup and a clearly specified cooling solution.

For example, it is completely pointless to advise PL1=PL2=253W if only top-of-the-line cooling solutions can sustain this without running into thermal throttling first. Even with a lower PL1, PL2=253W is still difficult to sustain during the default 56-second boost period.

So, in the end, power limits will have to be drastically reduced for most contemporary cases and cooling systems, and the advertised performance numbers will not be reached.
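To illustrate how PL1, PL2 and the 56-second window interact, here is a minimal sketch. It assumes the commonly described model in which the package may draw PL2 until an exponentially weighted running average of power (time constant tau) reaches PL1. The function name, the 10 W idle starting point, the 125 W alternative PL1 and the 300-second simulation length are illustrative assumptions; thermal throttling is not modelled at all.

```python
import math


def seconds_at_pl2(pl1_w: float, pl2_w: float, tau_s: float,
                   idle_w: float = 10.0, total_s: float = 300.0,
                   dt_s: float = 0.1) -> float:
    """Return how long PL2 can be drawn before the running average reaches PL1."""
    avg_w = idle_w                            # running average starts at idle power
    alpha = 1.0 - math.exp(-dt_s / tau_s)     # EWMA weight for one time step
    steps = int(round(total_s / dt_s))
    for step in range(steps):
        if avg_w >= pl1_w:
            return step * dt_s                # budget exhausted: drop to PL1
        avg_w += (pl2_w - avg_w) * alpha      # keep drawing PL2
    return steps * dt_s                       # never dropped within the simulation


for pl1 in (253.0, 125.0):
    t = seconds_at_pl2(pl1_w=pl1, pl2_w=253.0, tau_s=56.0)
    print(f"PL1={pl1:.0f} W, PL2=253 W -> about {t:.1f} s at PL2")
```

In this model, setting PL1 equal to PL2 means the average never reaches the limit, so the full 253 W is allowed for the whole run and the cooler becomes the only brake. With a lower PL1 the boost ends after some tens of seconds, and the exact figure depends on the assumed starting average.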

 

 

 

 

Eisbar
New Contributor I
6,553 Views

@NubMan55 


As others have said, including Intel in its roundabout way, the problem with Raptor Lake is that voltages and frequencies have been pushed beyond reasonable limits for Intel's one-hit-wonder 10nm process.

There's some important history here that gives a better picture.

 

In 2012, Intel's director of Process Architecture, Mark Bohr, made a bold claim at the Intel Developer Forum that Intel was going to have its 10nm node in production in 2015.

"The 14nm technology is in full development mode now and on track for full production readiness at the end of next year," Bohr said.

"While 10nm processors, code-named Skymont, are on tap for 2015, 7nm and 5nm architectures are also in the pipeline beyond that," Bohr said.

So two important things here. Bohr had stated 14nm would be at full production readiness by Q4 2013, and Intel 10nm would be ready by 2015.

In September of 2013 Intel revealed a notebook that used their new Broadwell CPU, built on 14nm and in this use case demonstrated on a low-power device. Intel's CEO at the time, Brian Krzanich, stated that 14nm Broadwell would be shipping by the end of 2013.

That did not happen.

It is now August 2014. Intel announces their Core M line of CPUs, which they say is their first product to be built on their 14nm process. The first Core M products become available at the end of 2014.

It is 2015, and 14nm is only just becoming widely available. Intel's promised 10nm from 2012 is nowhere to be found. The 14nm process brings forth Skylake in Q4 2015. This is Intel 6th Gen.

It is 2018, and Intel has still been unable to deliver 10nm. They launch Cannon Lake, which was due in 2015, and sweep it under the rug. Intel announces to their investors that they are shipping 10nm now in low volume and expect things to turn around in 2019.

Cannon Lake is a disaster of a product and it wouldn't surprise me if you never heard of it. The CPUs could only be made for low-power applications, the onboard GPUs were defective and had to be disabled, etc. Cannon Lake yielded exactly one CPU model, that's all. It was axed by Intel in Q4 2019. Cannon Lake gave us the Palm Cove cores.

 

Intel has been unable to innovate and get their 10nm process off the ground. 

Intel 10th Generation is announced in 2019 and arrives in 2020, this is Comet Lake, built on the Intel 14nm process and is the third revision, or reiteration, or broken record if you will. The 14nm 10900K is released in the Summer of 2020.

Jim Keller resigns from Intel, and it basically boils down to Intel refusing to outsource its fab work. This was a big deal.

It is now 2021. Intel 11th Gen Rocket Lake, which was supposed to be built on the 10nm process, has been backported to 14nm; Intel's 10nm is still unable to perform in anything but a low-power capacity. Backporting came with consequences and, again, Intel has swept things under the rug.

Intel 11th generation was catastrophically defective but people seem to forget that.

[Attached image: Screenshot 2024-08-28 083356.png]

 

Puget's data is unintentionally biased because of the workloads their customers will most likely be running: the problems of Raptor Lake will be experienced less there due to the amount of baked-in error handling in content creation software. I can elaborate more on this if need be.

Alder Lake has arrived. Success has now been achieved on the Intel 10nm node, which they've rebranded as Intel 7 because people have short attention spans and it's best to capitalize on that, I guess.

Alder Lake is the absolute limit of what the 10nm node can provide. The cap is around a 12900K; I say that because Intel silently axed the 12900KS only recently, when the media began to pick this story up.

A node that Bohr promised in 2012 to be ready in 2015 is seven years late.

So Intel had a choice. 

Tell the investors that a node that took seven years to deliver will be a one-hit wonder, or hope that you can kick the can down the road long enough for people to buy the idea that you're on track with yet another bold claim. To admit Alder Lake was the maximum that could be squeezed from such a resource sink that had gone on for so long would have leveled Intel. I have no doubt in my mind about that.

We now know that their node roadmap has, again, gone to pot. Lunar Lake is pure TSMC, and Arrow Lake is allegedly pure TSMC on mid-to-high power SKUs. Lip-Bu Tan has resigned from Intel over disagreements with the management of the company. Tan is one of the most respected people in semiconductors from what I know, and I think it's saying something to the world that he has decided Intel is not for him.

This is a red flag, just as Jim Keller's departure was.

Just circling around: the 12900KS SKU was axed, but not a large portion of Alder Lake SKUs. Intel 13th Gen SKUs were axed as well, but seemingly only the K SKUs. Actions again speak louder than words.

The 12900K had a max boost clock of 5.2 GHz, and its fused VID for this was on average roughly 1.33 V.

The 12900KS had a max boost clock of 5.5 GHz, and its fused VID was on average ~1.40-1.45 V.
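To put those two VID figures in perspective, a common first-order approximation is that dynamic power scales with voltage squared times frequency. The sketch below applies that rule of thumb to the numbers quoted above. It ignores leakage, temperature and chip-to-chip variation, and the function name is made up for illustration.

```python
def relative_dynamic_power(v_new: float, f_new: float,
                           v_old: float, f_old: float) -> float:
    """First-order estimate: dynamic power is proportional to V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Figures quoted in the post: 12900K at 5.2 GHz / ~1.33 V,
# 12900KS at 5.5 GHz / ~1.40-1.45 V.
low = relative_dynamic_power(1.40, 5.5, 1.33, 5.2)
high = relative_dynamic_power(1.45, 5.5, 1.33, 5.2)
print(f"Estimated dynamic power vs 12900K: {low:.2f}x to {high:.2f}x")
```

Under this approximation, a roughly 6% higher boost clock costs on the order of 17% to 26% more dynamic power per core at peak, which is the kind of diminishing return the post is pointing at.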

In Intel's pass-through Q&A document for handling customer questions, Thomas Hannaford says some important things.

 

Q: Why aren’t we seeing this issue on prior gen unlocked desktop processors?
A: Based on Intel’s analysis to date, Intel Core 12th Gen desktop processors are not at risk due to lower voltages and turbo frequencies compared to Intel Core 13th and 14th Gen desktop processors.

Q: Is Intel declaring elevated voltages to be root cause of the instability issue?
A: Incorrect voltages are one aspect of Vmin Shift Instability issues. Intel has delivered a microcode patch (0x129) as a partial mitigation addressing exposure to elevated voltages which is a key element of the Vmin Shift Instability issue.
To date, three mitigations have been identified related to this issue:

Mitigations are not solutions, they are meant to lessen the impact or severity by definition. That wording is chosen extremely carefully.

The problem is the processors themselves and what Intel sold them as being capable of. They fail in a very specific way and have been doing so since 2022. I've spoken about this before and how specific cores fail in very specific ways. For example, P-cores 5 and 6 will most likely fail at the highest rates, declining to 0% for cores 0 and 1, due to their location on the die. This can be demonstrated.

In the end I don't think a recall will solve anything at this point. It should have happened when Intel first became aware of these failures back in 2022, but it didn't. I think Intel's actions have landed it where it is today, and my sympathy goes out to the employees who had no say in what took place but are taking the brunt of it: either being let go from their jobs to cut costs due to management's decisions, or the customer service people who are probably being worked to death right now and dealing with some dorks, I imagine.

My hope is that Intel does right by their customers, each and every one of them. The customers are the unwitting shoulders that took the burden to keep investors happy for the past two years. 

Pat Gelsinger likes to keep talking about node leadership as a goal, and it's good to have goals. Under these circumstances, though, I don't think having a more advanced node than anyone else is something I would ever attach the word leader to. I think that branching out to seek a leadership position in ethics and morality would make a fun little side project for Intel management.