Intel(R) PCI Express Root Port #3 - 7ABA

Eershaa
Beginner
3,348 Views

I recently started getting WHEA errors. The reported device has Vendor ID 8086 and Device ID 7ABA, which is PCI Express Root Port #3.

I was told that either my SSD's PCIe slot or the SSD itself could be failing, so I recently moved my NVMe drive to a different slot. Now I get the same WHEA error every time I turn on my PC. About two weeks ago I had 110 WHEA errors; today the count is 152, all with the same code.

I updated my BIOS, but the errors keep piling up. Any help is appreciated.


8 Replies
PC1997
New Contributor I
3,297 Views
Yup. I have two PCs, one with an i7-14700F and the other with an i9-14900KS. Both experience this exact same error. The difference is that the PC with the 14700F only generates this error once a day, while my 14900KS gets tens of thousands in under 5 minutes and is severely degraded. My previous 13900KS also developed this problem after a while. (The 14900KS was its replacement. I bought both, and in my case the problem was, and is, caused by my overclocking.)

So these errors are caused by your CPU. With WHEA 17 errors, it is important to point out that they can also be caused by something as simple as a damaged or improperly connected cable, or incorrect BIOS settings. You may need to update your BIOS. I can tell you that, in my experience, nothing makes them go away until you replace the processor.

Do you have other problems such as games or applications crashing frequently? If you cannot locate the source of the problem, I would consider opening an RMA with Intel.
Eershaa
Beginner
3,268 Views

It feels like you actually understand my issue.

There has been this issue and another one, and I'm going to post links for both here. They explain exactly how everything started and how I arrived at this new WHEA issue.

Link 1: https://forums.tomshardware.com/threads/this-noise-is-coming-from-somewhere-in-my-pc-and-i-tried-all-this-any-help-is-appreciated.3874255/

Link 2: https://forums.tomshardware.com/threads/intel-r-pci-express-root-port-3-7aba-issue-with-whea.3874346/

Link 3: This is from someone else who faced the same problem and was looking for help: https://forums.tomshardware.com/threads/help-with-whea-logger-error-event-id-17.3771620/

Link 1 explains in detail, including in the comments, what happened with the GPU, how I'm trying to find a solution, and my PC case airflow.

I've never had frequent crashes and have never had a BSOD. My BIOS is up to date. I just turned on my PC and checked with SPECIFY, and the count is now 156 WHEA errors. So I'm guessing the errors are less frequent than before, since the count now only increases each time I turn the PC on. Still, I'm not sure how much worse this might get in the future.
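
To put the numbers from this thread side by side, here is a minimal Python sketch of the arithmetic. It assumes "two weeks" means exactly 14 days, and the function names are made up for illustration; only the three counts (110, 152, 156) come from the posts.

```python
def deltas(counts):
    """Return the increase between each pair of consecutive readings."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def average_per_day(start_count, end_count, days):
    """Average number of new errors per day over an interval."""
    return (end_count - start_count) / days


# Cumulative WHEA error counts quoted in the thread.
readings = [110, 152, 156]

print(deltas(readings))               # [42, 4]
# Assumption: "two weeks" is treated as exactly 14 days.
print(average_per_day(110, 152, 14))  # 3.0
```

An average of about 3 new errors per day is low, but the trend across several readings says more than any single count.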

PC1997
New Contributor I
3,244 Views
Your first problem, the noise you hear while gaming or running BurnMark... I mean FurMark (I had a bad experience with this program back in the day, lol), is actually caused by coil whine. Some GPUs do it more than others, but Gigabyte is known for coil whine on its 40-series GPUs.

https://www.intel.com/content/www/us/en/support/articles/000032539/processors/intel-core-processors.html#:~:text=The%20CPU%20itself%20cannot%20generate,or%20in%20the%20power%20supply.

Also, the WHEA 17 errors you're getting can definitely be caused by drivers that are either incorrectly installed or corrupted (for example, by a degraded CPU installing them), in which case reinstalling the drivers should fix the error. But I'm willing to bet that in your case the error is referencing the PCI Express connection to your GPU, which runs through your CPU, not the chipset on the motherboard.

You can test this by going into your BIOS, disabling the 'PCI Express Native Power Management and ASPM' functions, and re-testing to see if the errors continue. The other option is to set the PCIe generation mode to 3.0 instead of the default 4.0. Honestly, nobody wants to run their GPU with reduced performance (although the loss is negligible), but if you just want those errors to stop, it's a solution.

The thing is, by doing this you're not solving anything; you're just reducing the likelihood of the problem occurring, and the problem remains your CPU. WHEA 2, 17 and 19 errors (WHEA 19 is a dead giveaway that you literally have bad CPU cores) are symptoms of 13th and 14th gen CPUs degrading. You may or may not have these errors, to varying degrees, but they are symptoms of the root cause. WHEA errors can and will be caused by overclocking as well. The difference: with a normal CPU, you just have an unstable overclock; with a degraded CPU, these errors can occur even if you've never overclocked the processor or done anything wrong, as with 13th and 14th gen CPUs. The good news is that with the latest microcode update and a brand new CPU run under the Intel default power profile, you should not have any more issues.
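
For a sense of what dropping from Gen 4 to Gen 3 means on paper, here is a rough Python sketch of the theoretical one-direction link bandwidth. It uses the standard per-lane rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and 128b/130b encoding; the function and constant names are made up for illustration, and real-world throughput is lower because of protocol overhead.

```python
# Raw per-lane transfer rates in GT/s for the two generations discussed.
GT_PER_S = {3: 8.0, 4: 16.0}

# PCIe 3.0 and 4.0 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(gen, lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    bits_per_second = GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8  # 8 bits per byte


for gen in (3, 4):
    print(f"Gen {gen} x16: {link_bandwidth_gb_s(gen, 16):.2f} GB/s")
```

On paper an x16 link halves from roughly 31.5 GB/s to roughly 15.75 GB/s. A GPU rarely saturates the link in games, which fits the "negligible" performance loss mentioned above.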


PS: it sounds as if your CPU is not that degraded, if at all. However, you should try this one program:

https://benchmate.org/

It is a collection of different benchmarking tools that you can use to test the stability of your processor. If you run only one program, make it "Cinebench R15 Extreme". Do 5 back-to-back runs without pausing (if your CPU is even stable enough to survive one run); if it passes, you can save time by skipping the other tools.

Intel's XTU (Extreme Tuning Utility) is also good and is usually what they will ask you to run if you open an RMA warranty claim. But I have seen CPUs pass that program and still fail with other critical applications. That's the thing about Raptor Lake degradation: you need to test the CPU thoroughly, because you may have a problem that is only mild, or very obvious BSOD-all-the-time issues.
Eershaa
Beginner
3,224 Views

{disabling 'PCI Express Native Power Management and ASPM' functions and re-test to see if the errors continue } - I couldn't find the PCIe power management option, but I did find Native ASPM, which was already set to DISABLED.

I updated the BIOS solely because the new version was said to include a microcode update that might help with the WHEA errors. I have never overclocked any of my PC components, whether through XMP, the CPU, or the GPU.

I'm pretty sure my CPU is not that degraded. There's never been any issue in the past 2 years, never a blue screen or anything.

In my opinion, the WHEA issue and the GPU issue probably started around the same time. Regarding the GPU, I feel it's not coil whine but rather a bad bearing in one of the GPU fans.

I could run the benchmark if you think it's necessary, but about what you said, {But I'm willing to bet in your case, your error is referencing the PCIe Express connection to your GPU which runs through your CPU, not the chipset on the mobo.}: doesn't this confirm that the WHEA and GPU issues are related? In the back of my mind, I too have always felt that this whole thing started right around the same time last year.

PC1997
New Contributor I
3,168 Views
If your CPU is not causing you any major issues and you're happy with the performance, you don't have to do anything. If you have any doubts, run some of those tests; it should pass with no problems.

What I'm trying to say is that, because you have a 13900K, rather than chasing gremlins and going down rabbit holes trying to fix phantom hardware issues, you need to test your CPU, because it is more likely that the problems (minor in your case) are coming from your processor.

WHEA (Windows Hardware Error Architecture) is not very good at telling you why there is a problem; it just tells you that there is a problem. In my case it kept referencing my GPU, and two things made this go away:

1. Adding more voltage to my CPU ("vmin shift") made the errors disappear with a few tens of millivolts more.

2. Replacing just the CPU, with nothing else changed, also made the errors disappear.

The second option is the logical one. The first, although it works for the time being (who knows for how long), is unrealistic when the CPU should just work. Again, for me, that's how I found the connection between the CPU and the GPU errors.

Your CPU supports 20 PCI Express lanes: 16 are reserved for your GPU(s), as 1x16 or 2x8, and 4 lanes are for NVMe storage. Additional storage lanes come from the chipset on your motherboard.

Because your CPU communicates directly with some of your devices, without a middleman chipset on the motherboard, this is where these errors occur: if your CPU is degraded, these are some of the symptoms it may show, along with a whole host of others.

Again, if you feel your CPU is getting the job done, then you don't need to do anything. I'm just trying to save you, and anybody else reading this, the time of replacing hardware components: if you have a 13th or 14th gen CPU, you might want to look there first.
Eershaa
Beginner
3,144 Views

OK, so I ran the BenchMate software. I downloaded and installed it, opened it, and selected Cinebench R15 Extreme. There were 2 options in it: CPU and CPU (single core).

I first ran 2 back-to-back tests of just CPU, which gave me a rendering score of 1079 cb. The temperature went all the way up to 100°C.

Then I ran CPU (single core), which took some time and gave me a score of 77 cb, with an MP ratio of 13.75x.

Then I ran the plain CPU test again, 5 times back to back, which gave me a score of 1065 cb at the end. The temperature went high again.

There was no noise or any other issue during the benchmarking.

So is this good or bad? How should I proceed now?

PC1997
New Contributor I
3,120 Views
Don't worry about the score; we are testing for stability, and yours passed. You can run a few of the other benchmarks to mess around if you like or, to seal the deal, run Intel's Extreme Tuning Utility. If you have any Unreal Engine 5 games, update your graphics driver so that the shaders recompile when you next play those games; you should be able to get through that with no problems.

As for those WHEA errors: with your CPU being far less likely to be the culprit, you have to start looking at drivers, firmware updates, and, with a very low chance, possible hardware issues with the device ID listed in Windows Event Viewer (it's in the logs; check Device Manager to find out which device it refers to). Personally, unless I was having an actual problem with the hardware, I would not replace it just because I was seeing WHEA errors. If you happen to have a spare GPU, for example, you can always swap it in and see if the errors go away. I doubt your Gigabyte 4080 GPU has the onboard logic to report one of your fans going bad, with that causing the errors. Maybe unplugging the fan cable would generate that error, but even then the card is not likely to report anything to Windows about the fans. If it did, it would make sense that you're seeing those errors. Again, I don't think such error-reporting capabilities exist on the card.

The important thing here is that, unless you're having a real problem, you shouldn't obsess over error messages in Windows Event Viewer. Probably 98% of the time it's much ado about nothing, as it will report anything and everything. You could be in the middle of updating your drivers, and if you check that exact time it'll show some kind of error, but there really isn't anything wrong at all.

In my case it was definitely caused by the CPU. For one, I was getting literally tens of thousands of those errors in mere minutes, and my system was unstable until more Vcore voltage was added, because I had degraded the processor by overclocking it excessively. I'm not saying that you overclocked your CPU.

My point is that 13th/14th gen CPUs that have degraded will definitely start to throw a bunch of WHEA errors (along with app crashes, BSODs, etc.). The thing is, these are correctable errors; if just one of those errors goes uncorrected, you will know about it in the form of a Blue Screen of Death.
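
To match the IDs from the error against a device, Device Manager shows hardware IDs for PCI devices in the form `PCI\VEN_xxxx&DEV_xxxx...`. Below is a minimal Python sketch that pulls the vendor and device IDs out of such a string. The function names and the sample strings are illustrative only; the values 8086 and 7ABA are the ones quoted in the first post.

```python
import re

# Matches the vendor and device fields of a Windows PCI hardware ID string.
HWID_PATTERN = re.compile(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})")


def parse_pci_hardware_id(hwid):
    """Return (vendor_id, device_id) in upper case, or None if not found."""
    match = HWID_PATTERN.search(hwid)
    if match is None:
        return None
    return match.group(1).upper(), match.group(2).upper()


def is_device(hwid, vendor_id, device_id):
    """True if the hardware ID string refers to the given vendor/device pair."""
    return parse_pci_hardware_id(hwid) == (vendor_id.upper(), device_id.upper())


# Illustrative string built from the IDs in the original post.
sample = r"PCI\VEN_8086&DEV_7ABA"
print(parse_pci_hardware_id(sample))      # ('8086', '7ABA')
print(is_device(sample, "8086", "7aba"))  # True
```

Vendor ID 8086 is Intel. The device entry whose hardware ID carries DEV_7ABA is the root port, and whatever is attached beneath that port is the hardware the error actually involves.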
Eershaa
Beginner
3,090 Views

OK.

So I don't want anything to do with overclocking via the Intel Extreme Tuning Utility right now. Since I changed the slot of the SSD, the error frequency has dropped to an increase of just 1 each time I turn on my PC, compared with at least 5-7 during any gameplay session before. I've never actually used Intel XTU.

As far as I know, all my drivers and firmware are up to date, as I update them as soon as anything new is released. I searched this link and couldn't find anything that mentions 7ABA.

I guess there are always going to be some errors in Windows, and I should pay them no mind.

The whole GPU fan disaster made me question everything about the current condition of the PC. After all, it isn't that often that one buys and customizes a new PC.

But seriously, THANK YOU for all the effort you put in.
