XTU: Applying *any* change causes cold boot instabilities

TimurB
New Contributor I

Hello.

On my undervolted Z790 + 13900K system I just reproduced over two dozen instant cold-boot crashes (no BSOD, sometimes with the BIOS settings cleared) when starting a Cinebench 23 benchmark loop at Realtime priority. This seems to be caused by XTU.

The instability is not (!) caused by my (BIOS) CPU settings, but by Intel XTU changing anything about them, even when slower values are used or when a value is just changed and then set back. I compared CPU values in HWiNFO before and after hitting Apply in XTU after changing a value back and forth, but could not detect any differences. So whatever XTU does is rather hidden.

Based on the severity of the instabilities I assume the voltages are being messed up, despite XTU claiming not to touch them when a VF curve is used in the BIOS. It does not change the VF points, though, because these are still listed as unchanged in HWiNFO.

 
Doing a clean installation of an older XTU version did not help. I do not see voltage differences in HWiNFO between before and after applying XTU changes. The cold reboot only happens when running CB23 at Realtime priority, not at Normal or High.
 
(PS: I only use XTU to work around a BIOS bug where, after an extended shutdown, AVX ratios are set before Core ratios, so that two boot-ups are needed every morning before the AVX ratios are set properly relative to non-stock Core ratios.)

TimurB
New Contributor I

It turns out that my supposed workaround of stopping/restarting the service isn't viable. Once the XTU service has been stopped or restarted, it no longer applies its settings on the next reboot (that is, it resets everything as if an error had occurred). So a real fix would be appreciated.

Mike_Intel
Moderator

Hello TimurB,


Thank you for patiently waiting for our update.


Based on our recent check of the issue, it looks like, because of the workload and application type, Cinebench running in real-time mode takes over almost all CPU time and brings almost everything else to a stop.

The XTU service still needs some CPU time to apply changes and monitor the situation.

So, as previously mentioned, we don't recommend running Cinebench or other benchmarks in real-time mode, as many articles also advise.

 

Thank you so much for informing us about this situation. Please take note of this when using real-time priority for apps.


If you have questions, please let us know. Thank you.


Best regards,

Michael L.

Intel Customer Support Technician


TimurB
New Contributor I

Ok, I see that Intel has no interest in supporting their (overclocking) application properly and that I am being played for a fool.

Not only did I repeatedly mention that *restarting the XTU service fixes the crashes* despite running the *same workload*, I also told you the crashes happen *without using CB23 in realtime priority*! The latter is just an easy and 100% working way to reproduce the instabilities within 3 seconds instead of having to wait for them to happen randomly.

I also know that other users experience the same hard crashes with XTU and that the software seems to be rather unpopular in the overclocking community due to its lack of reliability.

 

Meanwhile I switched to ThrottleStop to set my AVX ratios properly, something I only have to do in software because the (Gigabyte) BIOS does not set them properly to begin with. That is another point where Intel showed too little interest in what mainboard manufacturers were doing, until the 13th/14th gen instabilities reached a level that could not be denied anymore.

 

Michael, thanks for looking into this for some time, but unfortunately your back-office left you hanging with no solution.

 

Best regards and try harder Intel!

Mike_Intel
Moderator

Hello TimurB,


Thank you for the update.


We understand your situation, which is why we would like to investigate this issue further.

We hope that you can still share the following details that our Engineering team would like to know.


  1. Please share a step-by-step guide or a recorded reproduction (video) of the issue.
  2. Please add a screenshot of the Task Manager CPU tab, so we can see the CPU status before Cinebench starts.


If you have questions, please let us know. Thank you.


Best regards,

Michael L.

Intel Customer Support Technician


TimurB
New Contributor I

1. 

a) Change/increase the AVX offset in XTU (or any other parameter from what I can tell).

b) Click Apply in XTU.

c) Start Cinebench 23.

d) Set the Cinebench 23 process priority to "Realtime" (this is only for easy reproduction of the error!).

e) Start the Cinebench 23 "Multi-core" benchmark.

-> Hard "cold boot" crash, sometimes with "BIOS error" message (BIOS reset to defaults).

 

Repeat the same steps, but this time *restart* (or stop) the XtuOcDriverService before step e).

-> No crash.


2. 

(Attached screenshot: TimurB_0-1720179784621.png)

 

It's noteworthy that this also happens when I use either full stock BIOS settings or stock + Intel "stability specs" settings (AC LL = DC LL, 125 W PL1, 253 W PL2, 307 A ICCmax, TVB 70°C).
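
The service restart from the steps above could be scripted roughly as follows. This is only a sketch: the service name `XtuOcDriverService` is the one given in this post, while the helper names `restart_commands` and `restart_service` are hypothetical. It relies on the standard Windows `net stop` / `net start` commands, which wait for the service to change state, and it has to be run from an elevated prompt. As noted earlier in the thread, XTU may not reapply its settings on the next boot after the service has been restarted.

```python
import subprocess

# Service name as given in the post; the helper names below are illustrative.
SERVICE_NAME = "XtuOcDriverService"


def restart_commands(service: str) -> list[list[str]]:
    """Build the `net` invocations that stop and then start a Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service: str) -> None:
    """Run the stop/start commands in order (needs administrator rights)."""
    for cmd in restart_commands(service):
        # check=False: `net stop` exits non-zero if the service is not running,
        # which should not prevent the subsequent start attempt.
        result = subprocess.run(cmd, check=False)
        print(" ".join(cmd), "-> exit code", result.returncode)
```

Calling `restart_service(SERVICE_NAME)` between steps d) and e) corresponds to the "no crash" variant of the reproduction.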

 

Have a nice weekend!

Mike_Intel
Moderator

Hello TimurB,


Thank you for the information provided.  


We will do further research on this matter and post the response on this thread once it is available.

Have a fantastic day, and thank you very much for your patience and understanding!



Best regards,

Michael L.

Intel Customer Support Technician


VonM_Intel
Moderator

Hello TimurB,

The crash is most likely caused by the Watchdog Timer. When present (active), the timer starts after any XTU tuning change and, if the system hangs, triggers a safety reset (the crash you are seeing). This is especially likely when Cinebench is set to "Realtime" priority. Initially, the Watchdog status shows Failed as False. Once Cinebench runs at “Realtime” priority, communication with the Watchdog timer fails and the system resets. The Watchdog timer stays armed and requires a periodic "ping" (every 10-15 seconds) from the XTU service to confirm that the system has not hung or become unstable. The purpose is to help the user if the system ever freezes: it then resets itself.
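
The mechanism described above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not XTU's or Windows' actual implementation: the `Watchdog` class, the `simulate` function, and the 20-second timeout are hypothetical, and only the 10-second ping interval is taken from the "every 10-15 seconds" figure above. A service that gets CPU time keeps pinging and nothing happens; a service starved by a Realtime-priority workload misses its pings and the timer expires.

```python
from dataclasses import dataclass


@dataclass
class Watchdog:
    """Toy watchdog: once armed, it expires unless pinged within timeout_s."""
    timeout_s: float            # hypothetical value, not documented in this thread
    armed: bool = False
    last_ping_s: float = 0.0

    def arm(self, now_s: float) -> None:
        self.armed = True
        self.last_ping_s = now_s

    def ping(self, now_s: float) -> None:
        self.last_ping_s = now_s

    def expired(self, now_s: float) -> bool:
        return self.armed and (now_s - self.last_ping_s) > self.timeout_s


def simulate(service_gets_cpu: bool, duration_s: int = 60,
             ping_interval_s: int = 10, timeout_s: float = 20.0) -> str:
    """Step a simulated clock one second at a time and report the outcome."""
    wd = Watchdog(timeout_s=timeout_s)
    wd.arm(now_s=0)  # armed when a tuning change is applied
    for now in range(1, duration_s + 1):
        # The service can only ping if the scheduler gives it CPU time.
        if service_gets_cpu and now % ping_interval_s == 0:
            wd.ping(now)
        if wd.expired(now):
            return f"reset at t={now}s"
    return "no reset"


print(simulate(service_gets_cpu=True))   # service keeps pinging
print(simulate(service_gets_cpu=False))  # service starved, pings are missed
```

In this model the reset depends only on whether the pings arrive, not on whether the applied settings are actually unstable.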

 

One way to "disarm" the Watchdog timer is to stop the XTU service gracefully, which is what you are already doing as a workaround. The system behaves as designed, with safety in mind in case of an overload or freeze. The Watchdog status is visible in XTU under "System Information", at the bottom of the parameters list. Furthermore, to better understand the issue, I'd like to ask you to share the logs from a crash that occurs with Cinebench at the default OS priority rather than at Realtime.

Kindly compress the entire "Log" folder and attach it to this thread so that we can continue the investigation.

 

Moreover, we'd like to ask if we can now close this thread.

 

Best regards,

Von M.

Intel Customer Support Technician


TimurB
New Contributor I

Hello Von M.,

 

so the Watchdog timer essentially cannot be disabled, and there is no hint to the user that it may intentionally cause hard crashes (cold boot plus BIOS error/reset)? Let's just say it seems less than ideal to neither inform users about it nor offer an option to disable the function. It took 6 weeks to even mention this vital piece of information, but thanks for that, it's still appreciated!

 

Stopping/restarting the XTU service "gracefully" is not really a workaround, because XTU then refuses to load the last settings after the next reboot. So for my personal use case I will now use ThrottleStop to set AVX offsets properly. Maybe Intel can give Gigabyte a hint that setting AVX ratios *before* core ratios in the BIOS boot order is a bad idea!?

 

Reproducing the same kind of XTU crashes during normal operation is too sporadic and random to spend more time on. So, yes, you can close this thread now.

 

Thanks and best regards

Timur

Mike_Intel
Moderator

Hello TimurB,


Thank you for the update.


The Watchdog service is a Windows safety solution, and XTU uses it as a failsafe in case anything goes wrong during tuning. More about the Watchdog service: https://learn.microsoft.com/en-us/mdep/architecture/core-os/watchdog-service


If a user would like to disable it, they do so at their own risk.

After a system crash, XTU will always load a "default" setting, not a saved profile: https://www.intel.com/content/www/us/en/support/articles/000095316/processors.html


If you have questions, please let us know. Thank you.


Best regards,

Michael L.

Intel Customer Support Technician


TimurB
New Contributor I

What exactly does the Watchdog protect against, though? The user not finding the power/reset button? Is it really safer to bombard the BIOS back into a clear-CMOS state than to allow a halfway graceful Blue Screen with a saved minidump? I am not convinced, but at least I now have the information to make educated decisions.

Mike_Intel
Moderator

Hello TimurB,


Thank you for the update.


We understand your concern. We can only point you to the Microsoft documentation, since Microsoft owns the service and developers only use it as it is designed to work.

 

Sometimes it is not obvious that a system has crashed.

This solution can also be very useful for headless, unattended, or remotely operated machines, when a process blocks the system so badly that only such an action can recover it.


If you have questions, please let us know. Thank you.


Best regards,

Michael L.

Intel Customer Support Technician


Mike_Intel
Moderator

Hello TimurB,

 

I hope this message finds you well.


Just posting a follow-up to check whether you still need any clarification.

 

If you have questions, please let us know. Thank you.

 

Best regards,

Michael L.

Intel Customer Support Technician


TimurB
New Contributor I

Hi Michael,

 

I still wish this were optional, as it even intercepts Blue Screens and resets the BIOS (and may even corrupt it). And since it sporadically happens under high load without Realtime priority, it turns into more of a stability issue than a stability feature.

For me personally this is solved now, though, by not using XTU for changing AVX ratios anymore. And when I happen to do any temporary (not bootable) changes via XTU I make sure to restart the service.

 

Thanks again and have a nice weekend!

Mike_Intel
Moderator

Hello TimurB,

 

I hope this message finds you well.


The watchdog timer driver does not reset BIOS settings. It is designed to reset the system in case of a failure so that the system can attempt to recover from an error state. If BIOS settings are being reset to their defaults without user intervention, this is likely due to another issue, and I recommend checking with the system manufacturer. The watchdog timer is designed to help users recover from system freezes by ensuring that certain settings do not carry over after a reboot. This precaution prevents unstable configurations from being reapplied at startup, which would cause the system to reset repeatedly and could create an endless loop of reboots.
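
The loop-prevention idea described above can be sketched in a few lines. This is purely illustrative and not XTU's actual logic: the function `settings_to_apply`, the `avx_offset` key, and the example values are hypothetical.

```python
def settings_to_apply(previous_session_crashed: bool,
                      saved_profile: dict[str, int],
                      defaults: dict[str, int]) -> dict[str, int]:
    """Choose which tuning values to apply at startup."""
    # After a crash or watchdog reset, fall back to defaults so that a
    # possibly unstable profile is not reapplied on every boot.
    if previous_session_crashed:
        return dict(defaults)
    return dict(saved_profile)


# Hypothetical example values, not taken from the thread.
DEFAULTS = {"avx_offset": 0}
SAVED = {"avx_offset": 3}

print(settings_to_apply(True, SAVED, DEFAULTS))   # defaults after a crash
print(settings_to_apply(False, SAVED, DEFAULTS))  # saved profile otherwise
```

With this rule, a single bad profile can cause at most one reset, because the next boot no longer applies it.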

 

If you have questions, please let us know. Thank you.

 

Best regards,

Michael L.

Intel Customer Support Technician


RandyT_Intel
Moderator

Hello TimurB,


I wanted to follow up to see if you had a chance to look over the information posted. Your feedback at your earliest convenience would be greatly appreciated so we can move forward with resolving this matter.



Randy T.

Intel Customer Support Technician


TimurB
New Contributor I

Hello Randy and Michael,

 

thanks for the follow-up. I understand the intention behind the feature and it's certainly useful. But it should be optional and more clearly communicated. No one knows about the feature (it took 2 weeks to clarify it here), and not everyone is unable to find the Clear CMOS function of their BIOS when needed.

 

Now I have to restart the XTU service every time I use XTU for changes, just to make sure that the Watchdog doesn't unintentionally crash-reset my system when I test something. It's an additional source of instability now and I usually even prefer a Blue Screen over its behavior.

 

Fortunately my own undervolting stability tests are finished now, so I won't really need any of this anymore. With power, current and TVB limits in place, my CPU is now stable even in XTU's very transient-heavy and nicely demanding Linpack-based stress test.

 

Best regards,

Timur

NormanS_Intel
Moderator

Hello TimurB,

 

Thank you for your response and for your understanding. I will share your feedback with our team. If you have any further questions or need additional assistance, please feel free to reach out. Furthermore, I will provide an update in this thread as soon as it becomes available.

 

Best regards,

Norman S.

Intel Customer Support Engineer

 

NormanS_Intel
Moderator

Hello TimurB,


After discussing this issue internally, we would like to inform you that Intel is using the Watchdog service (owned by Microsoft) as intended to prevent the OS from remaining in a crashed or frozen state. It took some time to gather detailed information about XTU, as it is not something we handle on a daily basis.


If XTU is not suitable for your testing needs, it can be uninstalled, and you can use direct BIOS settings without any issues. However, please note that the Watchdog service will continue to monitor the system unless it is disabled during the test period.


If you have no further questions, I will proceed to close this inquiry. I will be awaiting your response, as this thread will no longer be monitored once the inquiry is closed.


Best regards,

Norman S.

Intel Customer Support Engineer


TimurB
New Contributor I

Thanks for the information. You can close the thread then.

NormanS_Intel
Moderator

Hello TimurB,


I have not heard back from you so I will close this inquiry now. If you need further assistance, please submit a new question as this thread will no longer be monitored.


Best regards,

Norman S.

Intel Customer Support Engineer

