
Reliability at Scale: Latest Intel® Data Center Diagnostic Tool Updates


Author: Zane Ball

I recently wrote about reliability at scale on our 4th Generation Intel® Xeon® processors, outlining new ways to manage the reliability of a fleet for months and even years after installation through sophisticated remote debug tools. The Intel® Data Center Diagnostic Tool is one of the most effective tools in our portfolio.

Initially released in July 2021, various versions of the tool have since been deployed on millions of servers around the world. Designed to be part of a regular system maintenance program, it scans for wear-and-tear events that occur over the life of a processor and can indicate whether a processor has deteriorated. It can also be helpful during system setup, in burn-in environments, and in out-for-repair flows, where its high CPU usage can reveal issues such as intermittent contact or insufficient cooling. Our initial version of the tool was designed to work on Linux, but as customers began to experience more of its usefulness, a top feature request was a Microsoft Windows version.

Today I am excited to share that the Intel Data Center Diagnostic Tool now runs on both Linux and Windows and supports Intel Xeon branded server and workstation processors launched since 2016. With this broader processor and operating system support, we aim to expand usability across a wide variety of data center systems. This enables administrators and managers of private and hybrid-cloud data centers to deploy a tool specifically designed to help maintain their server fleet health and detect issues as processors age.

With the Intel Data Center Diagnostic Tool, customers can now run many of the same tests Intel uses during manufacturing in their own environment. This enhances fleet maintenance and improves the ability to remotely diagnose and resolve issues at the statistical margins of a large fleet.


About the author

Zane Ball, Corporate VP & GM Datacenter Engineering and Architecture

Dr. Zane A. Ball is a Corporate Vice President and General Manager of the Data Center Platform Engineering & Architecture (DPEA) group. DPEA owns end-to-end engineering for Intel’s data center business and is responsible for designing and validating the latest data center platforms and enabling Intel’s customers to ramp and deploy platforms at scale.

Prior to his data center role, Ball was Co-GM of Intel’s foundry effort as a VP in the Technology and Manufacturing group. Ball has also served as a VP of the Client Computing Group including roles as GM of the desktop client business and as GM of global customer engineering.

Ball has a bachelor’s degree, master’s degree, and Ph.D. in electrical engineering, all earned from Rice University. He holds six patents in high-speed electrical design. You can connect with him on LinkedIn and Twitter.