While investigating a complicated software problem on a production server, we planned to use VTune as a sort of debugging tool. First we ran some tests on several systems, all of which are identical.
Test system A: the first time we started the Nehalem General Exploration analysis (from the command line, attaching it to the running process), the Linux kernel crashed. It was impossible to ssh to the machine, and on the server console we could only see 0xfff... dumps. Rebooting was the only option. Assuming it was a parameter problem, we started it again, and the second time it worked fine.
Test system B: worked fine.
Test system C: worked fine.
Production: crashed, with the same kernel crash as described above.
All the systems are identical (or they should be...): Red Hat 4.1.2-46, 2x Xeon 5650 hexa-core with HT enabled, 12 GB memory. The process is a database application running ~150 threads, but only using 100-300% CPU. The load on the test systems is lower than in production.
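Since the machines are only assumed to be identical, one cheap check is to dump a few kernel and CPU details on each host and compare the results. Below is a minimal sketch in Python, assuming a Linux host that exposes /proc/cpuinfo. The function names (`read_cpu_models`, `fingerprint`, `diff_fingerprints`) are hypothetical, and the fields collected are only a starting point: they say nothing about BIOS settings, loaded kernel modules, or the installed VTune driver.

```python
import json
import platform


def read_cpu_models(path="/proc/cpuinfo"):
    """Return the sorted distinct 'model name' values, or [] if unreadable."""
    models = set()
    try:
        with open(path) as f:
            for line in f:
                key, sep, value = line.partition(":")
                if sep and key.strip() == "model name":
                    models.add(value.strip())
    except OSError:
        # Not Linux, or /proc is unavailable: report no CPU models.
        pass
    return sorted(models)


def fingerprint():
    """Collect a few host details that should match on identical systems."""
    return {
        "kernel": platform.release(),    # running kernel release string
        "machine": platform.machine(),   # hardware architecture
        "cpu_models": read_cpu_models(),
    }


def diff_fingerprints(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Run on each host, save the output, then compare the saved files.
    print(json.dumps(fingerprint(), indent=2))
```

Running the script on each host and feeding two saved results through `diff_fingerprints` would show at a glance whether, for example, system A or production is running a different kernel release than B and C.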
We did not succeed in reproducing the error on our test systems, and trial and error on a production machine is unfortunately not possible.