Our OS: Red Hat Enterprise Linux 6.4 (Santiago), on 32-core servers with HT enabled.
We are a financial trading software firm, and during performance optimization we observed
L1D Replacement % = 1.0
L2D Replacement % = 1.0
LLC Replacement % = 1.0
in our VTune output.
Our application is completely single-threaded, and we make sure that the main application thread is always pinned to a core. Is it possible that we see the above numbers because HT is enabled, since the logical processors on the same core share resources such as L1D, DTLB, and ITLB? Has anyone seen this kind of behavior before?
For trading applications where latency matters, is switching HT off advisable?
On the question of how Hyper-Threading affects the partitioning of core resources, you might check the Intel Architecture manuals for information on the CPU family of interest to you.
Several of the important resources are shared dynamically between the pair of logical processors, so a single thread can use all of them, provided that no other thread competes for them on that core. When I last studied the matter, the ITLB was an exception: a thread could use only half the entries for that core, even with HT disabled. The architects reportedly believed that the applications that run best with HT disabled have higher instruction locality than the ones HT was designed for, and so would not need the additional ITLB entries.
Documents on such matters for new CPUs tend to remain under non-disclosure for a while after the CPU is released to market.
It's probably possible to run specific tests to see how much of the resource a thread can use.
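A first step for such tests is to find out which logical processor shares a physical core with the one your thread is pinned to, so that you can keep that sibling idle (or load it deliberately) and compare the VTune numbers. Below is a minimal sketch, assuming Python 3.3 or later on Linux and the standard sysfs topology file `thread_siblings_list`; the helper names `parse_cpu_list` and `ht_siblings` are my own, not part of any tool.

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '3,19' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def ht_siblings(cpu):
    """Return the logical CPUs that share a physical core with `cpu` (including itself)."""
    path = "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list" % cpu
    with open(path) as f:
        return parse_cpu_list(f.read())


if __name__ == "__main__":
    # CPUs this process is currently allowed to run on (its affinity mask).
    for cpu in sorted(os.sched_getaffinity(0)):
        others = [c for c in ht_siblings(cpu) if c != cpu]
        print("cpu %d shares its core with: %s" % (cpu, others or "no other online CPU"))
```

If you run it under the same affinity as the trading thread (for example via `taskset`), it lists the sibling logical processors whose activity could be evicting that thread's cache lines. Keeping those siblings free of other work is a cheaper experiment than disabling HT in the BIOS.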