My understanding of Hyper-Threading from the Nehalem microarchitecture onwards is that two threads are accommodated per CPU core: the 'dead' CPU cycles that occur during a stall are used to run the second thread, and each core has extra registers to store the second thread's context. Assuming my understanding is correct, I'm guessing that applications with short transactions that defeat the CPU cache by randomly accessing a working set larger than the L3 will benefit from this the most.
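To make the access pattern I have in mind concrete, here is a minimal sketch of a pointer chase over a randomly linked array. It is only an illustration under my own assumptions: the names `build_chain`, `chase` and `seconds_per_step` are made up for this post, and the slot count needed to exceed the L3 depends on the machine (each slot in the array is 8 bytes). Every load depends on the previous one, so once the array is larger than the L3 most steps should miss the cache, which is the kind of stall I am asking about. In Python the interpreter overhead will hide much of the effect, so a real measurement would want the same loop in C plus hardware counters.

```python
import random
import time
from array import array


def build_chain(n_slots, seed=0):
    """Link n_slots array entries into one random cycle (Sattolo's algorithm).

    chain[i] holds the index of the next slot to visit. Because the
    permutation is a single cycle, a chase touches every slot before
    it repeats, so the whole array is the working set.
    """
    rng = random.Random(seed)
    chain = array("q", range(n_slots))  # 8-byte signed integers
    for i in range(n_slots - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what guarantees a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain, steps, start=0):
    """Follow the chain for `steps` dependent loads and return the final index."""
    idx = start
    for _ in range(steps):
        idx = chain[idx]  # the next address depends on this load
    return idx


def seconds_per_step(chain, steps):
    """Return the average wall-clock time per dependent load."""
    t0 = time.perf_counter()
    chase(chain, steps)
    return (time.perf_counter() - t0) / steps
```

The comparison I would try is `seconds_per_step` for a chain that fits comfortably in the L3 against one several times larger, and then the larger case again with a second copy running on the sibling logical processor of the same core.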
Thanks in advance to whoever responds to this.
At one time it was said that applications which suffered from high ITLB (instruction translation lookaside buffer) miss rates were likely to benefit from hyperthreading. I think HT remains popular for database usage, which might be characterized as short transactions. It's not obvious that the active code footprint would be too large for L3, particularly as an increasing number of threads ought to take advantage of L3 sharing.
Sandy Bridge was a post-Nehalem model which had fairly large latencies for divide and square root (if not given the no-prec-div treatment) and could benefit from hyperthreading. I don't know of an association with short transactions in that case.