Intel® ISA Extensions

What is the latency and throughput of the vbroadcastsd instruction?

jeremyweek
Beginner
Hi,

I was wondering what the latency and throughput of the vbroadcastsd instruction are on
Sandy Bridge.

I did not find that information in the Optimization Reference Manual.

Thanks!

-Jeremy
1 Reply
capens__nicolas
New Contributor I
According to Agner Fog's instruction tables, it has a throughput of one per cycle, like every other shuffle-type instruction. The latency is mainly determined by the memory access: a load from L1 cache takes 4 cycles, and the broadcast itself should take 1 cycle. There might also be a penalty of 1 or 2 cycles for crossing domains (I don't remember whether that applies here). In any case, this is a fast instruction for what it's designed to do.
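To make the numbers above concrete, here is a minimal C sketch. It is only an illustration: the function names `broadcast4` and `estimated_latency` are made up for this post, and the cycle counts are just the figures quoted above, not measurements. When the file is built with AVX enabled (for example with `-mavx`), the `_mm256_broadcast_sd` intrinsic is normally compiled to `vbroadcastsd` with a memory operand; without AVX the sketch falls back to a plain loop so it still compiles.

```c
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper: copy the double at *src into all four lanes of out.
   With AVX enabled, the intrinsic below typically becomes a single
   vbroadcastsd ymm, m64; otherwise a scalar fallback does the same work. */
void broadcast4(const double *src, double out[4])
{
#if defined(__AVX__)
    __m256d v = _mm256_broadcast_sd(src);  /* load one double, replicate it */
    _mm256_storeu_pd(out, v);              /* unaligned store of all 4 lanes */
#else
    for (int i = 0; i < 4; ++i)
        out[i] = *src;
#endif
}

/* Hypothetical helper: rough load-to-use estimate, assuming the parts
   simply add up (L1 load + broadcast + optional domain-crossing penalty). */
int estimated_latency(int l1_load_cycles, int broadcast_cycles,
                      int domain_penalty_cycles)
{
    return l1_load_cycles + broadcast_cycles + domain_penalty_cycles;
}
```

With the figures from the reply, `estimated_latency(4, 1, 0)` gives 5 cycles and `estimated_latency(4, 1, 2)` gives 7 cycles, so the expected range is roughly 5 to 7 cycles for data that hits in L1, depending on whether a domain-crossing penalty applies.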