Congratulations to Intel's CPU instruction set engineers for managing to make YET ANOTHER non-orthogonal instruction set extension -- why were PEXTRD/PINSRD (among many others) not promoted to 256 bits in AVX2?
Any ideas/tricks to work around this engineering "oversight"?
63 Replies
Prefetch distance is directly tied to the data the computation needs. The problem is finding how far ahead to prefetch. Prefetching too far ahead can saturate the bus and, as you pointed out, can cause problems when another thread is competing for the L1 data cache. For a particle diffusion program, a prefetch distance of at least one particle object could be sufficient (of course, the memory layout of those objects has to be taken into account).
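To make the "one particle object ahead" idea concrete, here is a rough sketch in C. The `Particle` layout, the `advance_particles` name, and the `PREFETCH_DISTANCE` constant are all made up for illustration, since nothing in this thread specifies them, and `__builtin_prefetch` is a GCC/Clang builtin (other compilers would need something like `_mm_prefetch`). On a plain linear walk like this the hardware prefetcher will probably do the job anyway, so treat it only as a picture of where the distance enters the loop, and measure before keeping it.

```c
#include <stddef.h>

/* Hypothetical particle layout; the thread does not specify one. */
typedef struct {
    float x, y, z;    /* position */
    float vx, vy, vz; /* velocity */
} Particle;

/* Assumed prefetch distance: one particle object ahead. */
enum { PREFETCH_DISTANCE = 1 };

/* Advance every particle by dt, requesting the particle that will be
   needed PREFETCH_DISTANCE iterations from now before working on the
   current one. */
void advance_particles(Particle *p, size_t n, float dt)
{
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 1: the line will be written.
               Third argument 3: high temporal locality. */
            __builtin_prefetch(&p[i + PREFETCH_DISTANCE], 1, 3);
        }
        p[i].x += p[i].vx * dt;
        p[i].y += p[i].vy * dt;
        p[i].z += p[i].vz * dt;
    }
}
```

Raising `PREFETCH_DISTANCE` pulls data in earlier but keeps more lines in flight at once, which is where the bus saturation and L1 contention mentioned above start to matter.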
