Sorry if this is a stupid question, but I can't find an exhaustive answer to it.
If I understand correctly, a Skylake CPU can perform two memory reads (on ports 2 and 3) and one memory write (on port 4) each cycle.
If I understand correctly, memory access is the performance wall for most applications.
Why don't Intel or other vendors build memory accelerators (for example, one fast, shared, non-coherent memory for many CPUs, or perhaps some other architecture)?
Why don't Intel or other vendors support more than two memory reads and one memory write per cycle? Why can't a CPU do hundreds of reads/writes per cycle?
Speeding up RAM alone would make many applications many times faster.
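To make the per-cycle numbers above concrete, here is a rough back-of-the-envelope sketch of what two reads and one write per cycle would mean for one core. The 32-byte access width (256-bit vector loads/stores) and the 3.0 GHz clock are my assumptions, not measured values, and the function names are only illustrative. The result is an upper bound for accesses that hit in the L1 cache, not for RAM.

```python
# Back-of-the-envelope peak load/store throughput for one core.
# Assumptions (not measurements): every access is 32 bytes wide
# (a 256-bit vector load/store) and the clock is a hypothetical 3.0 GHz.

def peak_l1_bytes_per_cycle(loads_per_cycle=2, stores_per_cycle=1,
                            bytes_per_access=32):
    """Bytes moved per cycle if every load and store port is busy."""
    return (loads_per_cycle + stores_per_cycle) * bytes_per_access


def peak_l1_gb_per_s(clock_ghz, **port_config):
    """Bytes per cycle times cycles per nanosecond gives GB/s (10^9 bytes/s)."""
    return peak_l1_bytes_per_cycle(**port_config) * clock_ghz


if __name__ == "__main__":
    print(peak_l1_bytes_per_cycle(), "bytes/cycle")
    print(peak_l1_gb_per_s(3.0), "GB/s at the assumed 3.0 GHz")
```

If these assumptions are roughly right, the load/store ports themselves already allow a lot of traffic per cycle when the data is in L1, so my question is really about what happens when the data has to come from RAM.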