I'm about to select a small cluster of Xeons for a calculation-intensive application.
I estimated that I need around 100 cores to get the calculation done in time. For this number of cores my hardware supplier suggests buying 4 dual-socket systems, filled with 12- or 14-core Xeons.
I have a bit of space left in my rack, so it seems to me that I could also purchase 6 dual-socket systems, each filled with 8- or 10-core Xeons.
Looking at the specs, I can get 8-, 10-, 12-, or 14-core Xeons, each with a max memory bandwidth of 76.8 GB/s (9.6 GT/s).
Suppose all cores are working in one OpenMP team (a group of threads) and need approximately the same bandwidth and amount of data.
My questions are:
Is the bandwidth per core approximately (max bandwidth) / (number of cores)? For example, 76.8 GB/s over 12 cores would give 6.4 GB/s per core, while an 8-core part would give 9.6 GB/s per core.
If this is true, then a bandwidth-sensitive operation would benefit from having fewer cores per CPU, right?
I can imagine that the max bandwidth is a figure belonging to the memory controller, and that you need a minimum number of cores to saturate it. In that case: where is the sweet spot?
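To find that sweet spot empirically, I suppose I could run a STREAM-style triad kernel and sweep the thread count. Here is a minimal, untuned sketch; the array size, repetition count, and gcc build line are just placeholder assumptions, not tuned values:

    /* STREAM-triad-style bandwidth probe (sketch). Assumptions: gcc with
     * OpenMP, arrays much larger than the last-level cache, and STREAM's
     * convention of counting 3 x 8 bytes of traffic per element.
     * Compile: gcc -O2 -fopenmp triad.c -o triad
     * Run:     OMP_NUM_THREADS=1 ./triad, then 2, 4, 8, ... */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N (64 * 1024 * 1024)   /* 512 MiB per double array */
    #define REPS 10

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;

        /* First-touch init in parallel so pages land on the NUMA node
         * of the thread that will later access them. */
        #pragma omp parallel for
        for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.0; }

        double t0 = omp_get_wtime();
        for (int r = 0; r < REPS; r++) {
            #pragma omp parallel for
            for (long i = 0; i < N; i++)
                c[i] = a[i] + 3.0 * b[i];   /* 2 reads + 1 write per element */
        }
        double t1 = omp_get_wtime();

        int threads = omp_get_max_threads();
        double gb = 3.0 * N * sizeof(double) * REPS / 1e9;
        printf("%d threads: %.1f GB/s aggregate, %.2f GB/s per thread (c[0]=%.0f)\n",
               threads, gb / (t1 - t0), gb / (t1 - t0) / threads, c[0]);

        free(a); free(b); free(c);
        return 0;
    }

Plotting aggregate GB/s against thread count should give a curve that climbs and then flattens; the knee of that curve would be the saturation point I'm asking about.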
As far as I know (which could be wrong), you will not be able to jump between individual servers using only OpenMP. Furthermore, even within one server, things need to be configured correctly (e.g., PCI risers, etc.) to utilize all of the processors as a single native node. To run in parallel on your proposed setup, you might need something like MPI.
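To sketch what that looks like: each dual-socket box would run one or two MPI ranks, with an OpenMP team inside each rank. A minimal hybrid hello-world along those lines (the mapping flag is Open MPI syntax and only an example; other launchers differ):

    /* Hybrid MPI + OpenMP hello-world (sketch): one MPI rank per socket,
     * OpenMP threads inside each rank.
     * Compile: mpicc -O2 -fopenmp hybrid.c -o hybrid
     * Run:     mpirun -np 8 --map-by socket ./hybrid */
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int provided, rank, nranks, namelen;
        char host[MPI_MAX_PROCESSOR_NAME];

        /* Ask for thread support, since OpenMP threads live inside each rank. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);
        MPI_Get_processor_name(host, &namelen);

        #pragma omp parallel
        {
            /* Serialize output just so lines don't interleave mid-print. */
            #pragma omp critical
            printf("host %s: rank %d of %d, thread %d of %d\n",
                   host, rank, nranks,
                   omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

If this prints hostnames from all of the boxes, the MPI layer is spanning the cluster; the OpenMP parallelism then stays within each shared-memory node, which matches the constraint above.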