Forgive me if the question sounds naive.
I am interested in configuring a server system. I understand that the x200 Xeon Phi can be used as a standalone CPU and provides up to 72 cores and 288 threads. I am wondering if it is realistic to build a single x200 CPU server and use it to run many processes in parallel, as if running them on a small cluster.
Put more intuitively: if I run 200 jobs (even load) on this x200 processor (base clock 1.3 GHz), assuming I have sufficient memory, will it take a comparable time to complete as running half of the jobs (N=100) on a dual Xeon E5-2658 v3 server (48 threads total, 2.2 GHz base clock)? Is this comparison valid?
By the way, are there dual-LGA3647 motherboards that take two x200 processors?
Thanks
The Xeon Phi excels (only) when running well-parallelized and vectorized code. Turning your KNL box into a 200-jobslot cluster is definitely possible, but it is not likely to give you very good performance.
Keep in mind that the 72 Xeon Phi cores share L2 cache in pairs (there is no L3 cache) and are actually Atom-based cores, not Xeon cores: some instructions will take a lot longer to process on the Xeon Phi than on a "regular" Xeon.
As for a dual socket Xeon Phi: Intel has repeatedly told me that there never will be a dual-socket Xeon Phi. Perhaps they've changed their mind on this, but I would not bet on any development in this direction in the near future.
There are too many factors (differences) involved other than clock rates to extrapolate comparative run times.
From my experience comparing a KNL 7210 system (1.3 GHz) to an older Xeon E5-2620 v2 (2.1 GHz) on compute-intensive SCALAR code (e.g. the finalization phase of VTune), my gut feeling is that the E5-2620 v2 is close to 10x faster than the Xeon Phi 7210 in SCALAR performance.
*** DO NOT TAKE THE ABOVE OUT OF CONTEXT ***
Code that can take advantage of the 512-bit vectors, and of the streaming capability of the memory subsystem (or the large number of L2 caches), makes the KNL far superior.
If you have both systems available, I suggest you run your actual program on both. The results (of your actual program) may be of interest to others on this forum.
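One way to run that comparison is a small timing harness executed unchanged on each machine. The following is only a sketch: `run_batch` and the placeholder command are hypothetical names and values introduced for illustration, and the real job's command line has to be substituted before the numbers mean anything.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_batch(cmd, n_jobs, max_parallel):
    """Run n_jobs copies of cmd, at most max_parallel at a time.

    Returns (elapsed_seconds, jobs_per_second).
    """
    def one_job(_):
        # check=True raises if a job fails, so a broken run is not
        # silently timed as a success.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # list() waits for every job and re-raises any job error.
        list(pool.map(one_job, range(n_jobs)))
    elapsed = time.perf_counter() - start
    return elapsed, n_jobs / elapsed


if __name__ == "__main__":
    # Placeholder workload (an assumption): replace with the real job.
    placeholder = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    elapsed, rate = run_batch(placeholder, n_jobs=8, max_parallel=4)
    print(f"{elapsed:.2f} s total, {rate:.2f} jobs/s")
```

Running it with `n_jobs=200` on the KNL and `n_jobs=100` on the dual Xeon, with `max_parallel` set to the number of job slots planned for each box, would give the wall-clock comparison asked about in the original question.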
As a side note on actual programs versus "standard benchmarks": most standard benchmarks, as well as most application experience, show that using 2 or 3 threads per core is better than using 4 threads per core (on KNL). The actual difference for your process may contradict this. I have been tuning a process that uses both MPI and OpenMP, which is compute intensive but does not use the 512-bit vectors (as) effectively (e.g. it performs a large number of small (6x6) matrix operations). This process is one of the few that runs best using 4 threads per core.
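For a pure OpenMP program, the threads-per-core choice can be swept with the standard OpenMP environment variables. This is a minimal sketch, not the setup used for the MPI+OpenMP process described above: `omp_settings` is a hypothetical helper, and the core count of 64 is an assumed value that should be replaced with the count of the actual part.

```python
def omp_settings(cores, threads_per_core):
    """Build OpenMP environment variables for a given threads-per-core count.

    With OMP_PLACES=cores and OMP_PROC_BIND=close, a thread count that is a
    multiple of the core count places that many threads on each core.
    """
    if not 1 <= threads_per_core <= 4:
        raise ValueError("KNL supports 1 to 4 hardware threads per core")
    return {
        "OMP_NUM_THREADS": str(cores * threads_per_core),
        "OMP_PLACES": "cores",
        "OMP_PROC_BIND": "close",
    }


if __name__ == "__main__":
    cores = 64  # assumed core count, for illustration only
    for tpc in (1, 2, 3, 4):
        print(tpc, omp_settings(cores, tpc))
```

Each dictionary can be merged into `os.environ` and passed as `env=` to `subprocess.run` so that the same binary is timed once per setting.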
>>are there dual-LGA3647 mobo to use two x200 processors?
For that, the CPU (KNL) would also have to support MP communication (e.g. QPI or its successor). The newer CPUs, e.g. the Xeon Platinum 8180, have 28 cores/56 threads and AVX-512, operate at 2.5-3.8 GHz, and have 6 memory channels (DDR4-2666). These can be configured in 1-, 2-, 4-, or 8-socket SMP motherboard configurations (with 6, 12, 24, or 48 memory channels).
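The channel counts above follow from 6 channels per socket, and the same arithmetic gives a theoretical peak memory bandwidth. The sketch below assumes the nominal DDR4-2666 rate of 2666 MT/s and a 64-bit (8-byte) channel; the function names are hypothetical, and sustained bandwidth in practice is lower than this peak.

```python
def total_channels(sockets, channels_per_socket=6):
    """Memory channels in an SMP system with the given socket count."""
    return sockets * channels_per_socket


def peak_bandwidth_gb_s(sockets, channels_per_socket=6,
                        mega_transfers_per_s=2666, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (10**9 bytes per second)."""
    channels = total_channels(sockets, channels_per_socket)
    # MT/s * bytes per transfer = MB/s per channel; divide by 1000 for GB/s.
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    for sockets in (1, 2, 4, 8):
        print(sockets, total_channels(sockets),
              round(peak_bandwidth_gb_s(sockets), 1))
```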
Cost to purchase and cost to operate may need to be factored into the calculus as to which system(s) is(are) better.
Jim Dempsey
@JJK and @jimdempseyatthecove, thank you both for the comments; these are exactly what I wanted to find out. It sounds like KNL is really designed for special programs and does not have the scalability for general programs.
Regarding dual-socket boards, I found a couple on Amazon, like this one:
https://www.amazon.com/Supermicro-X11DAI-N-Dual-Sockets-Motherboard/dp/B074HTGHDG
Not sure if those are compatible with KNL.
>>regarding dual socket boards, I found a couple from amazon, like this one
>>https://www.amazon.com/Supermicro-X11DAI-N-Dual-Sockets-Motherboard/dp/B...
>>not sure if those are compatible with KNL.
They are not. It is impossible to build a dual-socket KNL system, since KNL has no UPI links (the interconnect used between sockets in a multi-socket system).
>>KNL does not have the scalability for general programs
Wrong.
KNL does have good scalability...
... from the perspective of (relative to) the base-level performance of a single thread on the same KNL.
For high-thread-count, mostly scalar code, it is hard to beat the KNL's aggregate performance per socket per $. IOW, for highly parallelizable and/or highly distributable code, KNL may offer the best performance per $ (at least for now, at the current pricing of the Intel Xeon Scalable processor series).
Jim Dempsey
Hello Jim
For 3D rendering, could the KNL be a better choice than the new Xeon Scalable?
Can you give a recommendation for a motherboard, maybe one with dual-socket KNL?
Thanks
Clement Thomas