Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

I am interested in the Thunder cluster's HPL efficiency

Nan_Q_Intel
Employee
1,641 Views
From what I know, a single IPF2 CPU can reach about 92% efficiency, so I am amazed that Thunder achieved 86% with 4096 CPUs!
Does QsNet really deliver such good performance? I have used QsNet1, and it did not perform nearly that well.
Another factor is that Tiger4 is not as good as Tiger2 in terms of shared-memory performance.
Could an expert answer this question?
Thank you very much.
-Stan
0 Kudos
8 Replies
Ken_C_Intel
Employee
1,641 Views
I do not know what HPL is, or how to define the efficiency of a computer.
What benchmark is used? LINPACK?
Thanks.
0 Kudos
Henry_G_Intel
Employee
1,641 Views

Hello kchen3,

HPL stands for High-Performance Linpack. It's a widely used HPC benchmark available for free from Netlib. The Top500 list of the most powerful HPC systems uses HPL as its metric.

Parallel efficiency = T1 / (TN * NP), where T1 is the runtime on one processor, TN is the runtime on N processors, and NP is the number of processors. For example, an application that takes 100 seconds to execute in serial but only 10 seconds to execute in parallel on 10 processors achieves 100% parallel efficiency. An application that takes 100 seconds to execute in serial but 50 seconds to execute in parallel on 10 processors achieves only 20% parallel efficiency.
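As a quick sketch, a few lines of C make the formula concrete (the timings are the hypothetical values from the example above, not measurements from any real run):

#include <stdio.h>

/* Parallel efficiency = T1 / (TN * NP):
   T1 = serial runtime, TN = runtime on NP processors. */
static double parallel_efficiency(double t1, double tn, int np)
{
    return t1 / (tn * np);
}

int main(void)
{
    printf("%.0f%%\n", 100.0 * parallel_efficiency(100.0, 10.0, 10)); /* 100% */
    printf("%.0f%%\n", 100.0 * parallel_efficiency(100.0, 50.0, 10)); /*  20% */
    return 0;
}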

Best regards,

Henry

0 Kudos
Nan_Q_Intel
Employee
1,641 Views

Yes, thanks Henry for your wonderful explanation.

Do you know how it reaches such high efficiency?

thanks

Stan

0 Kudos
Henry_G_Intel
Employee
1,641 Views
Hi Stan,
Does QsNet give such good performance? According to the Top500 HPL data, LLNL Thunder (currently #2) achieves 20 TFLOPS and 87% parallel efficiency. So, the Elan4 interconnect must be that good, at least for HPL.
The Chinese Academy of Sciences Itanium cluster (currently #26), which is similar to Thunder, achieves 79% efficiency. I'm not absolutely sure, but I think the CAS cluster uses Elan3. Can you verify? Also, how much memory per node does the CAS cluster have?
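For what it's worth, the Top500 efficiency figure is Rmax/Rpeak, and a rough back-of-the-envelope check reproduces it. This sketch assumes Thunder has 4096 Itanium 2 CPUs at 1.4 GHz (4 floating-point operations per cycle) and uses the roughly 20 TFLOPS HPL result quoted above:

#include <stdio.h>

int main(void)
{
    /* Assumed machine parameters (approximate, from the public Top500 listing). */
    double ghz = 1.4, flops_per_cycle = 4.0;
    double rpeak_tflops = 4096 * ghz * flops_per_cycle / 1000.0; /* ~22.9 TFLOPS theoretical peak */
    double rmax_tflops  = 20.0;                                  /* ~20 TFLOPS measured HPL result */
    printf("Rpeak ~ %.1f TFLOPS, efficiency ~ %.0f%%\n",
           rpeak_tflops, 100.0 * rmax_tflops / rpeak_tflops);    /* prints roughly 87% */
    return 0;
}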
Best regards,
Henry
0 Kudos
Nan_Q_Intel
Employee
1,641 Views
Hi Henry,
Yes, the CAS cluster uses Elan3 and QsNet1, and Thunder uses Elan4 and QsNet2; that may be the reason for the different efficiencies. Each CAS node has 8 GB of memory.
Thanks
Stan
0 Kudos
Henry_G_Intel
Employee
1,641 Views
Hi Stan,
Elan4 has better bandwidth and latency than Elan3, which would explain why LLNL Thunder is getting better parallel efficiency than the CAS cluster.
Henry
0 Kudos
rwilkins
Beginner
1,641 Views
Linear speed-up and scale-up alone are inadequate.

Economies of scale?

There must be more advantages to "bigness" than linear performance improvement.

What other features of scale does your design exploit? Higher capacity utilization? Decreasing cost per unit through automation of maintenance functions?
0 Kudos
ClayB
New Contributor I
1,641 Views


rwilkins@usinter.net wrote:
Linear speed-up and scale-up alone is inadequate.


rwilkins -

In some cases, these two items are enough. Sure, the pencil pushers and accountants will be looking at price-performance. However, the scientists who use the equipment will be looking only at performance. In my experience, if you can provide a machine that lets their application run twice as fast (or handle a data set twice the size) when you double the number of processors, you will be their hero, assuming they have applications or data sets that can fill the larger machine, which really isn't as uncommon as it may sound.

So, CAS may envy Thunder's efficiency and look to upgrade its hardware in order to achieve a similar figure. That is, if its users actually need the better efficiency. If the system is already serving the needs of the clients at CAS, there would be no reason to invest resources in improving an efficiency that is already fairly high.

--clay

0 Kudos