This is a simple question, but I haven't found any information about it that satisfies me. Jack Dongarra writes that peak performance is the maximum performance, defined by the processor manufacturer, which cannot be exceeded.
So how is peak performance calculated, e.g. P4 --> 2 x clock? Is it just the number of execution units (FMUL and FADD) that can process data simultaneously in one cycle, or should other factors also be taken into account? The MKL or GOTO implementations can achieve ~85% efficiency, while Fujitsu BLAS reaches as high as 93%. I was wondering if >100% is possible. Just kidding ;)
Many thanks in advance! Best wishes,
Maciej Nawrocki
Peak performance depends on the processor, the function, and where the data resides. For double-precision matrix multiplication (DGEMM), the peak performance is the maximum number of floating-point (FP) operations the processor can complete per second. For Pentium 4 processors using SSE2 instructions, the peak is 2 times the clock rate, since a packed double-precision multiply or add (two elements) can be issued each clock. On the Itanium processor, the peak rate is 4 times the clock rate, since up to two FMA operations can be issued on each clock and each FMA counts as a multiply plus an add.
Of course, on the Pentium 4 processor the single-precision rate is twice the double-precision rate.
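To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above: peak rate is the clock rate times the FP operations completed per clock. The function names and the clock rates (3.0 GHz and 1.5 GHz) are assumptions chosen for illustration, not figures for any specific part.

```python
def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: clock rate (GHz) times FP operations per clock."""
    return clock_ghz * flops_per_cycle


def efficiency(measured_gflops, clock_ghz, flops_per_cycle):
    """Fraction of the theoretical peak that a measured rate achieves."""
    return measured_gflops / peak_gflops(clock_ghz, flops_per_cycle)


# Hypothetical clock rates, for illustration only.
p4_double = peak_gflops(3.0, 2)       # Pentium 4, double precision: 2 flops per clock
p4_single = peak_gflops(3.0, 4)       # Pentium 4, single precision: twice the double rate
itanium_double = peak_gflops(1.5, 4)  # Itanium: two FMAs per clock, 2 flops each

print(p4_double, p4_single, itanium_double)
```

With these assumed numbers, a DGEMM measured at 5.1 GFLOPS on the hypothetical 3.0 GHz Pentium 4 would correspond to `efficiency(5.1, 3.0, 2)`, i.e. about 85% of peak.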
Operations such as FFTs are more problematic because the multiplies and adds are not balanced. The operation count is generally taken as 5*N*log2(N), but that is just a normalized number that does not necessarily represent the actual number of FP operations.
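As a rough illustration of how that normalized count is typically used, the sketch below converts a measured FFT time into a nominal GFLOPS figure. The function names and the timing value are hypothetical, and the result is a normalized rate rather than a count of the operations actually executed.

```python
import math


def fft_nominal_flops(n):
    """Normalized operation count for a length-n FFT: 5 * n * log2(n)."""
    return 5.0 * n * math.log2(n)


def fft_nominal_gflops(n, seconds):
    """Nominal GFLOPS for a length-n FFT that took `seconds` to run."""
    return fft_nominal_flops(n) / seconds / 1e9


# Hypothetical measurement: a 1024-point FFT taking 10 microseconds.
print(fft_nominal_flops(1024))
print(fft_nominal_gflops(1024, 10e-6))
```

Because the count is only a convention, comparing this nominal rate against the processor's peak gives an approximate efficiency at best.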
Vector operations such as dot product may have a peak performance similar to that for dgemm, but unless the data is in cache, the limits will be defined by the memory bandwidth rather than by the FP capabilities of the processor.
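The memory-bandwidth limit can be estimated in the same spirit. The sketch below assumes a dot product streams both vectors from memory once, so each pair of elements costs two loads and yields two FP operations (one multiply, one add). The function name and the peak and bandwidth figures are assumptions for illustration.

```python
def dot_product_bound_gflops(peak, bandwidth_gb_s, bytes_per_element=8):
    """Upper bound on dot-product GFLOPS when data is streamed from memory.

    Each element pair needs 2 loads (2 * bytes_per_element bytes) and
    produces 2 flops, so the bandwidth-limited rate is
    bandwidth * 2 / (2 * bytes_per_element).
    """
    flops_per_byte = 2.0 / (2 * bytes_per_element)
    return min(peak, bandwidth_gb_s * flops_per_byte)


# Hypothetical machine: 6.0 GFLOPS peak, 6.4 GB/s memory bandwidth.
print(dot_product_bound_gflops(6.0, 6.4))
```

Under these assumed figures the out-of-cache dot product is limited to a small fraction of the FP peak, which is the effect described above; when the vectors fit in cache, the effective bandwidth is much higher and the FP peak becomes the relevant limit.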