My initial experiments offloading FFTs to the MIC (using C on Linux) give me approximately 9.3 GFLOPS, judging by the [MIC Time] numbers reported when I set the environment variable OFFLOAD_REPORT to 1 (or 2). This is about 0.46% of the advertised peak performance of 2 TFLOPS. In fact, it is much less than that if I take into account the time for the data movement inside the offload section ([CPU Time] in the report).
Am I missing something?
I am curious to know if my numbers are way off or consistent with other benchmarks (I could not find any).
I would appreciate it if someone could point me to related information, or tell me whether they have had a different (or similar) experience.
The bottom line is that I hope there is something I can do to drastically improve the performance, but I have run out of ideas. Any help will be appreciated.
Thanks!
Fernando
Hi Fernando,
We are also working on that and would like to exchange experience. Could you post your benchmark code? There are a number of different FFT problems (1D is very different from multi-dimensional, and small transforms are a very different animal from large transforms), a huge range of options for how to run them (number of threads, pinning, batch transforms), and many finer details (alignment, first-touch allocation, user threads, size of offloaded batches, etc.). It is impossible to pinpoint the problem without seeing all the details.
Also, you can't expect 2 TFLOPS for FFTs. 2 TFLOPS is the theoretical peak performance of the fused multiply-add (FMA) operation. FFTs are not bottlenecked by FMA throughput; they are a memory-bandwidth-bound problem, so you should expect a lot less than the theoretical peak.
A
