Hi all,
I have a quick question about (what I think is) the ALU implementation of the cores.
Does it make any difference to performance if, in a vector multiplication, one of the vectors consists entirely of zeros?
The question comes from my attempt to implement a work-stealing algorithm for dense matrix multiplication.
I ran a few tests and there seems to be no difference at all, but maybe there is something I need to specify?
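A test of this kind could be sketched roughly as below. This is only a minimal illustration, not the actual benchmark code: the helper names `dot`, `fill`, and `time_dot` are made up for the example, and it uses plain C with `clock()` rather than any coprocessor-specific intrinsics or offload pragmas.

```c
#include <stddef.h>
#include <time.h>

/* Multiply two vectors element by element and accumulate the products. */
double dot(const double *a, const double *b, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Set every element of v[0..n) to the same value (0.0 for the zero vector). */
void fill(double *v, size_t n, double value) {
    for (size_t i = 0; i < n; ++i)
        v[i] = value;
}

/* Run dot() `reps` times and return the elapsed CPU time in seconds.
 * The accumulated result is written to *out so the compiler cannot
 * discard the multiplications as dead code. */
double time_dot(const double *a, const double *b, size_t n,
                int reps, double *out) {
    volatile double sink = 0.0;
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        sink += dot(a, b, n);
    clock_t end = clock();
    *out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The idea is to call `time_dot` once with a second operand filled with zeros and once with a nonzero operand of the same length, then compare the two timings. The vectors should be large enough, and `reps` high enough, that the run lasts well beyond the resolution of `clock()`.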
Any suggestions are welcome.
Regards,
Luca
Luca,
I talked to some engineers who are closer to the hardware than I am. They do not believe there are any hardware optimizations that address such special cases.
Just from a silicon footprint standpoint, this makes sense to me. (Special cases generally need more silicon to maintain performance.) KNL may be different (though I don't know).
Regards
---
Taylor
