I see a significant slowdown when I run multiple instances of a simple zgemv program on the same machine.
If a single run of a.exe takes time x, then when I run 4 copies of a.exe at the same time on this machine, each one takes more than 2x. This is on a Linux machine with 12 cores and a large amount of memory. The matrix size is about 1400x1400.
What can I do to improve performance when running multiple instances like this?
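
For scale, here is a rough estimate of how much data each instance reads per zgemv call. It is only a back-of-the-envelope sketch: it assumes a dense square matrix of exactly 1400x1400 and 16 bytes per entry (double-precision complex, which is what the "z" in zgemv denotes). The function name `matrix_bytes` is made up for this illustration.

```python
# Rough footprint of the dense matrix read by one zgemv call.
# Assumption: double-precision complex entries (16 bytes each),
# and an exactly 1400 x 1400 matrix (the post only says "about").
BYTES_PER_COMPLEX_DOUBLE = 16


def matrix_bytes(n, bytes_per_elem=BYTES_PER_COMPLEX_DOUBLE):
    """Return the number of bytes occupied by a dense n x n matrix."""
    return n * n * bytes_per_elem


if __name__ == "__main__":
    n = 1400  # approximate matrix dimension
    one = matrix_bytes(n)
    # Report the footprint for a single instance and for four concurrent ones.
    print(f"one instance:   {one / 2**20:.1f} MiB")
    print(f"four instances: {4 * one / 2**20:.1f} MiB")
```

That comes to roughly 30 MiB of matrix data per instance, which matters if the instances end up sharing cache or memory bandwidth.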