Intel® Distribution for Python*

numpy.linalg.eigh fails for large matrices

MJR
Beginner

I find that numpy.linalg.eigh fails for large matrices: it returns immediately, with nonsense results and no error. With Ubuntu's python it works fine, if slowly. As an example:

import numpy as np
import numpy.linalg as npl
n=2**16
r=np.random.random((n,n))
w,v=npl.eigh(r)

fails, whereas n=2**15 works. Note that you will need quite a lot of memory to run this: 256GB might suffice, but 384GB would be better. The more general numpy.linalg.eig works. Try n=2**13 and note that eigh takes several seconds to run. For n=2**16 it should take around 500 times as long (i.e. hours), not return instantly! w ends up containing nonsense tiny values, mostly zero, and they are not even ordered.
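The symptoms described above (unordered, mostly-zero eigenvalues) can be detected programmatically. The following is a minimal sketch, not part of the original report: the helper name eigh_looks_sane and its tolerance are hypothetical, and it is shown on a small matrix so that it runs quickly. It relies on the documented behaviour that numpy.linalg.eigh returns eigenvalues in ascending order and reads only the lower triangle by default.

```python
import numpy as np

def eigh_looks_sane(a, w, v, tol=1e-8):
    """Return True if (w, v) plausibly solve the symmetric eigenproblem for a.

    Hypothetical helper for illustration; the tolerance is an assumption.
    """
    # eigh reads only the lower triangle by default (UPLO='L'), so rebuild
    # the symmetric matrix that was actually diagonalised.
    sym = np.tril(a) + np.tril(a, -1).T
    # Eigenvalues must be finite and in ascending order.
    if not np.all(np.isfinite(w)) or np.any(np.diff(w) < 0):
        return False
    # Eigenvectors must be orthonormal (this rejects an all-zero v).
    if not np.allclose(v.T @ v, np.eye(len(w)), atol=tol):
        return False
    # Residual of A v = v diag(w), relative to the size of A.
    residual = np.linalg.norm(sym @ v - v * w)
    return bool(residual <= tol * np.linalg.norm(sym) * len(w))

rng = np.random.default_rng(0)
n = 2**8  # small illustrative size, far below the failing n = 2**16
r = rng.random((n, n))
w, v = np.linalg.eigh(r)
print(eigh_looks_sane(r, w, v))
```

At n=2**16 the full check needs several extra n-by-n arrays and two large matrix products, so on a memory-limited machine it may be more practical to test only the ordering of w and the residual for a few columns of v.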

 

Problem observed with both Python 3.9.16 (main, Jun 15 2023, 02:33:25) and Python 3.6.8 |Intel Corporation| (default, Jan 15 2019, 04:34:13).

AthiraM_Intel
Moderator

Hi,


Thank you for posting in Intel Communities.


We ran the program with Intel Python 3.9.16. We could see that n=2**16 takes longer to run than n=2**15, which is expected.

We ran the program on Intel DevCloud for oneAPI (Ubuntu 20.04.6).


Could you upgrade to the latest version of Intel Python and rerun your tests?


If you see the same issue with the latest Intel Python version, please share the details below:


  1. OS and Hardware details
  2. Expected and actual results
  3. Python version after upgrading



Thanks



MJR
Beginner

Thanks for your suggestions. I have just tried

 

import numpy as np
import numpy.linalg as npl
import time

start_time=time.time()
n=2**16
r=np.random.random((n,n))
setup_time=time.time()
print("Setup time ",setup_time-start_time)
w,v=npl.eigh(r)
end_time=time.time()
print("Diagonalisation time ",end_time-setup_time)

 

With n=2**12 I get

 

Setup time 0.22078585624694824
Diagonalisation time 1.914717674255371

 

but with n=2**16 I get

 

Setup time 53.627641916275024
Diagonalisation time 0.0010466575622558594

 

which cannot be right. I would expect something around an hour or so for the diagonalisation. 'python --version' says

 

Python 3.9.16 :: Intel Corporation

 

which I believe is the latest. The OS is Ubuntu 22.04.1 LTS, and the CPUs are two Xeon Gold 6130s. The machine has 192GB, so memory is not an issue. The results on an old dual-CPU E5-2660 v3 machine are similar.

 

If I use Ubuntu's python (with OpenBLAS), then the setup time with n=2**16 is slightly longer at 65s, and the diagonalisation time is more credible (I got bored and stopped it after it had used one core-hour).
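The expected run time can be estimated from the n=2**12 measurement above. This is a rough sketch that assumes the cost of a dense symmetric eigendecomposition grows as n**3 and ignores threading efficiency and memory bandwidth effects; the function name extrapolate_cubic is hypothetical.

```python
def extrapolate_cubic(t_small, n_small, n_large):
    """Scale a measured time to a larger size, assuming cost grows as n**3."""
    return t_small * (n_large / n_small) ** 3

# Diagonalisation time measured above for n = 2**12.
t_4096 = 1.914717674255371

# Going from 2**12 to 2**16 multiplies n by 16, so the time by 16**3 = 4096.
estimate = extrapolate_cubic(t_4096, 2**12, 2**16)
print(f"Estimated time for n=2**16: {estimate:.0f} s ({estimate / 3600:.1f} h)")
```

Under that assumption the estimate is on the order of a couple of hours, so a diagonalisation time of about a millisecond is several orders of magnitude too short to be a real computation.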

 

 

AthiraM_Intel
Moderator

Hi,


Thank you for sharing the details. We are checking on this internally and will get back to you soon.



Thanks


PengHuang
Employee

Hi,

 

We suggest using scipy.linalg.eigh() to work around this issue. We tested it with a 2**16 matrix in our lab and the results are shown below, although it comes with a performance penalty.

The root cause has been found and a fix is under investigation.

Thanks.

 

Setup time 53.75876045227051
[ -147.69128483 -147.68194538 -147.64563126 ...  147.65570254
  147.73823905 32767.76070977]
[[ 0.00626784 -0.00164428 0.00365509 ... -0.0053086  0.00305
 -0.00392455]
 [-0.00037944 -0.00017431 -0.00150512 ... 0.00136626 -0.00402141
 -0.00390795]
 [-0.00215462 -0.00214957 0.00446365 ... 0.00082956 -0.00353334
 -0.00390896]
 ...
 [ 0.01007743 -0.00272315 0.0032907 ... 0.00482207 0.00136076
 -0.00390707]
 [ 0.00386407 0.00561695 -0.00165864 ... -0.00906698 0.00055503
 -0.00389805]
 [-0.00045475 0.00196264 0.00091735 ... 0.00080511 0.00602332
 -0.00389506]]
Diagonalisation time 9520.18740272522
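For reference, the workaround amounts to swapping the call, as in the sketch below. It is not the exact script used in the lab: it assumes SciPy is installed alongside NumPy and uses a small matrix so the two routines can be compared quickly. By default both functions read the lower triangle of the input, so they diagonalise the same symmetric matrix.

```python
import numpy as np
import scipy.linalg as spl

rng = np.random.default_rng(0)
n = 2**8  # small illustrative size; the failing case in this thread is n = 2**16
r = rng.random((n, n))

# Original call, which misbehaves at n = 2**16 in the affected build.
w_np, v_np = np.linalg.eigh(r)

# Workaround: same problem solved through SciPy. lower=True is the default,
# matching NumPy's default UPLO='L'.
w_sp, v_sp = spl.eigh(r)

# At a size where both work, the eigenvalues should agree to rounding error.
print(np.allclose(w_np, w_sp))
```

If only the eigenvalues are needed, scipy.linalg.eigh also accepts eigvals_only=True, which avoids computing and storing the eigenvector matrix.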
