topic DSS - Memory problem in Intel® oneAPI Math Kernel Library
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770487#M575
<P>Hello there,</P><P>I tried solving Ax = b, where A is a symmetric matrix of size N = 526,338 with 20,787,787 nonzeros. The matrix is stored in CSR format, so the memory required to represent the linear system is:</P><P>Memory(Values) = 158.6 MB<BR />Memory(Columns) = 79.3 MB<BR />Memory(rowindex) = 2.0 MB<BR />Memory(RHS) = 2.0 MB<BR />Total Memory = 241.91 MB</P><P>I used the following options when creating the DSS handle:</P><P>MKL_INT opt = MKL_DSS_DEFAULTS;<BR />MKL_INT sym = MKL_DSS_SYMMETRIC;<BR />MKL_INT type = MKL_DSS_INDEFINITE;<BR />MKL_INT opt_parallel = MKL_DSS_METIS_OPENMP_ORDER;</P><P>To my surprise, the memory consumption (measured with Windows Task Manager and the VS2010 debugger) for the following two steps was out of proportion:</P><P>1. The reorder step (dss_reorder) required 0.63 GB of memory.<BR />2. The factorization step (dss_factor_real) required 4.38 GB of memory.</P><P>On calling dss_delete, I recovered 6.39 GB, so it is quite clear that reordering and factorization take up all the memory.</P><P>I am not sure why this much memory is used, considering that the matrix being factored is only about 240 MB. As far as I understand, the LU factorization should not require more than 2*240 = 480 MB for this problem. Am I right?</P><P>Although I am able to solve the system on a machine with 8 GB RAM, the solver fails on a 32-bit machine with 4 GB RAM. How do we make the solver work on low-end machines?</P><P>Looking forward to your response,</P><P>Thanks & Regards<BR />Vaidy</P>Thu, 09 Aug 2012 06:08:47 GMTvaidyt2012-08-09T06:08:47ZDSS - Memory problem
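<P>The storage figures in the question can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming 8-byte values and 4-byte indices; the names CsrFootprint, csr_footprint and to_mib are hypothetical helpers, not part of MKL. With these assumptions the values, columns and rowindex entries match the quoted numbers when expressed in MiB, while an 8-byte right-hand side comes to about 4 MiB rather than the quoted 2.0 MB, so the post may have used a different element size for that entry.</P>

```cpp
#include <cassert>
#include <cstdint>

// Sizes in bytes of the arrays holding a CSR matrix plus one right-hand side.
// Hypothetical helper type for illustration; not an MKL structure.
struct CsrFootprint {
    std::uint64_t values;     // nnz numeric entries
    std::uint64_t columns;    // nnz column indices
    std::uint64_t row_index;  // n + 1 row pointers
    std::uint64_t rhs;        // n right-hand-side entries
    std::uint64_t total() const { return values + columns + row_index + rhs; }
};

// value_bytes and index_bytes are assumptions about the build
// (for example 8-byte double and 4-byte MKL_INT).
CsrFootprint csr_footprint(std::uint64_t n, std::uint64_t nnz,
                           std::uint64_t value_bytes, std::uint64_t index_bytes) {
    assert(value_bytes > 0 && index_bytes > 0);
    return CsrFootprint{nnz * value_bytes, nnz * index_bytes,
                        (n + 1) * index_bytes, n * value_bytes};
}

// Convert a byte count to MiB (1024 * 1024 bytes).
double to_mib(std::uint64_t bytes) { return bytes / (1024.0 * 1024.0); }
```

<P>Calling csr_footprint(526338, 20787787, 8, 4) and passing each field through to_mib gives the per-array sizes for the system described above.</P>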
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770488#M576
<P>Hi Vaidy, </P><P>In many cases the LU factors of a sparse matrix are much denser than the matrix itself (fill-in), so the memory requirement increases substantially after factorization. For a problem of this size, N = 500K, fully dense factors would need up to 500K * 500K * sizeof(double or float, depending on the precision used) / 2 (the matrix is symmetric), which is about 1 TB in double precision. Compared with that worst case, 6 GB looks reasonable. </P><P>If there is enough memory for the problem, you can use the in-core functions. If there is not enough memory, you can use the out-of-core solvers. </P><P><BR />Thanks,<BR />Chao </P>Fri, 10 Aug 2012 06:47:54 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770488#M576Chao_Y_Intel2012-08-10T06:47:54ZDSS - Memory problem
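<P>The dense upper bound quoted in the reply, and how far the observed factorization sits below it, can be checked with a short sketch. The function names are hypothetical, and the fill ratio assumes that 1 GB in the original measurement means 1024 MB.</P>

```cpp
#include <cassert>
#include <cstdint>

// Bytes for one triangle of a fully dense n x n symmetric matrix, using the
// same n * n * element_bytes / 2 approximation as the reply above.
std::uint64_t dense_symmetric_bytes(std::uint64_t n, std::uint64_t element_bytes) {
    assert(element_bytes > 0);
    return n * n * element_bytes / 2;
}

// Ratio of memory used by the factors to memory used by the input matrix.
// Both arguments must be in the same unit.
double fill_ratio(double factor_size, double matrix_size) {
    assert(matrix_size > 0.0);
    return factor_size / matrix_size;
}
```

<P>For N = 500,000 and 8-byte doubles, dense_symmetric_bytes returns 10^12 bytes (1 TB). With the figures from the question, fill_ratio(4.38 * 1024.0, 241.91) is roughly 18.5, so the factors are far larger than the 2x the question expected but still several orders of magnitude below the dense bound.</P>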
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770489#M577
Hello Chao,<DIV></DIV><DIV>Thanks a lot. I also thought this through and came to the same conclusion, but I have a few more questions:</DIV><DIV></DIV><DIV>1. How do we estimate the size of the L and U factors before solving, so as to decide whether to use the in-core or the variable/out-of-core solver? Do you know of any rule of thumb for estimating the size of L and U from the sparsity of the original matrix (A in Ax = b)?</DIV><DIV></DIV><DIV>2. How do we decide at runtime whether to use a direct or an iterative solver, based on the type of the system?</DIV><DIV></DIV><DIV>3. When using the in-core solver without enough memory, the code simply crashes in the LU factorization step. I have surrounded the code with try/catch, but the application still crashes without throwing any exception. How do we handle this?</DIV><DIV></DIV><DIV></DIV><DIV>Thanks & Regards</DIV><DIV>Vaidy</DIV><DIV></DIV>Fri, 10 Aug 2012 08:51:36 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770489#M577vaidyt2012-08-10T08:51:36ZDSS - Memory problem
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770490#M578
<P>Vaidy, </P><P>A few more comments from our experts on your questions: </P><P>1) DSS has a statistics function that can report memory usage information: </P><P> dss_statistics(handle, opt, statArr, retValues)</P><P>2) There is no simple answer to this question. DSS and PARDISO in MKL are direct solvers and cannot switch into an iterative mode, although they can do iterative refinement. If you are weighing other options: direct solvers use more memory and provide more reliable answers, while iterative solvers do not use as much memory but their answers are somewhat less reliable. Often users have to decide beforehand which one suits their problem. </P><P>3) PARDISO returns an error code if it cannot allocate memory. C++ code should read the error code and throw an appropriate exception. PARDISO itself does NOT throw exceptions, as it is C code rather than C++.</P><P>Thanks,<BR />Chao</P>Fri, 17 Aug 2012 05:18:27 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/DSS-Memory-problem/m-p/770490#M578Chao_Y_Intel2012-08-17T05:18:27Z
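<P>Point 3 can be illustrated with a small wrapper that turns a C-style status code into a C++ exception. This is only a sketch: SolverError, check_status and kSolverSuccess are hypothetical names, and treating 0 as success is an assumption that should be checked against the status constants of the solver actually being called.</P>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Assumed success value for the C solver's status code (hypothetical stand-in;
// compare against the library's own success constant in real code).
constexpr int kSolverSuccess = 0;

// Exception carrying the failing step and the raw status code.
class SolverError : public std::runtime_error {
public:
    SolverError(const std::string& step, int code)
        : std::runtime_error(step + " failed with status " + std::to_string(code)),
          code_(code) {}
    int code() const noexcept { return code_; }
private:
    int code_;
};

// Throws SolverError when a solver step reports anything other than success.
inline void check_status(int status, const std::string& step) {
    assert(!step.empty());
    if (status != kSolverSuccess) {
        throw SolverError(step, status);
    }
}
```

<P>In real code the first argument would be the value returned by a solver call such as dss_reorder or dss_factor_real, so that a failed allocation surfaces as an exception that an enclosing try/catch block can actually handle.</P>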