Re: Re:Using fgmres to solve large-scale linear equation systems

houjy · ‎09-17-2024

I have used ILP64 successfully.

But I have another question,When I use fgmres to solve large systems of linear equations, some values in the ipar array are equal to -5589219598436180572

This is my code example. I would like to try solving a system of linear equations with a scale of 69335390 or above. Thank you very much.

Ruqiu_C_Intel · ‎09-22-2024

Thank you for contacting us for help. It's helpful if you can provide us a simple reproducer, compiler, compile options, OS, and CPU information..

Regards,

Ruqiu

houjy · ‎09-23-2024

Thank you very much!

reproducer：

visual studio2019(release x64)，win 10

compiler，compile options：Optimization:/O2L;Preprocessor Definitions:NDEBUG
_CONSOLE;Warning Level:/W3

OS:windows

CPU information:CPU 为Intel(R) Xeon(R) Platinum 8269CY,1023G

I don't know if the information I provided is comprehensive.

Mahan · ‎09-24-2024

Hello houjy,

Could you please provide a sample code to reproduce this issue.

houjy · ‎09-25-2024

谢谢。我已经在之前的回复中上传了我的代码。我还需要提供任何进一步的细节吗

houjy · ‎09-25-2024

Thank you very much. I have already uploaded my code in my previous reply. Is there any further detail that I need to provide

Mahan · ‎09-25-2024

Hi houjy,

Got it, thanks.

Please allow us a few days to get back to you.

Mahan · ‎10-02-2024

Hello houjy,

The following dependency is missing while compiling the code

"fmm.hpp"

Please include that with the reproducer.

Also, does it require an Eigen library?

houjy · ‎10-07-2024

Thank you very much. The function of "fmm. hpp" is to generate a system of linear equations Ax=b, and it does not use the engin library.
Because the entire project file is a huge amount of code, I rewrote the code for randomly generating a system of linear equations.
After each iteration of the code, there is a problem with RCI_request=4 for the second time. Could you please help me find where the problem lies. This problem only occurs when calculating a scale of billions.

houjy · ‎10-07-2024

#include <cstdio>
#include <vector>
#include <mkl.h>
#include <mkl_spblas.h>
#include <random>
#include <iostream>

using std::vector;
using namespace std;

int main() {
// 设置问题规模N（上亿）
MKL_INT N = 100000000; // 1亿
MKL_INT M =350; // GMRES的restart频率

// 为随机数生成器设定种子
std::mt19937 gen(1500);
std::uniform_real_distribution<> dist(-1.0, 1.0);

// 创建稀疏矩阵A，使用CSR格式
// 假设A每行最多有10个非零元素（具体数值视问题而定）
vector<MKL_INT> ia(N + 1, 0); // 行指针初始化为0
vector<MKL_INT> ja; // 列索引
vector<double> a; // 非零值

// 随机生成稀疏矩阵
for (MKL_INT i = 0; i < N; ++i) {
ia[i] = static_cast<MKL_INT>(ja.size()) + 1; // 记录当前行的起始位置
for (MKL_INT j = 0; j < 10; ++j) {
MKL_INT col_idx = i + j; // 为了简单起见，假设对角线附近有非零值
if (col_idx >= N) break; // 防止越界
ja.push_back(col_idx + 1); // 列索引从1开始
a.push_back(dist(gen)); // 随机生成非零元素
}
}
ia[N] = static_cast<MKL_INT>(ja.size()) + 1; // 记录最后一行的结束位置

// 随机生成右端项b
vector<double> b(N);
for (MKL_INT i = 0; i < N; ++i) {
b[i] = dist(gen);
}

// 初始化GMRES求解器
MKL_INT ipar[128] = { 0 };
double dpar[128] = { 0 };
MKL_INT fan = N * (2 * M + 1) + (M * (M + 9)) / 2 + 1;
cout << "值" << N * (2 * M + 1) + (M * (M + 9)) / 2 + 1 << endl;
double* tmp = new double[70100062826];
vector<double> computed_solution(N, 0.0); // 初始解为0

MKL_INT RCI_request = 0;
MKL_INT itercount = 0;

DFGMRES_INIT(&N, computed_solution.data(), b.data(), &RCI_request, ipar, dpar, tmp);
if (RCI_request != 0) {
std::cerr << "Initialization failed" << std::endl;
return -1;
}

// 设置GMRES参数
ipar[14] = M; // GMRES重启频率
ipar[7] = 1; // 启动最大迭代次数
ipar[8] = 1; // 启动残差检测
ipar[10] = 1; // 启动预处理

// 主循环，执行FGMRES求解
while (true) {
DFGMRES(&N, computed_solution.data(), b.data(), &RCI_request, ipar, dpar, tmp);
itercount++; // 增加迭代次数
if (RCI_request == 0) break; // 解得出结果

if (RCI_request == 1) {
// 计算 A * tmp[ipar[21] - 1] 并存储在 tmp[ipar[22] - 1]
mkl_dcsrgemv("N", &N, a.data(), ia.data(), ja.data(), &tmp[ipar[21] - 1], &tmp[ipar[22] - 1]);
}
else if (RCI_request == 3) {
// 应用预处理器（可选，可以自定义预处理器）
for (MKL_INT i = 0; i < N; ++i) {
tmp[ipar[22] - 1 + i] = tmp[ipar[21] - 1 + i]; // 此处为单位预处理器
}
}
else if (RCI_request == 2) {
std::cout << "Iteration " << itercount << ": Residual = " << dpar[4] << std::endl;
if (dpar[4] < dpar[0]) break; // 满足残差要求
}
}

// 完成后输出结果
std::cout << "FGMRES completed in " << itercount << " iterations." << std::endl;

// 清理内存
delete[] tmp;

return 0;
}

Mahan · ‎10-06-2024

Hello houjy,

Please provide the necessary dependencies pointed out in the previous message.

Mahan · ‎10-08-2024

houjy

The reproducer has quite a bit of memory leakage; please look into that.

You can find out more about the memory leakage by using the following command to compile the code

icpx -O0 -g -traceback -fsanitize=address -qmkl test_fgmres.cpp

houjy · ‎10-08-2024

Because we are using a different compilation environment, may I ask if the memory overflow you mentioned is after calling the fgmers function:
DFGMRES(&N, computed_solution.data(), b.data(), &RCI_request, ipar, dpar, tmp);

Mahan · ‎10-09-2024

houjy

The memory overflow is happening during the call

DFGMRES_INIT(&N, computed_solution.data(), b.data(), &RCI_request, ipar, dpar, tmp);

at line number 55.

houjy · ‎10-09-2024

Thank you very much. Unfortunately, I just wanted to seek help to solve the problem with line 55. I found that there is no memory overflow when the generated matrix size is small, but memory overflow occurs when the code size is over one billion

Mahan · ‎10-13-2024

Hi houjy

Thanks for the confirmation.

Does your system have enough memory to handle large matrices for which you are facing this memory overflow issue?

houjy · ‎10-13-2024

Thank you for your help
My computer has enough memory. My computer has 1TB of memory, and this code requires approximately 13GB of memory

Mahan · ‎10-17-2024

Hi houjy

This issue has been communicated to the developers, allow us a few days to get back to you.

Mahan · ‎10-20-2024

Hi houjy

In the reproducer the matrix dimension and the size of the Klyrov subspace:

MKL_INT N = 100000000; // the matrix dimension

MKL_INT M =350; // the size of Krylov subspace

It is easy to check that M*N= 35 000 000 000 exceeds INT_MAX = 2 147 483 647 for 32-bit Integers.

Could you please reduce the N size so that it does not exceed the INT_MAX and check?

houjy · ‎10-21-2024

Thank you very much

But I am using 64 bit and ILP64, max MKL_INT=9223372036854775807

Mahan · ‎10-21-2024

Could you please try compiling with the following

-DMKL_ILP64 -qmkl-ilp64=parallel