I've noticed that when I use the xORMLQ (as part of an LQ solution of an underdetermined system), the xORMLQ function is accessing parts of the right-hand-side matrix that I do not think it should be. I've attached a small test case (double precision, real) exhibiting the behavior.
The test case attempts to solve the matrix A*x = b where A is underdetermined. In the example, A is 30 by 36. After the LQ factorization (which seems correct), you solve the problem as x = Q^T * inv(L) * b where b is of length 30 and x is of length 36. In the test case, x and b are the same array with total length 36. The step inv(L)*b is performed using a call to TRSM and the result is correct (of length 30). The application of Q^T is performed with a call to ORMLQ. In this case the input vector (called 'C' in the function) is of length 30 and the output is of length 36. However, the ORMLQ actually is dependent on the input values of the array C(31:36). If you do not pre-zero these values, the total result is incorrect. If you do pre-zero these values, the result is correct. For the non-blocked code path, the actual issue is in a GEMV called by DLARF (called by ORMLQ). This GEMV multiply is including these extra values of the C-array. In line 100-101 of the test code, you can toggle between zeroing the extra entries or stuffing them with garbage values.
Nothing in the documentation indicates that you need to zero these unused values on input to the ORMLQ function. To me it seems like undesired behavior to require the user to pre-zero these values. I feel that LAPACK should zero them if necessary in preparation for the GEMV calls (which might actually need to access them as C is computed). Or, the documentation needs to change to indicate that the user is required to pre-zero these extra array entries.
I ran the test case on Windows 7 with MKL 2019.0.2 and compile line of 'ifort /Qmkl lqbug.f90'. Note that this issue also exists in the stock version of LAPACK available from netlib.