Community
cancel
Showing results for
Did you mean:
New Contributor III
56 Views

## Least squares - estimation of fit error

I am using routines DGEQPF, DORMQR and DTRSV (according to example in dgeqpfx.f) and it works well.

I also need to calculate estimation of errors of fitted parameters but I did not find any example.

Can someone give me advice how to calculate it?

1 Solution
Black Belt
56 Views

The topic you ask about involves a combination of linear algebra, linear least squares and statistics, but it is not difficult to put the pieces together.

Let c0 be the "regression coefficients" or "fitted parameters" that you have found, and let R be the upper triangular matrix obtained from the Q-R factorization of the regressor matrix. The "confidence interval" for c, the true but unknown value of the regression coefficients, is given by c0δc < cc0 + δc, where

δc =  tν,α/2 [diag (RTR)-1Σ (y - yf)2/ν]1/2,

t is the Student-t variable, with tail area α/2 and degrees-of-freedom ν = n - m of the Student-t PDF,

n = number of observations,

m = number of regression coefficients,

y, yf are the observed and fitted values of the dependent variable.

5 Replies
Black Belt
57 Views

The topic you ask about involves a combination of linear algebra, linear least squares and statistics, but it is not difficult to put the pieces together.

Let c0 be the "regression coefficients" or "fitted parameters" that you have found, and let R be the upper triangular matrix obtained from the Q-R factorization of the regressor matrix. The "confidence interval" for c, the true but unknown value of the regression coefficients, is given by c0δc < cc0 + δc, where

δc =  tν,α/2 [diag (RTR)-1Σ (y - yf)2/ν]1/2,

t is the Student-t variable, with tail area α/2 and degrees-of-freedom ν = n - m of the Student-t PDF,

n = number of observations,

m = number of regression coefficients,

y, yf are the observed and fitted values of the dependent variable.

New Contributor III
56 Views

Thank You very much. I did not found this equation for QR factorisation.

I have one another question. Routine DGEQPF permutes columns of orignial matrix. Then I assume that the values \delta_c will be permuted so. Is it true?

Black Belt
56 Views

Whether pivoting is used or not is a detail at the programming level, and depends on which Lapack routine is used for performing the QR decomposition.

If you use DGEQRF, pivoting is not used and the question goes away.

Please note that DGEQPF is now deprecated; use DGEQP3 instead. If pivoting is used, the permutation vector does need to be used in reporting δc. However, only the diagonal of RTR needs to be permuted, so doing so is not difficult, although it can appear tricky to do so at first look.

New Contributor III
56 Views

Thanks for advice, I will try.

New Contributor III
56 Views

I have finished my routine calculating LSQ fit and also standard deviations of fitted parameters. It is not well memory optimised, but seems to work.

	subroutine DSOLVEQRErrorEstim_MKL(A,B,X,Xerr,m,n,tol,info)
implicit none
integer*4,intent(in) :: m,n		! m - number of equations, n - number of unknowns
real*8,intent(in) :: A(m,n)		! matrix A  (A*X=B)
real*8,intent(inout) :: B(m,1)	! righthand side
real*8,intent(out) :: X(n,1)		! fitted parameters
real*8, intent(out) :: Xerr(n,1)	! standard deviations of fitted parameters
real*8, intent(in) :: tol			!	Choose TOL to reflect the relative accuracy of the input data
integer*4, intent(inout) :: info  ! result of MKL routines 0 if all is ok

INTEGER          MMAX, NMAX, LDA, LDB, LDX, NRHMAX, LWORK
real*8,parameter ::  ZERO=0.0D0
INTEGER          I, IFAIL, J, K, NRHS
real*8	       TAU(N),WORK(64*n)
DOUBLE PRECISION RWORK(2*N)
INTEGER          JPVT(N),ipiv(n)
CHARACTER        CLABS(1), RLABS(1)
real*8			 R(n,n)  ! factorised matrix R
real*8			Acopy(m,n),Bcopy(m)  ! copy of input matrices
real*8		Bout(m)
real*8		err,sigma
lda=m
ldb=m
nrhs=1
lwork=64*n

! get a copy of input matrices to evaluate deviations
Acopy=A
Bcopy(1:m)=B(1:m,1)

!        Initialize JPVT to be zero so that all columns are free
JPVT=0

!        Compute the QR factorization of A
CALL DGEQP3(M,N,A,LDA,JPVT,TAU,WORK,lwork,INFO)
if (info/=0) then
return
endif

! copy R for evaluation of deviations
R=0.D0
do i=1,n	! row
R(1:i,i)=A(1:i,i)
enddo
call dtrmm('Left','Upper','Transpose','Not diagonal',n,n,1.D0,A,m,R,n)
! in R is not R^T*R

!        Determine which columns of R to use
DO 20 K = 1, N
IF (ABS(A(K,K)).LE.TOL*ABS(A(1,1))) GO TO 40
20    CONTINUE

!        Compute C = (Q**H)*B, storing the result in B
40 K = K - 1
CALL DORMQR('Left','Transpose',M,NRHS,N,A,LDA,TAU,B,LDB,WORK,LWORK,INFO)
if (info/=0) then
return
endif

!        Compute least-squares solution by backsubstitution in R*B = C
CALL DTRSV('Upper','No transpose','Non-Unit',K,A,LDA,B(1,1),1)

!           Set the unused elements of the I-th solution vector to zero
B(K+1:N,1)=ZERO
60    CONTINUE
!        Unscramble the least-squares solution stored in B
DO 100 I = 1, N
X(JPVT(I),1) = B(I,1)
100		CONTINUE

! now estimate errors

! we need inverse of R

! DGETRF computes an LU factorization of a general M-by-N matrix A
! using partial pivoting with row interchanges.
call DGETRF(n, n, R, n, ipiv, info)
if (info /= 0) then
!stop 'Matrix is numerically singular!'
return
end if

! DGETRI computes the inverse of a matrix using the LU factorization
! computed by DGETRF.
call DGETRI(n, R, n, ipiv, work, n, info)
if (info/=0) then
return
endif
! in R is now (R^T*R)^(-1)

! evaluate A*X
call DGEMM('None','None',m,1,n,1.D0,Acopy,m,X,n,0.D0,Bout,m)
! calculate error of fit
Bout=Bout-Bcopy
err=dot_product(Bout,Bout)
sigma=dsqrt(err/(m-n))
!        Unscramble the R
DO I = 1, N
Xerr(JPVT(I),1) = DSQRT(R(I,I))*sigma
enddo
return
end subroutine