from numpy import *
from scipy import linalg
from scipy.linalg import norm
from matplotlib.pyplot import *
from prettytable import PrettyTable
from timeit import default_timer as timer
import sys
sys.path.append('fncbook/')
import fncbook as FNC
# This (optional) block is for improving the display of plots.
# from IPython.display import set_matplotlib_formats
# set_matplotlib_formats("svg","pdf")
# %config InlineBackend.figure_format = 'svg'
rcParams["figure.figsize"] = [7, 4]
rcParams["lines.linewidth"] = 2
rcParams["lines.markersize"] = 4
rcParams['animation.html'] = "jshtml" # or try "html5"
As in the rest of this chapter, we will be using the 2-norm exclusively.
Equation (3.3.2) is the key to the computational attractiveness of orthogonality. Figure 3.3.1 shows how nonorthogonal vectors can allow a multidimensional version of subtractive cancellation, in which ∥x−y∥ is much smaller than ∥x∥ and ∥y∥. As the figure illustrates, orthogonal vectors do not allow this phenomenon. By (3.3.2), the magnitude of a vector difference or sum is larger than the magnitudes of the original vectors.
Figure 3.3.1:Nonorthogonal vectors can cause cancellation when subtracted, but orthogonal vectors never do.
Statements about orthogonal vectors are often made more easily in matrix form. Let Q be an n×k matrix whose columns q1,…,qk are orthogonal vectors. The orthogonality conditions (3.3.1) become simply that QTQ is a diagonal matrix, since
If the columns of Q are orthonormal, then QTQ is the k×k identity matrix. This is such an important property that we will break with common practice here and give this type of matrix a name.
Now we come to another important way to factor a matrix: the QR factorization. As we will show below, the QR factorization plays a role in linear least squares analogous to the role of LU factorization in linear systems.
In most introductory books on linear algebra, the QR factorization is derived through a process known as Gram–Schmidt orthogonalization. However, while it is an important tool for theoretical work, the Gram–Schmidt process is numerically unstable. We will consider an alternative construction in Computing QR factorizations.
When m is much larger than n, which is often the case, there is a compressed form of the factorization that is more efficient. In the product
In order to have the normal equations be well posed, we require that A is not rank-deficient (as proved in Theorem 3.2.2). This is enough to guarantee that R^ is nonsingular (see Exercise 3.3.4). Therefore, its transpose is nonsingular as well, and we arrive at
This algorithm is implemented in Function 3.3.2. It is essentially the algorithm used internally by Julia when A\b is called. The execution time is dominated by the factorization, the most common method for which is described in Computing QR factorizations.
The solution of least-squares problems via QR factorization is more stable than when the normal equations are formulated and solved directly.
Confusingly, a square matrix whose columns are orthogonal is not necessarily an orthogonal matrix; the columns must be orthonormal, which is a stricter condition.