
LU factorization

A major tool in numerical linear algebra is to factor a given matrix into terms that are individually easier to deal with than the original. In this section we derive a means to express a square matrix using triangular factors, which will allow us to solve a linear system using forward and backward substitution.

2.4.1 Outer products

Our derivation of the factorization hinges on an expression of matrix products in terms of vector outer products. If $\mathbf{u}\in\mathbb{R}^m$ and $\mathbf{v}\in\mathbb{R}^n$, then the outer product of these vectors is the $m\times n$ matrix

$$
\mathbf{u} \mathbf{v}^T = \begin{bmatrix} u_1 v_1 & u_1 v_2 & \cdots & u_1 v_n \\ u_2 v_1 & u_2 v_2 & \cdots & u_2 v_n \\ \vdots & \vdots & & \vdots \\ u_m v_1 & u_m v_2 & \cdots & u_m v_n \end{bmatrix}.
$$
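As a quick numerical illustration (a sketch using NumPy, which the text itself does not use), the outer product of a vector in $\mathbb{R}^3$ with a vector in $\mathbb{R}^2$ is a $3\times 2$ matrix:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # u in R^3
v = np.array([4.0, 5.0])        # v in R^2

# Outer product u v^T: the (i, j) entry is u_i * v_j
A = np.outer(u, v)
print(A.shape)   # (3, 2)
```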

We illustrate the connection of outer products to matrix multiplication by a small example.

It is not hard to derive the following generalization of Example 2.4.1 to all matrix products.
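The generalization expresses a matrix product $\mathbf{A}\mathbf{B}$ as the sum of outer products of the columns of $\mathbf{A}$ with the corresponding rows of $\mathbf{B}$. A minimal NumPy check of this identity (the array sizes and random entries here are our own arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))

# Sum of outer products: column k of A times row k of B, over all k
C = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
```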

2.4.2 Triangular product

Equation (2.4.4) has some interesting structure for the product $\mathbf{L}\mathbf{U}$, where $\mathbf{L}$ is $n\times n$ and lower triangular (i.e., zero above the main diagonal) and $\mathbf{U}$ is $n\times n$ and upper triangular (zero below the diagonal).

Let the columns of $\mathbf{L}$ be written as $\boldsymbol{\ell}_k$ and the rows of $\mathbf{U}$ be written as $\mathbf{u}_k^T$. Then the first row of $\mathbf{L}\mathbf{U}$ is

$$
\mathbf{e}_1^T \sum_{k=1}^n \boldsymbol{\ell}_k \mathbf{u}_k^T = \sum_{k=1}^n (\mathbf{e}_1^T \boldsymbol{\ell}_k) \mathbf{u}_k^T = L_{11} \mathbf{u}_1^T.
$$

Likewise, the first column of $\mathbf{L}\mathbf{U}$ is

$$
\left( \sum_{k=1}^n \boldsymbol{\ell}_k \mathbf{u}_k^T \right) \mathbf{e}_1 = \sum_{k=1}^n \boldsymbol{\ell}_k (\mathbf{u}_k^T \mathbf{e}_1) = U_{11}\boldsymbol{\ell}_1.
$$
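Both identities are easy to confirm numerically. A sketch using NumPy, with randomly generated triangular factors standing in for a concrete example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
L = np.tril(rng.standard_normal((n, n)))   # lower triangular
U = np.triu(rng.standard_normal((n, n)))   # upper triangular
P = L @ U

# First row of LU is L_{11} times the first row of U
assert np.allclose(P[0, :], L[0, 0] * U[0, :])
# First column of LU is U_{11} times the first column of L
assert np.allclose(P[:, 0], U[0, 0] * L[:, 0])
```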

These two calculations are enough to derive one of the most important algorithms in scientific computing.

2.4.3 Triangular factorization

Our goal is to factor a given $n\times n$ matrix $\mathbf{A}$ as the triangular product $\mathbf{A}=\mathbf{L}\mathbf{U}$. It turns out that we have $n^2+n$ total nonzero unknowns in the two triangular matrices, so we choose $n$ of them arbitrarily as follows.

We will require that L\mathbf{L} be a unit lower triangular matrix.

We have arrived at the linchpin of solving linear systems.

The outer product algorithm for LU factorization seen in Example 2.4.3 is coded as Function 2.4.1.
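For readers following along in Python rather than Julia, here is a sketch of the outer-product algorithm under the assumptions stated so far; the function name `lufact` and the details below are ours, and this is not the text's Function 2.4.1 itself. At step $k$, row $k$ of $\mathbf{U}$ and column $k$ of $\mathbf{L}$ are read off, and their outer product is subtracted from the working matrix:

```python
import numpy as np

def lufact(A):
    """LU factorization by outer products (no pivoting); sketch only.

    Assumes every pivot encountered is nonzero; as the text notes,
    the factorization without row swaps can fail to exist.
    """
    n = A.shape[0]
    Ak = np.array(A, dtype=float)   # working copy, reduced in place
    L = np.eye(n)                   # unit lower triangular
    U = np.zeros((n, n))
    for k in range(n):
        U[k, :] = Ak[k, :]                 # row k of U
        L[:, k] = Ak[:, k] / U[k, k]       # column k of L (diagonal entry 1)
        Ak -= np.outer(L[:, k], U[k, :])   # remove this term from the sum
    return L, U
```

Multiplying the returned factors reproduces the original matrix, which is the natural sanity check on the sketch.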

2.4.4 Gaussian elimination and linear systems

In your first matrix algebra course, you probably learned a triangularization technique called Gaussian elimination or row elimination to solve a linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$. In most presentations, you form an augmented matrix $[\mathbf{A}\;\mathbf{b}]$ and perform row operations until the system reaches upper triangular form, then finish with backward substitution. LU factorization is equivalent to Gaussian elimination in which no row swaps are performed, and the elimination procedure produces the factors if you keep track of the row multipliers appropriately.

Like Gaussian elimination, the primary use of LU factorization is to solve a linear system. It reduces a given linear system to two triangular ones. From this, solving $\mathbf{A}\mathbf{x}=\mathbf{b}$ follows immediately from associativity:

$$
\mathbf{b} = \mathbf{A} \mathbf{x} = (\mathbf{L} \mathbf{U}) \mathbf{x} = \mathbf{L} (\mathbf{U} \mathbf{x}).
$$

Defining $\mathbf{z} = \mathbf{U} \mathbf{x}$ leads to the following.
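The resulting two-stage solve, a forward substitution for $\mathbf{L}\mathbf{z}=\mathbf{b}$ followed by a backward substitution for $\mathbf{U}\mathbf{x}=\mathbf{z}$, can be sketched as follows; the helper names `forwardsub` and `backsub` are our own stand-ins for the text's substitution routines:

```python
import numpy as np

def forwardsub(L, b):
    """Solve Lz = b for lower triangular L by forward substitution."""
    n = len(b)
    z = np.zeros(n)
    for i in range(n):
        z[i] = (b[i] - L[i, :i] @ z[:i]) / L[i, i]
    return z

def backsub(U, z):
    """Solve Ux = z for upper triangular U by backward substitution."""
    n = len(z)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (z[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```

Given factors with $\mathbf{A}=\mathbf{L}\mathbf{U}$, the solution is then `x = backsub(U, forwardsub(L, b))`.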

A key advantage of the factorization point of view is that it depends only on the matrix $\mathbf{A}$. If systems are to be solved for a single $\mathbf{A}$ but multiple different versions of $\mathbf{b}$, then the factorization approach is more efficient, as we'll see in Efficiency of matrix computations.

As noted in the descriptions of Function 2.4.1 and Algorithm 2.4.2, the LU factorization as we have seen it so far is not stable for all matrices. In fact, it does not always even exist. The missing element is the row swapping allowed in Gaussian elimination. We will address these issues in Row pivoting.

2.4.5 Exercises