
2.9 Exploiting matrix structure

A common situation in computation is that a problem has certain properties or structure that can be used to get a faster or more accurate solution. There are many properties of a matrix that can affect LU factorization. Here is an easy one to check: $\mathbf{A}$ is strictly diagonally dominant if $|A_{ii}| > \sum_{j\neq i} |A_{ij}|$ in every row $i$.

Diagonal dominance has a computationally useful implication: it can be shown that a strictly diagonally dominant matrix is invertible and that its LU factorization can be computed stably without any row pivoting.
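Here is a minimal Julia sketch of that claim; the matrix construction is illustrative, chosen only so that the diagonal dominates each row.

```julia
using LinearAlgebra

n = 500
A = randn(n, n)
# Inflate the diagonal so |A[i,i]| exceeds the sum of the magnitudes of the
# other entries in row i, making A strictly diagonally dominant.
A += Diagonal(2 .* vec(sum(abs, A, dims=2)))

# Factor with pivoting disabled; dominance keeps the elimination well behaved.
F = lu(A, NoPivot())
@show norm(A - F.L * F.U) / norm(A)    # small relative residual
```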

We next consider three other types of matrices that cause the LU factorization to be specialized in some important way.

2.9.1 Banded matrices

A matrix $\mathbf{A}$ has upper bandwidth $p$ if $A_{ij}=0$ whenever $j-i>p$, and lower bandwidth $q$ if $A_{ij}=0$ whenever $i-j>q$. If row pivoting is not used, the $\mathbf{L}$ and $\mathbf{U}$ factors preserve the lower and upper bandwidths of $\mathbf{A}$. This fact implies computational savings in both the factorization and the triangular substitutions, because the zeros appear predictably and we can skip multiplications and additions with them.
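For instance, the factors of a tridiagonal matrix (lower and upper bandwidth both 1) remain tridiagonal when pivoting is turned off, as this brief Julia sketch suggests:

```julia
using LinearAlgebra

n = 6
# Tridiagonal matrix: lower and upper bandwidth both equal to 1.
A = diagm(0 => 2*ones(n), 1 => -ones(n-1), -1 => -ones(n-1))

F = lu(A, NoPivot())        # no row pivoting, so the band structure survives
@show norm(tril(F.L, -2))   # ≈ 0: L has lower bandwidth 1
@show norm(triu(F.U, 2))    # ≈ 0: U has upper bandwidth 1
```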

In order to exploit the savings offered by sparsity, we would need to make modifications to Function 2.4.1 and the triangular substitution routines. Alternatively, we can get Julia to take advantage of the structure automatically by converting the matrix into a special type called sparse. Sparse matrices are covered in more detail in Chapters 7 and 8.
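As a minimal sketch of the sparse approach (using the standard-library SparseArrays package), a large tridiagonal system can be built and solved directly in sparse form; the dense equivalent would need $O(n^2)$ storage and $O(n^3)$ work.

```julia
using SparseArrays, LinearAlgebra

n = 100_000
# Build the tridiagonal matrix directly in sparse storage from its diagonals.
A = spdiagm(0 => 2*ones(n), 1 => -ones(n-1), -1 => -ones(n-1))
b = ones(n)

# Backslash dispatches on the sparse type and exploits the sparsity pattern.
x = A \ b
@show norm(A * x - b)
```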

2.9.2 Symmetric matrices

Recall from Definition 2.2.2 that a symmetric matrix is a square matrix $\mathbf{A}$ satisfying $\mathbf{A}^T = \mathbf{A}$. Symmetric matrices arise frequently in applications because many types of interactions, such as gravitation and social-network befriending, are inherently symmetric. Symmetry in linear algebra simplifies many properties and algorithms. As a rule of thumb, if your matrix has symmetry, you want to exploit and preserve it.

In $\mathbf{A}=\mathbf{L}\mathbf{U}$ we arbitrarily required the diagonal elements of $\mathbf{L}$, but not $\mathbf{U}$, to be one. That breaks symmetry, so we need to modify the goal to

$$\mathbf{A}=\mathbf{L}\mathbf{D}\mathbf{L}^T,$$

where $\mathbf{L}$ is unit lower triangular and $\mathbf{D}$ is diagonal. To find an algorithm for this factorization, we begin by generalizing (2.4.4) a bit without furnishing proof.

Let’s derive the $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization for a small example.
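The worked derivation is omitted here, but the same elimination idea can be sketched in Julia; `ldlt_factor` below is a hypothetical helper, not a library routine, and it assumes that no pivoting is required.

```julia
using LinearAlgebra

# Hypothetical helper: unpivoted LDLᵀ factorization by symmetric rank-1
# (outer product) elimination. Assumes no pivot entry vanishes.
function ldlt_factor(A)
    n = size(A, 1)
    S = float.(A)                     # workspace copy; A itself is untouched
    L = Matrix{Float64}(I, n, n)
    d = zeros(n)
    for k in 1:n
        d[k] = S[k, k]
        L[k+1:n, k] = S[k+1:n, k] / d[k]
        # Eliminate row/column k symmetrically from the trailing block.
        S[k+1:n, k+1:n] -= d[k] * (L[k+1:n, k] * L[k+1:n, k]')
    end
    return L, Diagonal(d)
end

A = [4. 2 2; 2 5 3; 2 3 6]            # small symmetric example
L, D = ldlt_factor(A)
@show L
@show D
@show norm(A - L * D * L')            # ≈ 0 confirms the factorization
```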

In practice, we don’t actually have to carry out any arithmetic in the upper triangle of $\mathbf{A}$ as we work, since the operations are always the mirror image of those in the lower triangle. As a result, it can be shown that $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization takes about half as much work as the standard LU.

Just as pivoting is necessary to stabilize LU factorization, the $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization without pivoting may be unstable or may even fail to exist. We won’t go into the details, because our interest is in specializing the factorization to matrices that also possess another important property.

2.9.3 Symmetric positive definite matrices

Suppose that $\mathbf{A}$ is $n\times n$ and $\mathbf{x}\in\mathbb{R}^n$. Observe that $\mathbf{x}^T\mathbf{A}\mathbf{x}$ is the product of $1\times n$, $n\times n$, and $n\times 1$ matrices, so it is a scalar, sometimes referred to as a quadratic form. It can be expressed as

$$\mathbf{x}^T\mathbf{A}\mathbf{x} = \sum_{i=1}^n \sum_{j=1}^n A_{ij}x_ix_j.$$

A symmetric matrix $\mathbf{A}$ is positive definite (SPD, for symmetric positive definite) if $\mathbf{x}^T\mathbf{A}\mathbf{x} > 0$ for every nonzero $\mathbf{x}$. The definiteness property is usually difficult to check directly from the definition. There are some equivalent conditions, though. For instance, a symmetric matrix is positive definite if and only if its eigenvalues are all real positive numbers. SPD matrices have important properties and appear in applications in which the definiteness is known for theoretical reasons.
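The eigenvalue criterion is easy to apply numerically, as in this quick Julia check on two small illustrative matrices:

```julia
using LinearAlgebra

A = [2. -1 0; -1 2 -1; 0 -1 2]
@show eigvals(Symmetric(A))    # all positive, so A is positive definite

B = [1. 2; 2 1]
@show eigvals(Symmetric(B))    # one negative eigenvalue: not definite
```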

Let us consider what definiteness means for the $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization. We compute

$$0 < \mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{x}^T \mathbf{L} \mathbf{D} \mathbf{L}^T \mathbf{x} = \mathbf{z}^T \mathbf{D} \mathbf{z},$$

where $\mathbf{z}=\mathbf{L}^T \mathbf{x}$. Note that since $\mathbf{L}$ is unit lower triangular, it is nonsingular, so $\mathbf{x}=\mathbf{L}^{-T}\mathbf{z}$. By taking $\mathbf{z}=\mathbf{e}_k$ for $k=1,\ldots,n$, we can read the equalities from right to left and conclude that $D_{kk}>0$ for all $k$. That permits us to write a kind of square root formula:[1]

$$\mathbf{D} = \begin{bmatrix} D_{11} & & & \\ & D_{22} & & \\ & & \ddots & \\ & & & D_{nn} \end{bmatrix} = \begin{bmatrix} \sqrt{D_{11}} & & & \\ & \sqrt{D_{22}} & & \\ & & \ddots & \\ & & & \sqrt{D_{nn}} \end{bmatrix}^{\,2} = \bigl( \mathbf{D}^{1/2} \bigr)^2.$$

Now we have $\mathbf{A}=\mathbf{L}\mathbf{D}^{1/2}\mathbf{D}^{1/2}\mathbf{L}^T = \mathbf{R}^T \mathbf{R}$, where $\mathbf{R} = \mathbf{D}^{1/2}\mathbf{L}^T$ is an upper triangular matrix whose diagonal entries are positive. The factorization $\mathbf{A}=\mathbf{R}^T\mathbf{R}$ is known as the Cholesky factorization of $\mathbf{A}$.
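For a small illustrative matrix whose $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factors can be computed by hand (and happen to be exact in floating point), we can confirm that $\mathbf{R}=\mathbf{D}^{1/2}\mathbf{L}^T$ agrees with the factor returned by Julia’s `cholesky`:

```julia
using LinearAlgebra

A = [4. 2 2; 2 5 3; 2 3 6]
# Hand-derived LDLᵀ factors of this particular A:
L = [1 0 0; 0.5 1 0; 0.5 0.5 1]
D = Diagonal([4.0, 4.0, 4.0])
@show norm(A - L * D * L')        # ≈ 0: confirms A = L D Lᵀ

R = sqrt(D) * L'                  # R = D^{1/2} Lᵀ: upper triangular, positive diagonal
@show norm(A - R' * R)            # ≈ 0: hence A = RᵀR
@show norm(R - cholesky(A).U)     # ≈ 0: matches Julia's Cholesky factor
```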

While the unpivoted $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization is not stable and does not even always exist, in the SPD case one can prove that pivoting is needed neither for the existence nor for the stability of the Cholesky factorization.

The speed and stability of the Cholesky factorization make it the top choice for solving linear systems with SPD matrices. As a side benefit, the Cholesky algorithm fails (in the form of an imaginary square root or division by zero) if and only if the matrix $\mathbf{A}$ is not positive definite. This is often the best way to test the definiteness of a symmetric matrix about which nothing else is known.
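In Julia this test is built in: `cholesky` throws a `PosDefException` on an indefinite input, and `isposdef` wraps exactly that check. The matrices below are illustrative.

```julia
using LinearAlgebra

A = [4. 2 2; 2 5 3; 2 3 6]         # SPD
B = [1. 2; 2 1]                    # symmetric but indefinite

@show isposdef(A)                  # true: the Cholesky factorization succeeds
@show isposdef(B)                  # false: it breaks down

# The failure can also be observed directly without an exception:
F = cholesky(Symmetric(B); check=false)
@show issuccess(F)                 # false
```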

2.9.4 Exercises

Footnotes
  1. Except for this diagonal, positive definite case, it’s not trivial to define the square root of a matrix, so don’t generalize the notation used here.