Linear systems - Fundamentals of Numerical Computation

We now attend to the central problem of this chapter: Given a square, $n\times n$ matrix $\mathbf{A}$ and an $n$ -vector $\mathbf{b}$ , find an $n$ -vector $\mathbf{x}$ such that $\mathbf{A}\mathbf{x}=\mathbf{b}$ . Writing out these equations, we obtain

\begin{split} A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= b_1, \\ A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= b_2, \\ \vdots \\ A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= b_n. \end{split}

(2.3.1)

If $\mathbf{A}$ is invertible, then the mathematical expression of the solution is $\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}$ because

\begin{split} \mathbf{A}^{-1}\mathbf{b} = \mathbf{A}^{-1} (\mathbf{A} \mathbf{x}) = (\mathbf{A}^{-1}\mathbf{A}) \mathbf{x} = \mathbf{I} \mathbf{x} = \mathbf{x}. \end{split}

(2.3.2)

When $\mathbf{A}$ is singular, then $\mathbf{A}\mathbf{x}=\mathbf{b}$ may have no solution or infinitely many solutions.
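One way to tell the two singular cases apart is to compare ranks: the system is consistent exactly when $\mathbf{A}$ and the augmented matrix $[\mathbf{A}\;\mathbf{b}]$ have the same rank. The sketch below illustrates this on a small singular matrix chosen only for demonstration. The helper name `classify` is hypothetical, and NumPy's `matrix_rank` decides rank with a roundoff tolerance, so it should be read as a heuristic rather than a proof.

```python
import numpy as np

def classify(A, b):
    """Classify a square system Ax = b by comparing rank(A) with rank([A | b])."""
    n = A.shape[0]
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_A == n:
        return "unique solution"
    if rank_A == rank_Ab:
        return "infinitely many solutions"
    return "no solution"

# Illustrative singular matrix: the second row is twice the first.
A = np.array([[1.0, 2.0], [2.0, 4.0]])

print(classify(A, np.array([1.0, 2.0])))          # b respects the row dependence
print(classify(A, np.array([1.0, 3.0])))          # b contradicts the row dependence
print(classify(np.eye(2), np.array([1.0, 3.0])))  # invertible matrix
```

With the first right-hand side the two equations say the same thing, so any point on a line solves the system; with the second they contradict each other.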

2.3.1 Don’t use the inverse

Matrix inverses are indispensable for mathematical discussion and derivations. However, as you may remember from a linear algebra course, they are not trivial to compute from the entries of the original matrix. You might be surprised to learn that matrix inverses play almost no role in scientific computing.

In fact, when we encounter an expression such as $\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}$ in computing, we interpret it as “solve the linear system $\mathbf{A} \mathbf{x} = \mathbf{b}$ ” and apply whatever algorithm is most expedient based on what we know about $\mathbf{A}$ .
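As a rough illustration of this advice, the following sketch solves the same system twice with NumPy: once with `numpy.linalg.solve` and once by forming the inverse explicitly. The random test matrix, its size, and the seed are assumptions made only for the demonstration. Both routes should agree up to roundoff; the point is that the solver never needs the inverse, which costs extra work to form.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed so the run is repeatable
n = 50
A = rng.standard_normal((n, n))       # arbitrary test matrix (an assumption)
x_true = np.ones(n)
b = A @ x_true                        # right-hand side with a known solution

x_solve = np.linalg.solve(A, b)       # solve Ax = b without forming an inverse
x_inv = np.linalg.inv(A) @ b          # form the inverse, then multiply

print("error via solve:", np.linalg.norm(x_solve - x_true))
print("error via inv:  ", np.linalg.norm(x_inv - x_true))
```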

Julia

MATLAB

Python

As demonstrated in Example 2.1.1, the backslash (the \ symbol, not to be confused with the slash / used in web addresses) invokes a linear system solution.

Example 2.3.2 (Solving linear systems)

Julia

MATLAB

Python

Example 2.3.2

For a square matrix $\mathbf{A}$ , the syntax A \ b is mathematically equivalent to $\mathbf{A}^{-1} \mathbf{b}$ .

A = [1 0 -1; 2 2 1; -1 -3 0]

3×3 Matrix{Int64}:
  1   0  -1
  2   2   1
 -1  -3   0

b = [1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

x = A \ b

3-element Vector{Float64}:
  2.1428571428571432
 -1.7142857142857144
  1.1428571428571428

One way to check the answer is to compute a quantity known as the residual. It is (ideally) close to machine precision (relative to the elements in the data).

residual = b - A * x

3-element Vector{Float64}:
 -4.440892098500626e-16
 -4.440892098500626e-16
  0.0

If the matrix $\mathbf{A}$ is singular, you may get an error.

A = [0 1; 0 0]
b = [1, -1]
x = A \ b    # throws an error

SingularException(2)

Stacktrace:
 [1] generic_trimatdiv!(C::Vector{Float64}, uploc::Char, isunitc::Char, tfun::typeof(identity), A::Matrix{Int64}, B::Vector{Int64})
   @ LinearAlgebra ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/triangular.jl:1388
 [2] _ldiv!
   @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/triangular.jl:966 [inlined]
 [3] ldiv!
   @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/triangular.jl:959 [inlined]
 [4] \
   @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/triangular.jl:1721 [inlined]
 [5] \(A::Matrix{Int64}, B::Vector{Int64})
   @ LinearAlgebra ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/generic.jl:1130
 [6] top-level scope
   @ In[41]:3

In this case we can check that the rank of $\mathbf{A}$ is less than its number of columns, indicating singularity.

rank(A)

1

A linear system with a singular matrix might have no solution or infinitely many solutions, but in either case, backslash will fail. Moreover, detecting singularity is a lot like checking whether two floating-point numbers are exactly equal: because of roundoff, it could be missed. In Conditioning of linear systems we’ll find a robust way to fully describe this situation.

Example 2.3.2

For a square matrix $\mathbf{A}$ , the syntax A \ b is mathematically equivalent to $\mathbf{A}^{-1} \mathbf{b}$ .

A = [1 0 -1; 2 2 1; -1 -3 0]

b = [1; 2; 3]

x = A \ b

One way to check the answer is to compute a quantity known as the residual. It is (ideally) close to machine precision (relative to the elements in the data).

residual = b - A*x

If the matrix $\mathbf{A}$ is singular, you may get a warning and a nonsense result.

A = [0 1; 0 0]
b = [1; -1]
x = A \ b

In this case, we can check that the rank of $\mathbf{A}$ is less than its number of columns, indicating singularity.

rank(A)

Example 2.3.2

For a square matrix $\mathbf{A}$, the command solve(A, b) from scipy.linalg is mathematically equivalent to $\mathbf{A}^{-1} \mathbf{b}$.

from numpy import array
A = array([[1, 0, -1], [2, 2, 1], [-1, -3, 0]])
b = array([1, 2, 3])

from scipy import linalg
x = linalg.solve(A, b)
print(x)

[ 2.14285714 -1.71428571  1.14285714]

One way to check the answer is to compute a quantity known as the residual. It is (ideally) close to machine precision (relative to the elements in the data).

residual = b - A @ x
print(residual)

[-4.4408921e-16 -4.4408921e-16  0.0000000e+00]

If the matrix $\mathbf{A}$ is singular, you may get an error.

A = array([[0, 1], [0, 0]])
b = array([1, -1])
linalg.solve(A, b)    # error, singular matrix

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[41], line 3
      1 A = array([[0, 1], [0, 0]])
      2 b = array([1, -1])
----> 3 linalg.solve(A, b)    # error, singular matrix

File ~/miniforge3/envs/myst/lib/python3.13/site-packages/scipy/linalg/_basic.py:325, in solve(a, b, lower, overwrite_a, overwrite_b, check_finite, assume_a, transposed)
    323 elif assume_a in {'lower triangular', 'upper triangular'}:
    324     lower = assume_a == 'lower triangular'
--> 325     x, info = _solve_triangular(a1, b1, lower=lower, overwrite_b=overwrite_b,
    326                                 trans=transposed)
    327     _solve_check(n, info)
    328     _trcon = get_lapack_funcs(('trcon'), (a1, b1))

File ~/miniforge3/envs/myst/lib/python3.13/site-packages/scipy/linalg/_basic.py:519, in _solve_triangular(a1, b1, trans, lower, unit_diagonal, overwrite_b)
    517     return x, info
    518 if info > 0:
--> 519     raise LinAlgError("singular matrix: resolution failed at diagonal %d" %
    520                       (info-1))
    521 raise ValueError('illegal value in %dth argument of internal trtrs' %
    522                  (-info))

LinAlgError: singular matrix: resolution failed at diagonal 0

A linear system with a singular matrix might have no solution or infinitely many solutions, but in either case, a numerical solution becomes trickier. Detecting singularity is a lot like checking whether two floating-point numbers are exactly equal: because of roundoff, it could be missed. We’re headed toward a more robust way to fully describe this situation.
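To see how roundoff can hide singularity, consider the contrived sketch below. The two rows of the matrix would be identical in exact arithmetic, but `0.1 + 0.2` and `0.3` are different floating-point numbers, so the stored matrix is not exactly singular. The sketch uses `numpy.linalg.solve` (an assumption made to keep it dependency-light; other solvers may instead warn about ill-conditioning). The solver raises no error and returns an enormous, meaningless answer, while a tolerance-based rank test still reports a rank deficiency.

```python
import numpy as np

# In exact arithmetic both rows are [0.3, 1], so the matrix would be singular.
A = np.array([[0.1 + 0.2, 1.0],
              [0.3,       1.0]])
b = np.array([1.0, 2.0])

print(A[0, 0] == A[1, 0])            # the entries differ by one unit of roundoff

x = np.linalg.solve(A, b)            # no exception: no pivot is exactly zero
print(x)                             # entries are huge and not meaningful

print(np.linalg.matrix_rank(A))      # tolerance-based rank sees the deficiency
```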

2.3.2 Triangular systems

The solution process is especially easy to demonstrate for a system with a triangular matrix. For example, consider the lower triangular system

\begin{bmatrix} 4 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 \\ -1 & 0 & 3 & 0 \\ 1 & -1 & -1 & 2 \end{bmatrix} \mathbf{x} = \begin{bmatrix} 8 \\ 5 \\ 0 \\ 1 \end{bmatrix}.

(2.3.5)

The first row of this system states simply that $4x_1=8$ , which is easily solved as $x_1=8/4=2$ . Now, the second row states that $3x_1-x_2=5$ . As $x_1$ is already known, it can be replaced to find that $x_2 = -(5-3\cdot 2)=1$ . Similarly, the third row gives $x_3=(0+1\cdot 2)/3 = 2/3$ , and the last row yields $x_4=(1-1\cdot 2 + 1\cdot 1 + 1\cdot 2/3)/2 = 1/3$ . Hence, the solution is

\mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ 2/3 \\ 1/3 \end{bmatrix}.

(2.3.6)
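The hand computation can be double-checked by substituting the claimed solution back into the system. The sketch below does this in exact rational arithmetic with Python's `fractions` module, so no roundoff enters the comparison; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Matrix and right-hand side of the lower triangular example (2.3.5).
L = [[4, 0, 0, 0],
     [3, -1, 0, 0],
     [-1, 0, 3, 0],
     [1, -1, -1, 2]]
b = [8, 5, 0, 1]

# Claimed solution (2.3.6), stored as exact rationals.
x = [Fraction(2), Fraction(1), Fraction(2, 3), Fraction(1, 3)]

# Multiply L by x one row at a time and compare with b.
Lx = [sum(L[i][j] * x[j] for j in range(4)) for i in range(4)]
print(Lx == b)   # True
```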

The process just described is called forward substitution. In the $4\times 4$ lower triangular case of $\mathbf{L}\mathbf{x}=\mathbf{b}$ it leads to the formulas

\begin{split} x_1 &= \frac{b_1}{L_{11}}, \\ x_2 &= \frac{b_2 - L_{21}x_1}{L_{22}}, \\ x_3 &= \frac{b_3 - L_{31}x_1 - L_{32}x_2}{L_{33}}, \\ x_4 &= \frac{b_4 - L_{41}x_1 - L_{42}x_2 - L_{43}x_3}{L_{44}}. \end{split}

(2.3.7)

For upper triangular systems $\mathbf{U}\mathbf{x}=\mathbf{b}$ an analogous process of backward substitution begins by solving for the last component $x_n=b_n/U_{nn}$ and working backward. For the $4\times 4$ case we have

\begin{bmatrix} U_{11} & U_{12} & U_{13} & U_{14} \\ 0 & U_{22} & U_{23} & U_{24} \\ 0 & 0 & U_{33} & U_{34} \\ 0 & 0 & 0 & U_{44} \end{bmatrix} \mathbf{x} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}.

(2.3.8)

Solving the system backward, starting with $x_4$ first and then proceeding in descending order, gives

\begin{split} x_4 &= \frac{b_4}{U_{44}}, \\ x_3 &= \frac{b_3 - U_{34}x_4}{U_{33}}, \\ x_2 &= \frac{b_2 - U_{23}x_3 - U_{24}x_4}{U_{22}}, \\ x_1 &= \frac{b_1 - U_{12}x_2 - U_{13}x_3 - U_{14}x_4}{U_{11}}. \end{split}

(2.3.9)

It should be clear that forward or backward substitution fails if and only if one of the diagonal entries of the system matrix is zero. We have essentially proved the following theorem: a triangular matrix is singular if and only if at least one of its diagonal entries is zero.
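The role of the diagonal can also be seen through the determinant, which for a triangular matrix is the product of the diagonal entries. The short sketch below, using an upper triangular matrix made up for illustration, shows a zero diagonal entry producing a zero determinant and a rank below the matrix size.

```python
import numpy as np

# Illustrative upper triangular matrix with a zero in the (2,2) position.
U = np.array([[2.0, 1.0, -1.0],
              [0.0, 0.0,  3.0],
              [0.0, 0.0,  4.0]])

det_U = np.prod(np.diag(U))          # determinant of a triangular matrix
rank_U = np.linalg.matrix_rank(U)    # rank below 3 indicates singularity

print(det_U)    # 0.0
print(rank_U)   # 2
```

Backward substitution on this matrix would have to divide by the zero diagonal entry when solving for the second unknown.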

2.3.3 Implementation

Consider how to implement the sequential process implied by Equation (2.3.7). It seems clear that we want to loop through the elements of $\mathbf{x}$ in order. Within each iteration of that loop, we have an expression whose length depends on the iteration number. This leads to a nested loop structure.
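To make the nested structure explicit, here is a minimal sketch with both loops written out by hand. The name `forwardsub_loops` is hypothetical and is used only to keep this version distinct from the listings that follow, which express the inner loop as a sum or an inner product.

```python
import numpy as np

def forwardsub_loops(L, b):
    """Forward substitution with two explicit loops (illustrative sketch)."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):              # outer loop: one unknown per row
        s = 0.0
        for j in range(i):          # inner loop: terms using already-known x[j]
            s += L[i, j] * x[j]
        x[i] = (b[i] - s) / L[i, i]
    return x

# The lower triangular example (2.3.5) from the text.
L = np.array([[4.0, 0, 0, 0],
              [3, -1, 0, 0],
              [-1, 0, 3, 0],
              [1, -1, -1, 2]])
b = np.array([8.0, 5, 0, 1])
print(forwardsub_loops(L, b))
```

The inner loop runs zero times for the first row, so that row needs no special case.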

Algorithm 2.3.1 (forwardsub)

Julia

MATLAB

Python

Forward substitution

forwardsub.jl

"""
    forwardsub(L,b)

Solve the lower-triangular linear system with matrix `L` and
right-hand side vector `b`.
"""
function forwardsub(L, b)
    n = size(L, 1)
    x = zeros(n)
    x[1] = b[1] / L[1, 1]
    for i in 2:n
        s = sum(L[i, j] * x[j] for j in 1:i-1)
        x[i] = (b[i] - s) / L[i, i]
    end
    return x
end

Forward substitution

forwardsub.m

function x = forwardsub(L, b)
    % FORWARDSUB   Solve a lower triangular linear system.
    % Input:
    %   L    lower triangular square matrix (n by n)
    %   b    right-hand side vector (n by 1)   
    % Output:
    %   x    solution of Lx = b (n by 1 vector)

    n = length(L);
    x = zeros(n, 1);
    for i = 1:n
        x(i) = (b(i) - L(i, 1:i-1) * x(1:i-1)) / L(i, i);
    end
end

Forward substitution

forwardsub.py

import numpy as np

def forwardsub(L, b):
    """
    forwardsub(L, b)

    Solve the lower-triangular linear system with matrix L and right-hand side
    vector b.
    """
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # Inner product of row i with the entries of x already computed
        s = L[i, :i] @ x[:i]
        x[i] = (b[i] - s) / L[i, i]
    return x

The implementation of backward substitution is much like forward substitution and is given in Function 2.3.2.

Algorithm 2.3.2 (backsub)

Julia

MATLAB

Python

Backward substitution

backsub.jl

"""
    backsub(U,b)

Solve the upper-triangular linear system with matrix `U` and
right-hand side vector `b`.
"""
function backsub(U, b)
    n = size(U, 1)
    x = zeros(n)
    x[n] = b[n] / U[n, n]
    for i in n-1:-1:1
        s = sum(U[i, j] * x[j] for j in i+1:n)
        x[i] = (b[i] - s) / U[i, i]
    end
    return x
end

Backward substitution

backsub.m

function x = backsub(U, b)
    % BACKSUB   Solve an upper triangular linear system.
    % Input:
    %   U    upper triangular square matrix (n by n)
    %   b    right-hand side vector (n by 1)   
    % Output:
    %   x    solution of Ux = b (n by 1 vector)

    n = length(U);
    x = zeros(n, 1);
    for i = n:-1:1
        x(i) = ( b(i) - U(i, i+1:n) * x(i+1:n) ) / U(i, i);
    end
end

Backward substitution

backsub.py

import numpy as np

def backsub(U, b):
    """
    backsub(U, b)

    Solve the upper-triangular linear system with matrix U and right-hand side
    vector b.
    """
    n = len(b)
    x = np.zeros(n)
    for i in range(n-1, -1, -1):
        # Inner product of row i with the entries of x already computed
        s = U[i, i+1:] @ x[i+1:]
        x[i] = (b[i] - s) / U[i, i]
    return x

Example 2.3.3 (Triangular systems of equations)

Julia

MATLAB

Python

Example 2.3.3

It’s easy to get just the lower triangular part of any matrix using the tril function.

A = rand(1.:9., 5, 5)
L = tril(A)

5×5 Matrix{Float64}:
 8.0  0.0  0.0  0.0  0.0
 1.0  3.0  0.0  0.0  0.0
 4.0  3.0  5.0  0.0  0.0
 8.0  8.0  1.0  3.0  0.0
 9.0  4.0  6.0  5.0  9.0

We’ll set up and solve a linear system with this matrix.

b = ones(5)
x = FNC.forwardsub(L,b)

5-element Vector{Float64}:
  0.125
  0.2916666666666667
 -0.075
 -0.7527777777777778
  0.3246913580246913

It’s not clear how accurate this answer is. However, the residual should be zero or comparable to $\epsilon_\text{mach}$.

b - L * x

5-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0
 2.220446049250313e-16

Next we’ll engineer a problem to which we know the exact answer. Use \alpha Tab and \beta Tab to get the Greek letters.

α = 0.3;
β = 2.2;
U = diagm( 0=>ones(5), 1=>[-1, -1, -1, -1] )
U[1, [4, 5]] = [ α - β, β ]
U

5×5 Matrix{Float64}:
 1.0  -1.0   0.0  -1.9   2.2
 0.0   1.0  -1.0   0.0   0.0
 0.0   0.0   1.0  -1.0   0.0
 0.0   0.0   0.0   1.0  -1.0
 0.0   0.0   0.0   0.0   1.0

x_exact = ones(5)
b = [α, 0, 0, 0, 1]

5-element Vector{Float64}:
 0.3
 0.0
 0.0
 0.0
 1.0

Now we use backward substitution to solve for $\mathbf{x}$ , and compare to the exact solution we know already.

x = FNC.backsub(U,b)
err = x - x_exact

5-element Vector{Float64}:
 2.220446049250313e-16
 0.0
 0.0
 0.0
 0.0

Everything seems OK here. But another example, with a different value for $\beta$ , is more troubling.

α = 0.3;
β = 1e12;
U = diagm( 0=>ones(5), 1=>[-1, -1, -1, -1] )
U[1, [4, 5]] = [ α - β, β ]
b = [α, 0, 0, 0, 1]

x = FNC.backsub(U,b)
err = x - x_exact

5-element Vector{Float64}:
 -4.882812499995559e-5
  0.0
  0.0
  0.0
  0.0

It’s not so good to get 4 digits of accuracy after starting with 16! The source of the error is not hard to track down. Solving for $x_1$ performs $(\alpha-\beta)+\beta$ in the first row. Since $|\alpha|$ is so much smaller than $|\beta|$, this is a recipe for losing digits to subtractive cancellation.

Example 2.3.3

It’s easy to get just the lower triangular part of any matrix using the tril function.

A = randi(9, 5, 5);
L = tril(A)

We’ll set up and solve a linear system with this matrix.

b = ones(5, 1);
x = forwardsub(L, b)

It’s not clear how accurate this answer is. However, the residual should be zero or comparable to $\epsilon_\text{mach}$.

b - L * x

Next, we’ll engineer a problem to which we know the exact answer.

alpha = 0.3;
beta = 2.2;
U = eye(5) + diag([-1 -1 -1 -1], 1);
U(1, [4, 5]) = [alpha - beta, beta]

x_exact = ones(5, 1);
b = [alpha; 0; 0; 0; 1];

Now we use backward substitution to solve for $\mathbf{x}$ , and compare to the exact solution we know already.

x = backsub(U, b);
err = x - x_exact

Everything seems OK here. But another example, with a different value for $\beta$ , is more troubling.

alpha = 0.3;
beta = 1e12;
U = eye(5) + diag([-1 -1 -1 -1], 1);
U(1, [4, 5]) = [alpha - beta, beta];
b = [alpha; 0; 0; 0; 1];

x = backsub(U, b);
err = x - x_exact

It’s not so good to get 4 digits of accuracy after starting with 16! The source of the error is not hard to track down. Solving for $x_1$ performs $(\alpha-\beta)+\beta$ in the first row. Since $|\alpha|$ is so much smaller than $|\beta|$, this is a recipe for losing digits to subtractive cancellation.

Example 2.3.3

It’s easy to get just the lower triangular part of any matrix using the tril function.

from numpy import array, diag, floor, ones, random, tril
A = 1 + floor(9 * random.rand(5, 5))
L = tril(A)
print(L)

[[9. 0. 0. 0. 0.]
 [1. 7. 0. 0. 0.]
 [4. 5. 6. 0. 0.]
 [8. 9. 3. 8. 0.]
 [4. 2. 9. 7. 8.]]

We’ll set up and solve a linear system with this matrix.

b = ones(5)
x = FNC.forwardsub(L, b)
print(x)

[ 0.11111111  0.12698413 -0.01322751 -0.12400794  0.16108631]

It’s not clear how accurate this answer is. However, the residual should be zero or comparable to $\epsilon_\text{mach}$.

b - L @ x

array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       1.11022302e-16])

Next we’ll engineer a problem to which we know the exact answer.

alpha = 0.3;
beta = 2.2;
U = diag(ones(5)) + diag([-1, -1, -1, -1], k=1)
U[0, 3:5] = [ alpha - beta, beta ]
print(U)

[[ 1.  -1.   0.  -1.9  2.2]
 [ 0.   1.  -1.   0.   0. ]
 [ 0.   0.   1.  -1.   0. ]
 [ 0.   0.   0.   1.  -1. ]
 [ 0.   0.   0.   0.   1. ]]

x_exact = ones(5)
b = array([alpha, 0, 0, 0, 1])
x = FNC.backsub(U, b)
print("error:", x - x_exact)

error: [2.22044605e-16 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00]

Everything seems OK here. But another example, with a different value for $\beta$ , is more troubling.

alpha = 0.3;
beta = 1e12;
U = diag(ones(5)) + diag([-1, -1, -1, -1], k=1)
U[0, 3:5] = [ alpha - beta, beta ]
b = array([alpha, 0, 0, 0, 1])

x = FNC.backsub(U, b)
print("error:", x - x_exact)

error: [-4.8828125e-05  0.0000000e+00  0.0000000e+00  0.0000000e+00
  0.0000000e+00]

It’s not so good to get 4 digits of accuracy after starting with 16! But the source of the error is not hard to track down. Solving for $x_1$ performs $(\alpha-\beta)+\beta$ in the first row. Since $|\alpha|$ is so much smaller than $|\beta|$, this is a recipe for losing digits to subtractive cancellation.
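The cancellation can be isolated from the linear system entirely. The sketch below evaluates $(\alpha-\beta)+\beta$ on its own in double precision, using the same values of $\alpha$ and $\beta$ as the example; the variable name `recovered` is chosen here for illustration.

```python
alpha = 0.3
beta = 1e12

# In exact arithmetic this equals alpha. In double precision, alpha - beta is
# first rounded to the spacing of floats near 1e12 (about 1.2e-4), and that
# rounding error survives when beta is added back.
recovered = (alpha - beta) + beta

print(recovered)                        # 0.300048828125
print(abs(recovered - alpha) / alpha)   # relative error of roughly 1.6e-4
```

The absolute error of about $4.9\times 10^{-5}$ matches the error observed in $x_1$ above.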

Example 2.3.3 is our first clue that linear system problems may have large condition numbers, making inaccurate solutions inevitable in floating-point arithmetic. We will learn how to spot such problems in Conditioning of linear systems. Before reaching that point, however, we need to discuss how to solve general linear systems, not just triangular ones.

2.3.4 Exercises

Exercise 2.3.5

Suppose a string is stretched with tension $\tau$ horizontally between two anchors at $x=0$ and $x=1$. At each of the $n-1$ equally spaced positions $x_k=k/n$, $k=1,\ldots,n-1$, we attach a little mass $m_k$ and allow the string to come to equilibrium. This causes vertical displacement of the string. Let $q_k$ be the amount of displacement at $x_k$. If the displacements are not too large, then an approximate force balance equation is

n \tau (q_k - q_{k-1}) + n\tau (q_k - q_{k+1}) = m_k g, \qquad k=1,\ldots,n-1,

where $g=-9.8$ m/s $^2$ is the acceleration due to gravity, and we define $q_0=0$ and $q_n=0$ due to the anchors. This defines a linear system for $q_1,\ldots,q_{n-1}$ .

(a) ✍ Show that the force balance equations can be written as a linear system $\mathbf{A}\mathbf{q}=\mathbf{f}$ , where $\mathbf{q}$ is a vector of the unknown displacements and $\mathbf{A}$ is a tridiagonal matrix (i.e., $A_{ij}=0$ if $|i-j|>1$ ) of size $(n-1)\times(n-1)$ .

(b) ⌨ Let $\tau=10$ N, and $m_k=(1/10n)$ kg for every $k$ . Using backslash, find the displacements when $n=8$ and $n=40$ , and superimpose plots of $\mathbf{q}$ over $0\le x \le 1$ for the two cases. (Be sure to include the zero values at $x=0$ and $x=1$ in your plots.)

(c) ⌨ Repeat (b) for the case $m_k = (k/5n^2)$ kg.

Exercise 2.3.6

⌨ If $\mathbf{B}\in\mathbb{R}^{n \times p}$ has columns $\mathbf{b}_1,\ldots,\mathbf{b}_p$ , then we can pose $p$ linear systems at once by writing $\mathbf{A} \mathbf{X} = \mathbf{B}$ , where $\mathbf{X}$ is $n\times p$ . Specifically, this equation implies $\mathbf{A} \mathbf{x}_j = \mathbf{b}_j$ for $j=1,\ldots,p$ .

(a) Modify Function 2.3.1 and Function 2.3.2 so that they solve the case where the second input is $n\times p$ for $p\ge 1$ .

(b) If $\mathbf{A} \mathbf{X}=\mathbf{I}$ , then $\mathbf{X}=\mathbf{A}^{-1}$ . Use this fact to write a function ltinverse that uses your modified forwardsub to compute the inverse of a lower triangular matrix. Test your function on at least two nontrivial matrices. (We remind you here that this is just an exercise; matrix inverses are rarely a good idea in numerical practice!)
