Skip to article frontmatterSkip to article content

Chapter 7

Examples

7.1 From matrix to insight

Example 7.1.4

Here we create an adjacency matrix for a graph on four nodes.

A = [0 1 0 0; 1 0 0 0; 1 1 0 1; 0 1 1 0]
4×4 Matrix{Int64}: 0 1 0 0 1 0 0 0 1 1 0 1 0 1 1 0

The graphplot function makes a visual representation of this graph.

using Plots, GraphRecipes
graphplot(A, names=1:4, markersize=0.2, arrow=6)
Loading...

Since this adjacency matrix is not symmetric, the edges are all directed, as indicated by the arrows. Here are the counts of all walks of length 3 in the graph:

A^3
4×4 Matrix{Int64}: 0 1 0 0 1 0 0 0 3 2 0 1 1 3 1 0

If the adjacency matrix is symmetric, the result is an undirected graph: all edges connect in both directions.

A = [0 1 1 0; 1 0 0 1; 1 0 0 0; 0 1 0 0]
graphplot(A, names=1:4, markersize=0.2)
Loading...
Example 7.1.5

The Images package has many functions for image manipulation, and TestImages has some standard images to play with.

using Images, TestImages
img = testimage("mandrill")
Loading...

The variable img is a matrix.

size(img)
(512, 512)

However, its entries are colors, not numbers.

img[100, 10]
Loading...

You can use eltype to find out the type of the elements of any array.

eltype(img)
RGB{N0f8}

It’s possible to extract matrices of red, green, and blue intensities, scaled from 0 to 1.

R, G, B = red.(img), green.(img), blue.(img);
@show minB, maxB = extrema(B);
(minB, maxB) = extrema(B) = (0.0N0f8, 1.0N0f8)

Or we can convert the pixels to gray, each pixel again scaled from 0 to 1.

Gray.(img)
Loading...

In order to do our usual operations, we need to tell Julia that we want to interpret the elements of the image matrix as floating-point values.

A = Float64.(Gray.(img))
A[1:4, 1:5]
4×5 Matrix{Float64}: 0.568627 0.219608 0.192157 0.34902 0.537255 0.454902 0.396078 0.156863 0.262745 0.352941 0.301961 0.447059 0.180392 0.180392 0.384314 0.278431 0.533333 0.372549 0.188235 0.32549

We can use Gray to reinterpret a matrix of floating-point values as grayscale pixels.

Gray.(reverse(A, dims=1))
Loading...

7.2 Eigenvalue decomposition

Example 7.2.2

The eigvals function returns a vector of the eigenvalues of a matrix.

A = π * ones(2, 2)
2×2 Matrix{Float64}: 3.14159 3.14159 3.14159 3.14159
λ = eigvals(A)
2-element Vector{Float64}: 0.0 6.283185307179586

If you want the eigenvectors as well, use eigen.

λ, V = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}} values: 2-element Vector{Float64}: 0.0 6.283185307179586 vectors: 2×2 Matrix{Float64}: -0.707107 0.707107 0.707107 0.707107
norm(A * V[:, 2] - λ[2] * V[:, 2])
0.0

Both functions allow you to sort the eigenvalues by specified criteria.

A = diagm(-2.3:1.7)
@show eigvals(A, sortby=real);
@show eigvals(A, sortby=abs);
eigvals(A, sortby = real) = [-2.3, -1.3, -0.3, 0.7, 1.7]

eigvals(A, sortby = abs) = 
[-0.3, 0.7, -1.3, 1.7, -2.3]

If the matrix is not diagonalizable, no message is given, but V will be singular. The robust way to detect that circumstance is via κ(V)\kappa(\mathbf{V}).

A = [-1 1; 0 -1]
λ, V = eigen(A)
Warning: Timed out waiting for `Base.active_repl_backend.ast_transforms` to become available. Autoloads will not work.
@ BasicAutoloads ~/.julia/packages/BasicAutoloads/08hIo/src/BasicAutoloads.jl:117
[ Info: If you have a slow startup file, consider moving `register_autoloads` to the end of it.
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}} values: 2-element Vector{Float64}: -1.0 -1.0 vectors: 2×2 Matrix{Float64}: 1.0 -1.0 0.0 2.22045e-16
cond(V)
9.007199254740991e15

Even in the nondiagonalizable case, AV=VD\mathbf{A}\mathbf{V} = \mathbf{V}\mathbf{D} holds.

opnorm(A * V - V * diagm(λ))
2.220446049250313e-16
Example 7.2.3

We first define a hermitian matrix. Note that the ' operation is the adjoint and includes complex conjugation.

n = 7
A = randn(n, n) + 1im * randn(n, n)
A = (A + A') / 2
7×7 Matrix{ComplexF64}: -1.41567+0.0im 0.259952+0.0330673im … -0.0599516-0.295516im 0.259952-0.0330673im 0.132139+0.0im -0.601534+0.879239im 0.075175-0.121319im -0.258324+1.22958im 0.411276-0.199403im 0.169925+0.384116im 1.09688-0.789896im 0.0573306-0.339173im -0.476698+0.360998im -0.763015+0.777213im 1.5254+0.982086im 1.07695+0.1078im 0.0467184-0.526594im … 0.168347+0.62283im -0.0599516+0.295516im -0.601534-0.879239im 1.16686+0.0im

We confirm that the matrix A\mathbf{A} is normal by checking that κ(V)=1\kappa(\mathbf{V}) = 1 (to within roundoff).

λ, V = eigen(A)
@show cond(V);
cond(V) = 1.0000000000000013

Now we perturb A\mathbf{A} and measure the effect on the eigenvalues. The Bauer–Fike theorem uses absolute differences, not relative ones.

ΔA = 1e-8 * normalize(randn(n, n) + 1im * randn(n, n))
λ̃ = eigvals(A + ΔA)
dist = minimum([abs(x - y) for x in λ̃, y in λ], dims=2)
7×1 Matrix{Float64}: 8.464841132570398e-10 1.1609513124907042e-9 5.447353210807529e-10 1.4721005415762032e-9 1.0100615823760982e-9 9.138103297967129e-10 1.3995224576690174e-9

As promised, the perturbations in the eigenvalues do not exceed the normwise perturbation to the original matrix.

Now we see what happens for a triangular matrix.

n = 20
x = 1:n
A = triu(x * ones(n)')
A[1:5, 1:5]
5×5 Matrix{Float64}: 1.0 1.0 1.0 1.0 1.0 0.0 2.0 2.0 2.0 2.0 0.0 0.0 3.0 3.0 3.0 0.0 0.0 0.0 4.0 4.0 0.0 0.0 0.0 0.0 5.0

This matrix is not especially close to normal.

λ, V = eigen(A)
@show cond(V);
cond(V) = 6.149906664929389e9

As a result, the eigenvalues can change by a good deal more.

ΔA = 1e-8 * normalize(randn(n, n) + 1im * randn(n, n))
λ̃ = eigvals(A + ΔA)
dist = minimum([abs(x - y) for x in λ̃, y in λ], dims=2)
BF_bound = cond(V) * norm(ΔA)
@show maximum(dist), BF_bound;
(maximum(dist), BF_bound) = (0.17446566048046158, 61.499066649293894)

If we plot the eigenvalues of many perturbations, we get a cloud of points that roughly represents all the possible eigenvalues when representing this matrix with single-precision accuracy.

using Plots
plt = scatter(λ, zeros(n), aspect_ratio=1)
for _ in 1:200
    ΔA = eps(Float32) * normalize(randn(n, n) + 1im * randn(n, n))
    λ̃ = eigvals(A + ΔA)
    scatter!(real(λ̃), imag(λ̃), m=1, color=:black)
end
plt
Loading...

The plot shows that some eigenvalues are much more affected than others. This situation is not unusual, but it is not explained by the Bauer–Fike theorem.

Example 7.2.4

Let’s start with a known set of eigenvalues and an orthogonal eigenvector basis.

D = diagm([-6, -1, 2, 4, 5])
V, R = qr(randn(5, 5))    # V is unitary
A = V * D * V'
5×5 Matrix{Float64}: -0.732868 2.28846 -1.1665 -1.18314 -3.82984 2.28846 1.16205 1.73426 2.5208 1.19941 -1.1665 1.73426 2.32097 -0.695172 1.47837 -1.18314 2.5208 -0.695172 1.31139 -0.605209 -3.82984 1.19941 1.47837 -0.605209 -0.061534
eigvals(A)
5-element Vector{Float64}: -5.999999999999998 -0.9999999999999996 2.0000000000000004 4.000000000000002 5.0

Now we will take the QR factorization and just reverse the factors.

Q, R = qr(A)
A = R * Q;

It turns out that this is a similarity transformation, so the eigenvalues are unchanged.

eigvals(A)
5-element Vector{Float64}: -6.000000000000001 -0.9999999999999999 1.9999999999999993 4.0000000000000036 4.999999999999995

What’s remarkable, and not elementary, is that if we repeat this transformation many times, the resulting matrix converges to D\mathbf{D}.

for k in 1:40
    Q, R = qr(A)
    A = R * Q
end
A
5×5 Matrix{Float64}: -6.0 0.0038263 9.92775e-6 -7.93295e-16 -1.39934e-15 0.0038263 4.99999 0.00249934 6.52215e-16 -2.56069e-17 9.92775e-6 0.00249934 4.00001 -3.98256e-13 9.06224e-16 2.07305e-19 1.09246e-15 -3.98413e-13 2.0 5.17128e-13 1.27344e-31 -1.24602e-27 5.43212e-25 5.17556e-13 -1.0

7.3 Singular value decomposition

Example 7.3.4

We verify some of the fundamental SVD properties using standard Julia functions from LinearAlgebra.

A = [i^j for i = 1:5, j = 0:3]
5×4 Matrix{Int64}: 1 1 1 1 1 2 4 8 1 3 9 27 1 4 16 64 1 5 25 125

To get only the singular values, use svdvals.

σ = svdvals(A)
4-element Vector{Float64}: 146.69715365883005 5.738569780953702 0.9998486640841027 0.11928082685241923

Here is verification of the connections between the singular values, norm, and condition number.

@show opnorm(A, 2);
@show σ[1];
opnorm(A, 2) = 146.69715365883005

σ[1] = 146.69715365883005
@show cond(A, 2);
@show σ[1] / σ[end];
cond(A, 2) = 1229.846887633767

σ[1] / σ[end] = 1229.846887633767

To get singular vectors as well, use svd. The thin form of the factorization is the default.

U, σ, V = svd(A);
@show size(U);
@show size(V);
size(U) = (5, 4)

size(V) = 
(4, 4)

We verify the orthogonality of the singular vectors as follows:

@show opnorm(U' * U - I);
@show opnorm(V' * V - I);
opnorm(U' * U - I) = 4.947764483928556e-16

opnorm(V' * V - I) = 
7.570241024277813e-16

7.4 Symmetry and definiteness

Example 7.4.1

We will use a symmetric matrix with a known EVD and eigenvalues equal to the integers from 1 to 20.

n = 20;
λ = 1:n
D = diagm(λ)
V, _ = qr(randn(n, n))   # get a random orthogonal V
A = V * D * V';

The Rayleigh quotient is a scalar-valued function of a vector.

R = x -> (x' * A * x) / (x' * x);

The Rayleigh quotient evaluated at an eigenvector gives the corresponding eigenvalue.

R(V[:, 7])
7.0000000000000036

If the input to he Rayleigh quotient is within a small δ\delta of an eigenvector, its output is within O(δ2)O(\delta^2) of the corresponding eigenvalue. In this experiment, we observe that each additional digit of accuracy in an approximate eigenvector gives two more digits to the eigenvalue estimate coming from the Rayleigh quotient.

δ = @. 1 ./ 10^(1:5)
eval_diff = zeros(size(δ))
for (k, delta) in enumerate(δ)
    e = randn(n)
    e = delta * e / norm(e)
    x = V[:, 7] + e
    eval_diff[k] = R(x) - 7
end
labels = ["perturbation δ", "δ²", "R(x) - λ"]
@pt :header=labels [δ δ .^ 2 eval_diff]
Loading...

7.5 Dimension reduction

Example 7.5.1

We make an image from some text, then reload it as a matrix.

using Plots, Images
plot(annotations=(0.5, 0.5, text("Hello world", 44, :center, :center)),
    grid=:none, frame=:none, size=(400, 150))
savefig("hello.png")
img = load("hello.png")
A = @. Float64(Gray(img))
Gray.(A)
Loading...

Next we show that the singular values decrease until they reach zero (more precisely, until they are about ϵmach\epsilon_\text{mach} times the norm of the matrix) at around k=45k=45.

U, σ, V = svd(A)
scatter(σ;
    xaxis=(L"i"),  yaxis=(:log10, L"\sigma_i"),
    title="Singular values")
Loading...

The rapid decrease suggests that we can get fairly good low-rank approximations.

plt = plot(layout=(2, 2), frame=:none, aspect_ratio=1, titlefontsize=10)
for i in 1:4
    k = 3i
    Ak = U[:, 1:k] * diagm(σ[1:k]) * V[:, 1:k]'
    plot!(Gray.(Ak), subplot=i, title="rank = $k")
end
plt
Loading...

Consider how little data is needed to reconstruct these images. For rank-9, for instance, we have 9 left and right singular vectors plus 9 singular values, for a compression ratio of better than 12:1.

m, n = size(A)
compression = m * n / (9 * (m + n + 1))
12.099213551119178
Example 7.5.2

The matrix in this data file describes the votes on bills in the 111th session of the United States Senate. (The data set was obtained from voteview.com.) Each row is one senator, and each column is a vote item.

using JLD2
@load "voting.jld2" A;

If we visualize the votes (yellow is “yea,” blue is “nay”), we can see great similarity between many rows, reflecting party unity.

heatmap(A;
    color=:viridis,  xlabel="bill",  ylabel="senator",
    title="Votes in 111th U.S. Senate")
Loading...

We use (7.5.4) to quantify the decay rate of the values.

U, σ, V = svd(A)
τ = cumsum(σ .^ 2) / sum(σ .^ 2)
scatter(τ[1:16];
    xaxis=("k"),  yaxis=(L"\tau_k"),
    title="Fraction of singular value energy")
Loading...

The first and second singular triples contain about 58% and 17%, respectively, of the energy of the matrix. All others have far less effect, suggesting that the information is primarily two-dimensional. The first left and right singular vectors also contain interesting structure.

scatter(U[:, 1], label="", layout=(1, 2),
    xlabel="senator",  title="left singular vector")
scatter!(V[:, 1], label="", subplot=2,
    xlabel="bill",  title="right singular vector")
Loading...

Both vectors have values greatly clustered near ±C\pm C for a constant CC. These can be roughly interpreted as how partisan a particular senator or bill was, and for which political party. Projecting the senators’ vectors into the first two V\mathbf{V}-coordinates gives a particularly nice way to reduce them to two dimensions. Political scientists label these dimensions partisanship and bipartisanship. Here we color them by actual party affiliation (also given in the data file): red for Republican, blue for Democrat, and yellow for independent.

x1 = A * V[:, 1];
x2 = A * V[:, 2];

@load "voting.jld2" Rep Dem Ind
Rep = vec(Rep);
Dem = vec(Dem);
Ind = vec(Ind);
scatter(x1[Dem], x2[Dem];
    color=:blue,  label="D",
    xaxis=("partisanship"),  yaxis=("bipartisanship"), 
    title="111th US Senate by voting record")
scatter!(x1[Rep], x2[Rep], color=:red, label="R")
scatter!(x1[Ind], x2[Ind], color=:yellow, label="I")
Loading...