clear all
format short
set(0, 'defaultaxesfontsize', 12)
set(0, 'defaultlinelinewidth', 1.5)
set(0, 'defaultFunctionLinelinewidth', 1.5)
set(0, 'defaultscattermarkerfacecolor', 'flat')
gcf;
set(gcf, 'Position', [0 0 600 350], 'Theme', 'light')
addpath ../FNC_matlab

Starting MATLAB ...

Executing ...

8.1.Sparsity and structure¶

Very large matrices cannot be stored all within primary memory of a computer unless they are sparse. A sparse matrix has structural zeros, meaning entries that are known to be exactly zero. For instance, the adjacency matrix of a graph has zeros where there are no links in the graph. To store and operate with a sparse matrix efficiently, it is not represented as an array of all of its values. There is a variety of sparse formats available; for the most part, you can imagine that the matrix is stored as triples $(i,j,A_{ij})$ for all the nonzero $(i,j)$ locations.

8.1.1Computing with sparse matrices¶

Most graphs with real applications have many fewer edges than the maximum possible $n^2$ for $n$ nodes. Accordingly, their adjacency matrices have mostly zero elements and should be represented sparsely.

Example 8.1.1 (Sparsity)

Here we load the adjacency matrix of a graph with 2790 nodes from a data file. Each node is a web page referring to Roswell, NM, and the edges represent links between web pages. (Credit goes to Panayiotis Tsaparas and the University of Toronto for making this data public.)

load roswelladj
a = whos('A')

We may define the density of $\mathbf{A}$ as the number of nonzeros divided by the total number of entries.

Tip

Use nnz to count the number of nonzeros in a sparse matrix.

sz = size(A);  n = sz(1);
density = nnz(A) / prod(sz)

The computer memory consumed by any variable can be discovered using whos. We can use it to compare the space needed for the sparse representation to its dense counterpart, that is, the space needed to store all the elements, whether zero or not.

F = full(A);
f = whos('F');
storage_ratio = f.bytes / a.bytes

Matrix-vector products are also much faster using the sparse form because operations with structural zeros are skipped.

x = randn(n,1);
tic, for i = 1:200, A*x; end
sparse_time = toc

tic, for i = 1:200, F*x; end
dense_time = toc

However, the sparse storage format in MATLAB is column-oriented. Operations on rows may take a lot longer than similar ones on columns.

v = A(:, 1000);
tic, for i = 1:n, A(:, i) = v; end
column_time = toc
r = v';
tic, for i = 1:n, A(i, :) = r; end
row_time = toc

Arithmetic operations such as +, -, *, and ^ respect and exploit sparsity if the matrix operands are sparse. However, matrix operations may substantially decrease the amount of sparsity, a phenomenon known as fill-in.

Example 8.1.2 (Fill-in of a sparse matrix)

Here is the adjacency matrix of a graph representing a small-world network, featuring connections to neighbors and a small number of distant contacts.

load smallworld.mat
G = graph(A);
plot(G, nodecolor='r')

Because each node connects to relatively few others, the adjacency matrix is quite sparse.

spy(A)

By Theorem 7.1.1, the entries of $\mathbf{A}^k$ give the number of walks of length $k$ between pairs of nodes, as with “k degrees of separation” within a social network. As $k$ grows, the density of $\mathbf{A}^k$ also grows.

clf
tiledlayout(2, 2)
for k = [2, 3, 4, 6]
    nexttile
    spy(A^k)
    title(sprintf("A^{%d}", k))
end

8.1.2Banded matrices¶

A particularly important type of sparse matrix is a banded matrix. Recall from Exploiting matrix structure that $\mathbf{A}$ has upper bandwidth $p$ if $j-i > p$ implies $A_{ij}=0$ , and lower bandwidth $q$ if $i-j > q$ implies $A_{ij}=0$ . We say the total bandwidth is $p+q+1$ . Banded matrices appear naturally in many applications where each element interacts directly with only a few neighbors.

Without pivoting, an LU factorization preserves bandwidth, but pivoting can change or destroy bandedness.

Example 8.1.3 (Banded matrices)

The spdiags function creates a sparse matrix given its diagonal elements. The main or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.

n = 50;
n = 50;
% Put constant values on 3 diagonals
A = spdiags([n, 1, 0.1], [-3, 0, 5], n, n);
% Put other values on 1st superdiagonal
A = spdiags(-(0:n-1)', 1, A);
full(A(1:7, 1:7))

Without pivoting, the LU factors have the same lower and upper bandwidth as the original matrix.

Tip

The sparse function converts any matrix to sparse form. But it’s usually better to construct a sparse matrix directly, as the standard form might not fit in memory.

[L, U] = lufact(A);
clf
subplot(1, 2, 1), spy(L), title('L')
subplot(1, 2, 2), spy(U), title(('U'));

However, if we introduce row pivoting, bandedness may be expanded or destroyed.

[L, U, p] = plufact(A);
subplot(1, 2, 1), spy(L(p, :)), title('L')
subplot(1, 2, 2), spy(U), title(('U'));

8.1.3Systems and eigenvalues¶

If given a sparse matrix, the backslash operator will automatically try a form of sparse-aware Cholesky or pivoted LU factorization. Depending on the sparsity pattern of the matrix, the time taken to solve the linear system may be well below the $O(n^3)$ needed in the general case.

For very large matrices, it’s unlikely that you will want to find all of its eigenvalues and eigenvectors. In Krylov subspaces we describe some of the math behind an algorithm that can find a selected number of eigenvalues of largest magnitude, lying to the extreme left or right, or nearest a given complex number.

Example 8.1.4 (Eigenvalues of sparse matrices)

The following generates a random sparse matrix with prescribed eigenvalues.

n = 4000;
density = 4e-4;
lambda = 1 ./ (1:n);
A = sprandsym(n, density, lambda);
clf,  spy(A)
title('Sparse symmetric matrix')

The eigs function finds a small number eigenvalues meeting some criterion. First, we ask for the 5 of largest (complex) magnitude.

[V, D] = eigs(A, 5);    % largest magnitude
1 ./ diag(D)            % should be 1, 2, 3, 4, 5

Now we find the 4 closest to the value 0.03 in the complex plane.

[V, D] = eigs(A, 4, 0.03);    % closest to 0.03
diag(D)

The time needed to solve a sparse linear system is not easy to predict unless you have some more information about the matrix. But it will typically be orders of magnitude faster than the dense version of the same problem.

x = 1 ./ (1:n)';  
b = A * x;
tic, sparse_err = norm(x - A\b), sparse_time = toc

F = full(A);
tic, dense_err = norm(x - F\b), dense_time = toc

8.1.4Exercises¶

Exercise 8.1.2

⌨ This problem requires the matrix used in Example 8.1.2. Download a data file by clicking the link.

load smallworld

(a) Find the density of $\mathbf{A}$ (number of nonzeros divided by total number of elements), $\mathbf{A}^2$ , $\mathbf{A}^4$ , and $\mathbf{A}^8$ . (You should find that it increases with the power of $\mathbf{A}$ .)

(b) The LU factors tend to at least partially retain sparsity. Find the density of the $\mathbf{L}$ and $\mathbf{U}$ factors of $\mathbf{A}$ using lufact (Algorithm 2.4.1). (If you get an error, convert the matrix to dense form first.)

(c) Repeat part (b) for the QR factorization using qrfact (Algorithm 3.4.1). (If you get an error, convert the matrix to dense form first.)

Exercise 8.1.3

⌨ One use of adjacency matrices is to analyze the links between members of a collection. Obtain the adjacency matrix $\mathbf{A}$ from Example 8.1.1 via the following:

Download roswell.mat by clicking the link and saving (you may need to fix the file name).

load roswell

The matrix catalogs the links between web sites related to the town of Roswell, NM, with $A_{ij}=1$ if and only if site $i$ links to site $j$ .

(a) Verify numerically that the matrix does not include any links from a site to itself.

(b) Verify numerically that $\mathbf{A}$ is not symmetric. (Thus, its graph is a directed one.)

(c) How many sites in the group are not pointed to by any other sites in the group?

(d) Which site points to the most other sites?

(e) Which site is pointed to the most by the other sites? This is a crude way to establish the most important site.

(f) There are 2790² possible ways to connect ordered pairs of sites. What fraction of these pairs is connected by a walk of links that is no greater than three in length?

Exercise 8.1.6

⌨ A matrix that arises from the Helmholtz equation for wave propagation can be specified using

A = FNC.poisson(n) - k^2*I;

where $k$ is a real parameter. Let $n=50$ .

(a) Let $k=1$ . What is the size of $\mathbf{A}$ ? What is its density?

(b) Still with $k=1$ , use eigs to find the four largest and four smallest (in magnitude) eigenvalues of $\mathbf{A}$ . (See Example 8.1.4 for examples.)

(c) The eigenvalues are all real. Find a value of $k$ so that $\mathbf{A}$ has exactly three negative eigenvalues.