from numpy import *
from scipy import linalg
from scipy.linalg import norm
from matplotlib.pyplot import *
from prettytable import PrettyTable
from timeit import default_timer as timer
import sys
sys.path.append('fncbook/')
import fncbook as FNC
# This (optional) block is for improving the display of plots.
# from IPython.display import set_matplotlib_formats
# set_matplotlib_formats("svg","pdf")
# %config InlineBackend.figure_format = 'svg'
rcParams["figure.figsize"] = [7, 4]
rcParams["lines.linewidth"] = 2
rcParams["lines.markersize"] = 4
rcParams['animation.html'] = "jshtml" # or try "html5"
The power and inverse iterations have a flaw that seems obvious once it is pointed out. Given a seed vector u, they produce a sequence of vectors u1,u2,… that are scalar multiples of u,Au,A2u,…, but only the most recent vector is used to produce an eigenvector estimate.
It stands to reason that we could do no worse, and perhaps much better, if we searched among all linear combinations of the vectors seen in the past. In other words, we seek a solution in the range (column space) of the matrix
As we have seen with the power iteration, part of the appeal of the Krylov matrix is that it can be generated in a way that fully exploits the sparsity of A, simply through repeated matrix-vector multiplication. Furthermore, we have some important mathematical properties.
The problems Ax=b and Ax=λx are posed in a very high-dimensional space Rn or Cn. One way to approximate them is to replace the full n-dimensional space with a much lower-dimensional Km for m≪n. This is the essence of the Krylov subspace approach.
For instance, we can interpret Axm≈b in the sense of linear least-squares—that is, using Theorem 8.4.1 to let x=Kmz,
The natural seed vector for Km in this case is the vector b. In the next example we try to implement (8.4.4). We do take one precaution: because the vectors Akb may become very large or small in norm, we normalize after each multiplication by A, just as we did in the power iteration.
The breakdown of convergence in Example 8.4.1 is due to a critical numerical defect in our approach: the columns of the Krylov matrix (8.4.1) increasingly become parallel to the dominant eigenvector, as (8.2.5) predicts, and therefore to one another. As we saw in The QR factorization, near-parallel vectors create the potential for numerical cancellation. This manifests as a large condition number for Km as m grows, eventually creating excessive error when solving the least-squares system.
The polar opposite of an ill-conditioned basis for Km is an orthonormal one. Suppose we had a thin QR factorization of Km:
Since we started by assuming that we know q1,…,qm, the only unknowns in (8.4.6) are Hm+1,m and qm+1. But they appear only as a product, and we know that qm+1 is a unit vector, so they are uniquely defined (up to sign) by the other terms in the equation.
We can now proceed iteratively.
The vectors q1,…,qm are an orthonormal basis for the space Km, which makes them ideal for numerical computations in that space. The matrix Hm plays an important role, too, and has a particular “triangular plus” structure,
An implementation of the Arnoldi iteration is given in Function 8.4.1. A careful inspection shows that inner nested loop does not exactly implement (8.4.7) and (8.4.8). The reason is numerical stability. Though the described and implemented versions are mathematically equivalent in exact arithmetic (see Exercise 8.4.6), the approach in Function 8.4.1 is more stable.
In the next section, we revisit the idea of approximately solving Ax=b over a Krylov subspace as suggested in Example 8.4.2. A related idea explored in Exercise 8.4.7 is used to approximate the eigenvalue problem for A, which is the approach that underlies eigs for sparse matrices.