where Km is the Krylov matrix generated using A and the seed vector b. This method was unstable due to the poor conditioning of Km, whose columns form a numerically poor basis for the Krylov subspace Km. Instead, we can use the orthonormal basis Qm generated by the Arnoldi iteration.
In Demo 8.4.1, we tried to implement the GMRES idea using the Krylov matrix. After some initial reduction in the norm of the residual, progress stalled due to the poor conditioning of that matrix. We now return to the linear system of that example to show that a stable implementation based on the Arnoldi iteration continues to reduce the residual nearly to machine precision.
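As a rough illustration (not the book's Function 8.5.2), the following NumPy sketch shows the structure of such an Arnoldi-based GMRES: at step m it solves the small least-squares problem min ∥βe1 − H̃m z∥ and sets xm = Qm z. The function name, the dense least-squares solve at every step, and the omission of breakdown handling are simplifying assumptions.

```python
# A minimal GMRES sketch built on the Arnoldi iteration (NumPy only).
# Illustrative only; a production version would update a QR factorization
# of the Hessenberg matrix incrementally and handle breakdown.
import numpy as np

def gmres_arnoldi(A, b, m):
    """Return the GMRES approximation x_m and the residual norm history."""
    n = len(b)
    Q = np.zeros((n, m + 1))      # orthonormal basis vectors of the Krylov spaces
    H = np.zeros((m + 1, m))      # upper Hessenberg matrix from Arnoldi
    beta = np.linalg.norm(b)
    Q[:, 0] = b / beta
    x = np.zeros(n)
    resid = []
    for j in range(m):
        # Arnoldi step: orthogonalize A*q_j against the previous basis vectors.
        v = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ v
            v -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] > 0:
            Q[:, j + 1] = v / H[j + 1, j]
        # Minimize || beta*e1 - H~_j z || and set x_j = Q_j z.
        rhs = np.zeros(j + 2)
        rhs[0] = beta
        z, *_ = np.linalg.lstsq(H[: j + 2, : j + 1], rhs, rcond=None)
        x = Q[:, : j + 1] @ z
        resid.append(np.linalg.norm(b - A @ x))
    return x, resid
```

On a well-conditioned test matrix, the residual norms returned by this sketch decrease toward machine precision as m grows, in contrast to the stalled behavior seen with the Krylov-matrix approach.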
Thanks to Theorem 8.4.1, minimization of ∥b−Ax∥ over Km+1 includes minimization over the subset Km. Hence the norm of the residual rm = b − Axm, which is the quantity being minimized, can never increase as the iteration proceeds:

∥rm+1∥ ≤ ∥rm∥, m = 1, 2, 3, ….
Unfortunately, making other conclusive statements about the convergence of GMRES is not easy. Demo 8.5.1 shows the cleanest behavior: essentially, linear convergence down to the range of machine epsilon. But it is possible for the convergence to go through phases of sublinear and superlinear convergence as well. There is a strong dependence on the eigenvalues of the matrix, a fact we address with more precision and detail in the next section.
One of the practical challenges in GMRES is that as the dimension of the Krylov subspace grows, the number of new entries to be found in Hm and the total number of basis vectors grow, too. Consequently, both the work and the storage requirements are quadratic in m, which can become intolerable in some applications. For this reason, GMRES is often used with restarting.
Suppose x̂ is an approximate solution of Ax=b. Then, if we set x=u+x̂, we have A(u+x̂)=b, or Au=b−Ax̂. The conclusion is that if we compute the residual r=b−Ax̂ of an approximate solution, then we need only solve Au=r in order to get a correction u to x̂.[3]
Restarting guarantees a fixed upper bound on the per-iteration cost of GMRES. However, this benefit comes at a price. Although restarting preserves the progress made before each restart, the Krylov space information is discarded and the residual minimization begins anew over low-dimensional spaces. That can significantly slow the convergence, or even cause it to stagnate.
Restarting creates a tradeoff between the number of iterations and the cost per iteration. It is essentially impossible in general to predict the ideal restart length for a given problem, so one goes by experience and hopes for the best.
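To make the restarting idea concrete, here is a minimal sketch of an outer loop built on the residual-correction observation above; it reuses the gmres_arnoldi sketch from earlier in this section. The restart length m, the cycle limit, and the stopping tolerance are illustrative choices, not recommendations.

```python
# A minimal restarted GMRES, GMRES(m), using the residual-correction idea:
# after each cycle of m inner steps, approximately solve A u = r and update x.
# Reuses gmres_arnoldi from the sketch above; parameters are illustrative.
import numpy as np

def gmres_restarted(A, b, m=20, max_cycles=50, tol=1e-10):
    x = np.zeros(len(b))
    for _ in range(max_cycles):
        r = b - A @ x                      # residual of the current approximation
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        u, _ = gmres_arnoldi(A, r, m)      # approximate correction: A u ≈ r
        x = x + u                          # restart from the corrected solution
    return x
```

Each cycle performs at most m Arnoldi steps and stores at most m+1 basis vectors, so the cost per cycle is bounded; the price, as noted above, is that the Krylov information built during a cycle is thrown away at the restart.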
There are other ways to avoid the growth in computational effort as the GMRES/Arnoldi iteration proceeds. Three of the more popular variations are abbreviated CGS, BiCGSTAB, and QMR. We do not describe them in this book.
(b) Find the GMRES approximate solutions xm for m=1,2,3,4. (You should not try to do the Arnoldi iteration by hand. Instead, apply the result in (8.4.4) to get a tractable least-squares problem.)
✍ (Continuation of Exercise 8.4.3.) Show that if xm∈Km, then the residual b−Axm is equal to q(A)b, where q is a polynomial of degree at most m and q(0)=1. (This fact is a key one for many convergence results.)
✍ Explain why GMRES, in exact arithmetic, converges to the true solution in n iterations for an n×n matrix if rank(Kn)=n. (Hint: Consider how the algorithm is defined from first principles.)
and let the n-vector b have elements bi=i/n. For n=8,16,32,64, run Function 8.5.2 for m=n/2 iterations. On one semi-log graph, plot ∥rk∥/∥b∥ for all the cases. How does the convergence rate of GMRES seem to depend on n?
⌨ In this exercise you will see the strong effect the eigenvalues of the matrix may have on GMRES convergence. Let
let I be a 100×100 identity, and let Z be a 100×100 matrix of zeros. Also let b be a 200×1 vector of ones. You will use GMRES with restarts, as in Demo 8.5.2 (i.e., not the book’s version of gmres).
(a) Let A = [B Z; I B] in 2×2 block form, with B in both diagonal blocks, Z in the upper-right block, and I in the lower-left block. What are its eigenvalues (no computer required here)? Apply gmres with tolerance 10⁻¹⁰ for 100 iterations without restarts, and plot the residual convergence.
(b) Repeat part (a) with restarts every 20 iterations.
(c) Now let A = [B Z; I −B], the same block structure but with the lower-right block negated. What are its eigenvalues? Repeat part (a). Which matrix is more difficult for GMRES? (Note: Even though this matrix is triangular, GMRES has no way of exploiting that fact.)
⌨ (Continuation of Exercise 8.3.5.) We again consider the n²×n² sparse matrix defined by FNC.poisson(n). The solution of Ax=b may be interpreted as the deflection of a lumped membrane in response to a load represented by b.
(a) For n=10,15,20,25, let b be the vector of n² ones and apply Function 8.5.2 for 50 iterations. On one semi-log graph, plot the four convergence curves ∥rm∥/∥b∥.
(b) For the case n=25 make a surface plot of x after reshaping it to a 25×25 matrix. It should look physically plausible (though upside-down for a weighted membrane).
This statement is not strictly correct for rare special cases of breakdown where the rank of Km is less than m. In that situation, some additional steps must be taken that we do not discuss here.