2.8 Conditioning of linear systems
Tobin A. Driscoll and Richard J. Braun
We are ready to consider the conditioning of solving the square linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$. In this problem, the data are $\mathbf{A}$ and $\mathbf{b}$, and the solution is $\mathbf{x}$. Both data and result are multidimensional, so we will use norms to measure their magnitudes.
The motivation for the definition of the relative condition number in Chapter 1 was to quantify the response of the result to perturbations of the data. For simplicity, we start by allowing perturbations to $\mathbf{b}$ only, while $\mathbf{A}$ remains fixed.
Let $\mathbf{A}\mathbf{x}=\mathbf{b}$ be perturbed to

$$\mathbf{A}(\mathbf{x}+\mathbf{h}) = \mathbf{b}+\mathbf{d}.$$

The condition number should be the relative change in the solution divided by the relative change in the data,

$$\frac{\;\dfrac{\| \mathbf{h} \|}{\| \mathbf{x} \|}\;}{\dfrac{\| \mathbf{d} \|}{\| \mathbf{b} \|}} = \frac{\| \mathbf{h} \|\,\| \mathbf{b} \|}{\| \mathbf{d} \|\,\| \mathbf{x} \|}.$$

We can bound $\| \mathbf{h} \|$ in terms of $\| \mathbf{d} \|$:
$$\begin{split}
\mathbf{A}\mathbf{x} + \mathbf{A}\mathbf{h} &= \mathbf{b} + \mathbf{d}, \\
\mathbf{A}\mathbf{h} &= \mathbf{d}, \\
\mathbf{h} &= \mathbf{A}^{-1}\mathbf{d}, \\
\| \mathbf{h} \| &\le \| \mathbf{A}^{-1} \|\,\| \mathbf{d} \|,
\end{split}$$

where we have applied $\mathbf{A}\mathbf{x}=\mathbf{b}$ and (2.7.8).
Since also $\mathbf{b}=\mathbf{A}\mathbf{x}$ implies $\| \mathbf{b} \| \le \| \mathbf{A} \|\,\| \mathbf{x} \|$, we derive
$$\frac{\| \mathbf{h} \|\,\| \mathbf{b} \|}{\| \mathbf{d} \|\,\| \mathbf{x} \|} \le \frac{\bigl(\| \mathbf{A}^{-1} \|\,\| \mathbf{d} \|\bigr)\bigl(\| \mathbf{A} \|\,\| \mathbf{x} \|\bigr)}{\| \mathbf{d} \|\,\| \mathbf{x} \|} = \| \mathbf{A}^{-1} \|\,\| \mathbf{A} \|.$$

It is possible to show that this bound is tight, in the sense that the inequalities are in fact equalities for some choices of $\mathbf{b}$ and $\mathbf{d}$. This result motivates a new definition.
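The tightness can also be demonstrated numerically. In the 2-norm, choosing $\mathbf{b}$ along the leading left singular vector of $\mathbf{A}$ and $\mathbf{d}$ along the trailing one turns both inequalities into equalities. Here is a minimal Julia sketch of that claim (the matrix is an arbitrary placeholder):

using LinearAlgebra
A = [1.0 2.0; 3.0 4.0]        # any nonsingular matrix will do
U, σ, V = svd(A)
b = U[:, 1]                   # makes ‖b‖/‖x‖ equal to ‖A‖
d = U[:, end]                 # makes ‖h‖/‖d‖ equal to ‖A⁻¹‖
x = A \ b
h = A \ d
ratio = (norm(h) * norm(b)) / (norm(d) * norm(x))
@show ratio ≈ opnorm(A) * opnorm(inv(A))    # equality is attained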
Definition 2.8.1 (Matrix condition number)
The matrix condition number of an invertible square matrix $\mathbf{A}$ is

$$\kappa(\mathbf{A}) = \| \mathbf{A}^{-1} \|\,\| \mathbf{A} \|. \tag{2.8.5}$$

This value depends on the choice of norm; a subscript on $\kappa$, such as 1, 2, or $\infty$, is used if clarification is needed. If $\mathbf{A}$ is singular, we define $\kappa(\mathbf{A}) = \infty$.
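The definition translates directly into code. As a minimal Julia sketch (the matrix is again a placeholder), the product of induced norms matches the built-in cond, which defaults to the 2-norm and accepts another norm as a second argument:

using LinearAlgebra
A = [1.0 2.0; 3.0 4.0]              # placeholder matrix for illustration
κ₂ = opnorm(A) * opnorm(inv(A))     # definition (2.8.5) in the 2-norm
@show κ₂ ≈ cond(A)                  # cond(A) uses the 2-norm by default
@show cond(A, 1), cond(A, Inf)      # the subscripted variants κ₁ and κ∞

Forming inv(A) explicitly here only mirrors the definition; it is not something to do in production code.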
2.8.1 Main result

The matrix condition number (2.8.5) is equal to the condition number of solving a linear system of equations. Although we derived this fact above only for perturbations of $\mathbf{b}$, a similar statement holds when $\mathbf{A}$ is perturbed.
Using a traditional Δ notation for the perturbation in a quantity, we can write the following.
Theorem 2.8.1 (Conditioning of linear systems)
If $\mathbf{A}(\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b} + \Delta \mathbf{b}$, then

$$\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A})\, \frac{\| \Delta \mathbf{b} \|}{\| \mathbf{b} \|}. \tag{2.8.6}$$

If $(\mathbf{A}+\Delta \mathbf{A})(\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b}$, then

$$\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A})\, \frac{\| \Delta \mathbf{A} \|}{\| \mathbf{A} \|},$$

in the limit $\| \Delta \mathbf{A} \| \to 0$.
Note that for any induced matrix norm,
$$1 = \| \mathbf{I} \| = \| \mathbf{A}\mathbf{A}^{-1} \| \le \| \mathbf{A} \|\,\| \mathbf{A}^{-1} \| = \kappa(\mathbf{A}).$$

A condition number of 1 is the best we can hope for; in that case, the relative perturbation of the solution has the same size as that of the data. A condition number of size $10^t$ indicates that in floating-point arithmetic, roughly $t$ digits are lost (i.e., become incorrect) in computing the solution $\mathbf{x}$. And if $\kappa(\mathbf{A}) > \epsilon_\text{mach}^{-1}$, then for computational purposes the matrix is effectively singular.
Example 2.8.1 (Matrix condition number)
Julia has a function cond to compute matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
Tip: Type \kappa followed by Tab to get the Greek letter κ.
A = [ 1 / (i + j) for i in 1:6, j in 1:6 ]
κ = cond(A)
Because $\kappa \approx 10^8$, it's possible to lose nearly 8 digits of accuracy in the process of passing from $\mathbf{A}$ and $\mathbf{b}$ to $\mathbf{x}$. That fact is independent of the algorithm; it's inevitable once the data are expressed in finite precision.
Let's engineer a linear system problem to observe the effect of a perturbation. We will make sure we know the exact answer.

x = 1:6.0
b = A * x
6-element Vector{Float64}:
4.407142857142857
3.564285714285714
3.013095238095238
2.6174603174603175
2.317279942279942
2.0807359307359308
Now we perturb the system matrix and vector randomly by $10^{-10}$ in norm.
# type \Delta then Tab to get Δ
ΔA = randn(size(A)); ΔA = 1e-10 * (ΔA / opnorm(ΔA));
Δb = randn(size(b)); Δb = 1e-10 * normalize(Δb);
We solve the perturbed problem using pivoted LU and see how the solution was changed.
new_x = ((A + ΔA) \ (b + Δb))
Δx = new_x - x
6-element Vector{Float64}:
-1.0292247959120537e-5
0.00018114988628825657
-0.0009677286543698926
0.0021779214696993066
-0.002177284214154085
0.0007979410058291947
Here is the relative error in the solution.
@show relative_error = norm(Δx) / norm(x);
relative_error = norm(Δx) / norm(x) = 0.00034909676680967795
And here are upper bounds predicted using the condition number of the original matrix.
println("Upper bound due to b: $(κ * norm(Δb) / norm(b))")
println("Upper bound due to A: $(κ * opnorm(ΔA) / opnorm(A))")
Upper bound due to b: 0.0006723667714371329
Upper bound due to A: 0.004566989873939066
Even if we didn't make any manual perturbations to the data, machine roundoff does so at the relative level of $\epsilon_\text{mach}$.
Δx = A\b - x
@show relative_error = norm(Δx) / norm(x);
@show rounding_bound = κ * eps();
relative_error = norm(Δx) / norm(x) = 7.822650774976615e-10
rounding_bound = κ * eps() = 1.134607141116935e-8
Larger Hilbert matrices are even more poorly conditioned:
A = [ 1 / (i + j) for i=1:14, j=1:14 ];
κ = cond(A)
Note that $\kappa$ exceeds $1/\epsilon_\text{mach}$. In principle we therefore may end up with an answer that has relative error greater than 100%.
Let’s put that prediction to the test.
x = 1:14
b = A * x
Δx = A\b - x
@show relative_error = norm(Δx) / norm(x);
relative_error = norm(Δx) / norm(x) = 4.469466154206132
As anticipated, the solution has zero accurate digits in the 2-norm.
MATLAB has a function cond to compute matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
A = hilb(6)
kappa = cond(A)
Because $\kappa \approx 10^8$, it's possible to lose nearly 8 digits of accuracy in the process of passing from $\mathbf{A}$ and $\mathbf{b}$ to $\mathbf{x}$. That fact is independent of the algorithm; it's inevitable once the data are expressed in finite precision.
Let's engineer a linear system problem to observe the effect of a perturbation. We will make sure we know the exact answer.

x = (1:6)'; b = A * x;
Now we perturb the system matrix and vector randomly by $10^{-10}$ in norm.
dA = randn(size(A)); dA = 1e-10 * (dA / norm(dA));
db = randn(size(b)); db = 1e-10 * (db / norm(db));
We solve the perturbed problem using pivoted LU and see how the solution was changed.
new_x = ((A + dA) \ (b + db));
dx = new_x - x;
Here is the relative error in the solution.
relative_error = norm(dx) / norm(x)
And here are upper bounds predicted using the condition number of the original matrix.
upper_bound_b = (kappa * norm(db) / norm(b))
upper_bound_A = (kappa * norm(dA) / norm(A))
Even if we didn't make any manual perturbations to the data, machine roundoff does so at the relative level of $\epsilon_\text{mach}$.
dx = A\b - x;
relative_error = norm(dx) / norm(x)
rounding_bound = kappa * eps
Larger Hilbert matrices are even more poorly conditioned:
A = hilb(14);
kappa = cond(A)
Note that $\kappa$ exceeds $1/\epsilon_\text{mach}$. In principle we therefore may end up with an answer that has relative error greater than 100%.
rounding_bound = kappa * eps
Let’s put that prediction to the test.
x = (1:14)'; b = A * x;
dx = A\b - x;
relative_error = norm(dx) / norm(x)
As anticipated, the solution has zero accurate digits in the 2-norm.
The function cond from numpy.linalg is used to compute matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
from numpy import array, arange, random, linalg
from numpy.linalg import norm

A = array([
[1/(i + j + 2) for j in range(6)]
for i in range(6)
])
print(A)
[[0.5 0.33333333 0.25 0.2 0.16666667 0.14285714]
[0.33333333 0.25 0.2 0.16666667 0.14285714 0.125 ]
[0.25 0.2 0.16666667 0.14285714 0.125 0.11111111]
[0.2 0.16666667 0.14285714 0.125 0.11111111 0.1 ]
[0.16666667 0.14285714 0.125 0.11111111 0.1 0.09090909]
[0.14285714 0.125 0.11111111 0.1 0.09090909 0.08333333]]
from numpy.linalg import cond
kappa = cond(A)
print(f"kappa is {kappa:.3e}")
Next we engineer a linear system problem to which we know the exact answer.
x_exact = 1.0 + arange(6)
b = A @ x_exact
Now we perturb the data randomly with a vector of norm $10^{-12}$.
dA = random.randn(6, 6)
dA = 1e-12 * (dA / norm(dA, 2))
db = random.randn(6)
db = 1e-12 * (db / norm(db, 2))
We solve the perturbed problem using built-in pivoted LU and see how the solution was changed.
x = linalg.solve(A + dA, b + db)
dx = x - x_exact
Here is the relative error in the solution.
print(f"relative error is {norm(dx) / norm(x_exact):.2e}")
relative error is 1.80e-05
And here are upper bounds predicted using the condition number of the original matrix.
print(f"b_bound: {kappa * 1e-12 / norm(b):.2e}")
print(f"A_bound: {kappa * 1e-12 / norm(A, 2):.2e}")
b_bound: 6.72e-06
A_bound: 4.57e-05
Even if we don't make any manual perturbations to the data, machine roundoff does so when we solve the linear system numerically.
x = linalg.solve(A, b)
print(f"relative error: {norm(x - x_exact) / norm(x_exact):.2e}")
print(f"rounding bound: {kappa / 2**52:.2e}")
relative error: 4.24e-10
rounding bound: 1.13e-08
Because $\kappa \approx 10^8$, it's possible to lose 8 digits of accuracy in the process of passing from $\mathbf{A}$ and $\mathbf{b}$ to $\mathbf{x}$. That's independent of the algorithm; it's inevitable once the data are expressed in double precision.
Larger Hilbert matrices are even more poorly conditioned.
A = array([ [1/(i+j+2) for j in range(14)] for i in range(14) ])
kappa = cond(A)
print(f"kappa is {kappa:.3e}")
Before we compute the solution, note that $\kappa$ exceeds 1/eps. In principle we therefore might end up with an answer that is completely wrong (i.e., a relative error greater than 100%).
print(f"rounding bound: {kappa / 2**52:.2e}")
x_exact = 1.0 + arange(14)
b = A @ x_exact
x = linalg.solve(A, b)
We got an answer. But in fact, the error does exceed 100%:
print(f"relative error: {norm(x - x_exact) / norm(x_exact):.2e}")
2.8.2 Residual and backward error

Suppose that $\mathbf{A}\mathbf{x}=\mathbf{b}$ and $\tilde{\mathbf{x}}$ is a computed estimate of the solution $\mathbf{x}$. The most natural quantity to study is the error, $\mathbf{x}-\tilde{\mathbf{x}}$. Normally we can't compute it because we don't know the exact solution. However, we can compute something related: the residual,

$$\mathbf{r} = \mathbf{b} - \mathbf{A}\tilde{\mathbf{x}}.$$
Obviously, a zero residual means that $\tilde{\mathbf{x}}=\mathbf{x}$, and we have the exact solution. What happens more generally? Note that $\mathbf{A}\tilde{\mathbf{x}}=\mathbf{b}-\mathbf{r}$. That is, $\tilde{\mathbf{x}}$ solves the linear system problem for a right-hand side that is changed by $-\mathbf{r}$. This is precisely what is meant by backward error.
Hence residual and backward error are the same thing for a linear system. What is the connection to the (forward) error? We can reconnect with (2.8.6) via the definition $\mathbf{h} = \tilde{\mathbf{x}}-\mathbf{x}$, in which case

$$\mathbf{d} = \mathbf{A}(\mathbf{x}+\mathbf{h})-\mathbf{b} = \mathbf{A}\mathbf{h} = -\mathbf{r}.$$

Thus, (2.8.6) is equivalent to

$$\frac{\| \mathbf{x}-\tilde{\mathbf{x}} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A})\, \frac{\| \mathbf{r} \|}{\| \mathbf{b} \|}. \tag{2.8.11}$$

Equation (2.8.11) says that the gap between the relative error and the relative residual is a multiplication by the matrix condition number.
When solving a linear system, all that can be expected is that the backward error, not the error, is small.
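To illustrate, here is a short Julia sketch reusing the ill-conditioned matrix from Example 2.8.1: the relative residual of the computed solution sits near $\epsilon_\text{mach}$, while the forward error is larger by a factor on the order of $\kappa(\mathbf{A})$, consistent with (2.8.11).

using LinearAlgebra
A = [ 1 / (i + j) for i in 1:6, j in 1:6 ]
x = 1:6.0
b = A * x
x̃ = A \ b                            # computed solution
r = b - A * x̃                        # residual, equal to minus the backward perturbation
@show norm(r) / norm(b)               # tiny: backward error near eps()
@show norm(x - x̃) / norm(x)          # forward error, much larger
@show cond(A) * norm(r) / norm(b)     # right-hand side of (2.8.11)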
2.8.3 Exercises

⌨ Refer to Example 2.8.1 for the definition of a Hilbert matrix. Make a table of the values of $\kappa(\mathbf{H}_n)$ in the 2-norm for $n=2,3,\ldots,16$. Speculate as to why the growth of $\kappa$ appears to slow down at $n=13$.
⌨ The purpose of this problem is to verify, as in Example 2.8.1, the error bound

$$\frac{\| \mathbf{x}-\tilde{\mathbf{x}} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A})\, \frac{\| \mathbf{d} \|}{\| \mathbf{b} \|}.$$

Here $\tilde{\mathbf{x}}$ is a numerical approximation to the exact solution $\mathbf{x}$, and $\mathbf{d}$ is an unknown perturbation of $\mathbf{b}$ caused by machine roundoff. We will assume that $\| \mathbf{d} \|/\| \mathbf{b} \|$ is roughly eps().
For each $n=10,20,\ldots,70$ let A = matrixdepot("prolate",n,0.4) and let $\mathbf{x}$ have components $x_k=k/n$ for $k=1,\ldots,n$. Define b=A*x and let $\tilde{\mathbf{x}}$ be the solution produced numerically by backslash.
Make a table including columns for $n$, the condition number of $\mathbf{A}$, the observed relative error in $\tilde{\mathbf{x}}$, and the right-hand side of the inequality above. You should find that the inequality holds in every case.
⌨ Exercise 2.3.7 suggests that the solutions of linear systems

$$\mathbf{A} = \begin{bmatrix}
1 & -1 & 0 & \alpha-\beta & \beta \\
0 & 1 & -1 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 \\
0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} \alpha \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$

become less accurate as $\beta$ increases. Using $\alpha=0.1$ and $\beta=10,100,\ldots,10^{12}$, make a table with columns for $\beta$, $|x_1-1|$, and the condition number of the matrix.
⌨ Let $\mathbf{A}_n$ denote the $(n+1)\times(n+1)$ version of the Vandermonde matrix in Equation (2.1.3) based on the equally spaced interpolation nodes $t_i=i/n$ for $i=0,\ldots,n$. Using the 1-norm, graph $\kappa(\mathbf{A}_n)$ as a function of $n$ for $n=4,5,6,\ldots,20$, using a log scale on the $y$-axis. (The graph is nearly a straight line.)
⌨ The matrix $\mathbf{A}$ in (2.6.2) has unpivoted LU factors given in (2.6.3) as a function of the parameter $\epsilon$. For $\epsilon = 10^{-2},10^{-4},\ldots,10^{-10}$, make a table with columns for $\epsilon$, $\kappa(\mathbf{A})$, $\kappa(\mathbf{L})$, and $\kappa(\mathbf{U})$. (This shows that solution via unpivoted LU factorization is arbitrarily unstable.)
✍ Define $\mathbf{A}_n$ as the $n\times n$ matrix

$$\begin{bmatrix}
1 & -2 & & & \\
& 1 & -2 & & \\
& & \ddots & \ddots & \\
& & & 1 & -2 \\
& & & & 1
\end{bmatrix}.$$
(a) Write out $\mathbf{A}_2^{-1}$ and $\mathbf{A}_3^{-1}$.
(b) Write out $\mathbf{A}_n^{-1}$ in the general case $n>1$. (If necessary, look at a few more cases in Julia until you are certain of the pattern.) Make a clear argument why it is correct.
(c) Using the $\infty$-norm, find $\kappa(\mathbf{A}_n)$.
✍ (a) Prove that for $n\times n$ nonsingular matrices $\mathbf{A}$ and $\mathbf{B}$, $\kappa(\mathbf{A}\mathbf{B})\le \kappa(\mathbf{A})\,\kappa(\mathbf{B})$.
(b) Show by means of an example that the result of part (a) cannot be an equality in general.
✍ Let $\mathbf{D}$ be a diagonal $n\times n$ matrix, not necessarily invertible. Prove that in the 1-norm,

$$\kappa(\mathbf{D}) = \frac{\max_i |D_{ii}|}{\min_i |D_{ii}|}.$$

(Hint: See Exercise 2.7.10.)