Skip to article frontmatterSkip to article content

Numerical integration

In calculus, you learn that the elegant way to evaluate a definite integral is to apply the Fundamental Theorem of Calculus and find an antiderivative. The connection is so profound and pervasive that it’s easy to overlook that a definite integral is a numerical quantity existing independently of antidifferentiation. However, most conceivable integrands have no antiderivative in terms of familiar functions.

Numerical integration, which also goes by the older name quadrature, is performed by combining values of the integrand sampled at nodes. In this section we will assume equally spaced nodes using the definitions

ti=a+ih,h=ban,i=0,,n. t_i = a +i h, \quad h=\frac{b-a}{n}, \qquad i=0,\ldots,n.

Numerical integration formulas can be applied to sequences of data values even if no function is explicitly known to generate them. For our presentation and implementations, however, we assume that ff is known and can be evaluated anywhere.

A straightforward way to derive integration formulas is to mimic the approach taken for finite differences: find an interpolant and operate exactly on it.

5.6.1Trapezoid formula

One of the most important integration formulas results from integration of the piecewise linear interpolant (see Piecewise linear interpolation). Using the cardinal basis form of the interpolant in (5.2.3), we have

abf(x)dxabi=0nf(ti)Hi(x)dx=i=0nf(ti)[abHi(x)]dx.\int_a^b f(x) \, dx \approx \int_a^b \sum_{i=0}^n f(t_i) H_i(x)\, dx = \sum_{i=0}^n f(t_i) \left[ \int_a^b H_i(x)\right]\, dx.

Thus, we can identify the weights as wi=h1abHi(x)dxw_i = h^{-1} \int_a^b H_i(x)\, dx. Using areas of triangles, it’s trivial to derive that

wi={1,i=1,,n1,12,i=0,n.w_i = \begin{cases} 1, & i=1,\ldots,n-1,\\ \frac{1}{2}, & i=0,n. \end{cases}

Putting everything together, the resulting formula is

abf(x)dxTf(n)=h[12f(t0)+f(t1)+f(t2)++f(tn1)+12f(tn)].\begin{split} \int_a^b f(x)\, dx \approx T_f(n) &= h\left[ \frac{1}{2}f(t_0) + f(t_1) + f(t_2) + \cdots + f(t_{n-1}) + \frac{1}{2}f(t_n) \right]. \end{split}

Geometrically, as illustrated in Figure 5.6.1, the trapezoid formula sums of the areas of trapezoids approximating the region under the curve y=f(x)y=f(x).[1]

The trapezoid formula is the Swiss Army knife of integration formulas. A short implementation is given as Function 5.6.1.

Trapezoid formula for integration. The piecewise linear interpolant defines trapezoids that approximate the region under the curve.

Figure 5.6.1:Trapezoid formula for integration. The piecewise linear interpolant defines trapezoids that approximate the region under the curve.

Like finite-difference formulas, numerical integration formulas have a truncation error.

In Theorem 5.2.2 we stated that the pointwise error in a piecewise linear interpolant with equal node spacing hh is bounded by O(h2)O(h^2) as h0h\rightarrow 0. Using II to stand for the exact integral of ff and pp to stand for the piecewise linear interpolant, we obtain

ITf(n)=Iabp(x)dx=ab[f(x)p(x)]dx(ba)maxx[a,b]f(x)p(x)=O(h2).\begin{split} I - T_f(n) = I - \int_a^b p(x)\, dx &= \int_a^b \bigl[f(x)-p(x)\bigr] \, dx \\ &\le (b-a) \max_{x\in[a,b]} |f(x)-p(x)| = O(h^2). \end{split}

A more thorough statement of the truncation error is known as the Euler–Maclaurin formula,

abf(x)dx=Tf(n)h212[f(b)f(a)]+h4740[f(b)f(a)]+O(h6)=Tf(n)k=1B2kh2k(2k)![f(2k1)(b)f(2k1)(a)],\begin{split} \int_a^b f(x)\, dx &= T_f(n) - \frac{h^2}{12} \left[ f'(b)-f'(a) \right] + \frac{h^4}{740} \left[ f'''(b)-f'''(a) \right] + O(h^6) \\ &= T_f(n) - \sum_{k=1}^\infty \frac{B_{2k}h^{2k}}{(2k)!} \left[ f^{(2k-1)}(b)-f^{(2k-1)}(a) \right], \end{split}

where the B2kB_{2k} are constants known as Bernoulli numbers. Unless we happen to be fortunate enough to have a function with f(b)=f(a)f'(b)=f'(a), we should expect truncation error at second order and no better.

5.6.2Extrapolation

If evaluations of ff are computationally expensive, we want to get as much accuracy as possible from them by using a higher-order formula. There are many routes for doing so; for example, we could integrate a not-a-knot cubic spline interpolant. However, splines are difficult to compute by hand, and as a result different methods were developed before computers came on the scene.

Knowing the structure of the error allows the use of extrapolation to improve accuracy. Suppose a quantity A0A_0 is approximated by an algorithm A(h)A(h) with an error expansion

A0=A(h)+c1h+c2h2+c3h3+. A_0 = A(h) + c_1 h + c_2 h^2 + c_3 h^3 + \cdots.

Crucially, it is not necessary to know the values of the error constants ckc_k, merely that they exist and are independent of hh.

Using II for the exact integral of ff, the trapezoid formula has

I=Tf(n)+c2h2+c4h4+, I = T_f(n) + c_2 h^2 + c_4 h^{4} + \cdots,

as proved by the Euler–Maclaurin formula (5.6.9). The error constants depend on ff and can’t be evaluated in general, but we know that this expansion holds. For convenience, we recast the error expansion in terms of n=O(h1)n=O(h^{-1}):

I=Tf(n)+c2n2+c4n4+. I = T_f(n) + c_2 n^{-2} + c_4 n^{-4} + \cdots.

We now make the simple observation that

I=Tf(2n)+14c2n2+116c4n4+. I = T_f(2n) + \tfrac{1}{4} c_2 n^{-2} + \tfrac{1}{16} c_4 n^{-4} + \cdots.

It follows that if we combine (5.6.12) and (5.6.13) correctly, we can cancel out the second-order term in the error. Specifically, define

Sf(2n)=13[4Tf(2n)Tf(n)]. S_f(2n) = \frac{1}{3} \Bigl[ 4 T_f(2n) - T_f(n) \Bigr].

(We associate 2n2n rather than nn with the extrapolated result because of the total number of nodes needed.) Then

I=Sf(2n)+O(n4)=b4n4+b6n6+. I = S_f(2n) + O(n^{-4}) = b_4 n^{-4} + b_6 n^{-6} + \cdots.

The formula (5.6.14) is called Simpson’s formula, or Simpson’s rule. A different presentation and derivation are considered in Exercise 5.6.4.

Equation (5.6.15) is another particular error expansion in the form (5.6.10), so we can extrapolate again! The details change only a little. Considering that

I=Sf(4n)=116b4n4+164b6n6+, I = S_f(4n) = \tfrac{1}{16} b_4 n^{-4} + \tfrac{1}{64} b_6 n^{-6} + \cdots,

the proper combination this time is

Rf(4n)=115[16Sf(4n)Sf(2n)], R_f(4n) = \frac{1}{15} \Bigl[ 16 S_f(4n) - S_f(2n) \Bigr],

which is sixth-order accurate. Clearly the process can be repeated to get eighth-order accuracy and beyond. Doing so goes by the name of Romberg integration, which we will not present in full generality.

5.6.3Node doubling

Note in (5.6.17) that Rf(4n)R_f(4n) depends on Sf(2n)S_f(2n) and Sf(4n)S_f(4n), which in turn depend on Tf(n)T_f(n), Tf(2n)T_f(2n), and Tf(4n)T_f(4n). There is a useful benefit realized by doubling of the nodes in each application of the trapezoid formula. As shown in Figure 5.6.2, when doubling nn, only about half of the nodes are new ones, and previously computed function values at the other nodes can be reused.

Dividing the node spacing by half introduces new nodes only at midpoints, allowing the function values at existing nodes to be reused for extrapolation.

Figure 5.6.2:Dividing the node spacing by half introduces new nodes only at midpoints, allowing the function values at existing nodes to be reused for extrapolation.

Specifically, we have

Tf(2m)=12m[12f(a)+12f(b)+i=12m1f(a+i2m)]=12m[12f(a)+12f(b)]+12mk=1m1f(a+2k2m)+12mk=1mf(a+2k12m)=12m[12f(a)+12f(b)+k=1m1f(a+km)]+12mk=1mf(a+2k12m)=12Tf(m)+12mk=1m1f(t2k1),\begin{split} T_f(2m) & = \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b) + \sum_{i=1}^{2m-1} f\Bigl( a + \frac{i}{2m} \Bigr) \right]\\[1mm] & = \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b)\right] + \frac{1}{2m} \sum_{k=1}^{m-1} f\Bigl( a+\frac{2k}{2m} \Bigr) + \frac{1}{2m} \sum_{k=1}^{m} f\Bigl( a+\frac{2k-1}{2m} \Bigr) \\[1mm] &= \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b) + \sum_{k=1}^{m-1} f\Bigl( a+\frac{k}{m} \Bigr) \right] + \frac{1}{2m} \sum_{k=1}^{m} f\Bigl( a+\frac{2k-1}{2m} \Bigr) \\[1mm] &= \frac{1}{2} T_f(m) + \frac{1}{2m} \sum_{k=1}^{m-1} f\left(t_{2k-1} \right), \end{split}

where the nodes referenced in the last line are relative to n=2mn=2m. Hence in passing from n=mn=m to n=2mn=2m, new integrand evaluations are needed only at the odd-numbered nodes of the finer grid.

5.6.4Exercises

Footnotes
  1. Some texts distinguish between a formula for a single subinterval [tk1,tk][t_{k-1},t_k] and a composite formula that adds them up over the whole interval to get (5.6.5).

References
  1. Bailey, D. H., Jeyabalan, K., & Li, X. S. (2005). A Comparison of Three High-Precision Quadrature Schemes. Experimental Mathematics, 14(3), 317–329. 10.1080/10586458.2005.10128931