
Computing with matrices

At a reductive level, a matrix is a table of numbers that obeys certain algebraic laws. But matrices are pervasive in scientific computation, mainly because they represent linear operations on vectors. Moreover, vectors go far beyond the three-dimensional representations of physical quantities you learned about in calculus.

2.2.1 Notation

We use capital letters in bold to refer to matrices, and lowercase bold letters for vectors. All named vectors in this book are column vectors. The bold symbol $\boldsymbol{0}$ may refer to a vector of all zeros or to a zero matrix, depending on context; we use $0$ as the scalar zero only.

To refer to a specific element of a matrix, we use the uppercase name of the matrix without boldface, as in $A_{24}$ to mean the $(2,4)$ element of $\mathbf{A}$.[1] To refer to an element of a vector, we use just one subscript, as in $x_3$. If you see a boldface character with one or more subscripts, then you know that it is a matrix or vector that belongs to a sequence or indexed collection.

We will frequently need to refer to the individual columns of a matrix as vectors. Our convention is to use a lowercase bold version of the matrix name with a subscript to represent the column number. Thus, $\mathbf{a}_1,\mathbf{a}_2,\ldots,\mathbf{a}_n$ are the columns of the $m\times n$ matrix $\mathbf{A}$. Conversely, whenever we define a sequence of vectors $\mathbf{v}_1,\ldots,\mathbf{v}_p$, we can implicitly consider them to be columns of a matrix $\mathbf{V}$. Sometimes we might write $\mathbf{V}=\bigl[ \mathbf{v}_j \bigr]$ to emphasize the connection.
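The column convention maps directly onto array slicing. The following is a minimal Julia sketch with made-up data; the names `A`, `a2`, `cols`, `v1`, `v2`, `v3`, and `V` are illustrative assumptions, not names taken from the text.

```julia
# Hypothetical example data, chosen only for illustration.
A = [1 2 3; 4 5 6]            # a 2×3 matrix

# Column 2 of A as a vector (the a_2 of the text)
a2 = A[:, 2]

# All columns of A, collected into a list of vectors
cols = collect(eachcol(A))

# Conversely, assemble vectors v_1, v_2, v_3 into the columns of V
v1 = [1.0, 0.0]
v2 = [2.0, -1.0]
v3 = [0.5, 4.0]
V = hcat(v1, v2, v3)          # 2×3 matrix whose jth column is v_j
```

Here `hcat` plays the role of the bracket notation $\bigl[ \mathbf{v}_j \bigr]$: it places the given vectors side by side as columns.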

The notation $\mathbf{A}^T$ is used for the transpose of a matrix, in which the rows and columns switch places. In the case of complex matrices, the complex conjugate[2] becomes involved: the adjoint $\mathbf{A}^*$ is the transpose of $\mathbf{A}$ with every element replaced by its complex conjugate.

If $\mathbf{A}$ is real, then $\mathbf{A}^*=\mathbf{A}^T$.
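As a brief illustration in Julia, `transpose` swaps rows and columns without conjugation, while the postfix `'` operator computes the conjugate transpose that $\mathbf{A}^*$ denotes. The matrices below are made-up examples, and the variable names are assumptions for this sketch.

```julia
# Hypothetical complex matrix for illustration.
A = [1+2im 3-1im; 0+1im 4+0im]

At = transpose(A)     # rows and columns swap; entries are not conjugated
Aadj = A'             # adjoint: transpose plus complex conjugate of each entry

# For a real matrix the two operations give the same result.
R = [1.0 2.0; 3.0 4.0]
same = (R' == transpose(R))
```

Because `'` always conjugates, it is safe to use it as the transpose only when the matrix is known to be real.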

2.2.2 Block matrix expressions

We will often find it useful to break a matrix into separately named pieces. For example, we might write

$$ \mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} & \mathbf{A}_{13} \\ \mathbf{A}_{21} & \mathbf{A}_{22} & \mathbf{A}_{23} \end{bmatrix}, \qquad \mathbf{B} = \begin{bmatrix} \mathbf{B}_1 \\ \mathbf{B}_2 \\ \mathbf{B}_3 \end{bmatrix}. $$

It’s understood that blocks that are on top of one another have the same number of columns, and blocks that are side by side have the same number of rows. Typically, if the blocks all have compatible dimensions, then they can be multiplied as though the blocks were scalars. For instance, continuing with the definitions above, we say that $\mathbf{A}$ is block-$2\times 3$ and $\mathbf{B}$ is block-$3\times 1$, so we can write

$$ \mathbf{A} \mathbf{B} = \begin{bmatrix} \mathbf{A}_{11}\mathbf{B}_1 + \mathbf{A}_{12}\mathbf{B}_2 + \mathbf{A}_{13}\mathbf{B}_3 \\ \mathbf{A}_{21}\mathbf{B}_1 + \mathbf{A}_{22}\mathbf{B}_2 + \mathbf{A}_{23}\mathbf{B}_3 \end{bmatrix}, $$

provided that the individual block products are well-defined. For transposes, we have, for example,

$$ \mathbf{A}^T = \begin{bmatrix} \mathbf{A}_{11}^T & \mathbf{A}_{21}^T \\[2mm] \mathbf{A}_{12}^T & \mathbf{A}_{22}^T \\[2mm] \mathbf{A}_{13}^T & \mathbf{A}_{23}^T \end{bmatrix}. $$
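The block formulas for the product and the transpose can be checked numerically. The sketch below uses random blocks whose sizes are arbitrary assumptions, chosen only so that every block product is defined; all variable names are illustrative.

```julia
# Hypothetical block sizes: blocks in the same block-row share a row count,
# and the column count of A_{ik} matches the row count of B_k.
A11, A12, A13 = rand(2, 3), rand(2, 1), rand(2, 4)
A21, A22, A23 = rand(5, 3), rand(5, 1), rand(5, 4)
B1, B2, B3 = rand(3, 2), rand(1, 2), rand(4, 2)

# Assemble the full matrices from their blocks.
A = [A11 A12 A13; A21 A22 A23]      # 7×8
B = [B1; B2; B3]                    # 8×2

# Product computed block by block, treating the blocks like scalars.
top = A11*B1 + A12*B2 + A13*B3
bottom = A21*B1 + A22*B2 + A23*B3
AB_blocks = [top; bottom]

# Transpose computed block by block: the block positions swap and each
# block is itself transposed.
At_blocks = [transpose(A11) transpose(A21);
             transpose(A12) transpose(A22);
             transpose(A13) transpose(A23)]
```

Comparing `AB_blocks` with `A*B` should agree up to roundoff, and `At_blocks` should equal `transpose(A)` exactly, since no arithmetic is involved in the transpose.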

2.2.3 Vector and matrix basics

Vectors and matrices are integral to scientific computing. All modern languages provide ways to work with them beyond manipulation of individual elements.
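As a small illustration of working with whole arrays rather than individual elements, here is a Julia sketch. The data and variable names are made up for this example.

```julia
# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0]        # a column vector with 3 elements
A = [1 2 3; 4 5 6]         # a 2×3 matrix

y = A * x                  # matrix-vector product, a vector of length 2
z = 2x .+ 1                # scale the vector, then add 1 to every element
s = sum(x)                 # reduce over all elements
dims = size(A)             # (rows, columns)
```

The dot in `.+` requests elementwise application, which is how Julia distinguishes elementwise operations from linear-algebra operations such as `*`.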

2.2.4 Row and column operations

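A critical identity in matrix multiplication is

$$ \mathbf{A} \mathbf{e}_j = \mathbf{a}_j, $$

where $\mathbf{e}_j$ denotes the $j$th column of the identity matrix $\mathbf{I}$. That is, multiplication on the right by $\mathbf{e}_j$ extracts the $j$th column of $\mathbf{A}$.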

Furthermore, the expression

$$ \mathbf{A} \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_3 & \mathbf{e}_5 \end{bmatrix} $$

reproduces columns 1, 3, and 5 of $\mathbf{A}$. An equivalent expression in Julia would be `A[:,1:2:5]`.
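The column identity can be verified directly. This is a minimal sketch with a made-up $5\times 6$ matrix; the names `E`, `e3`, `S`, `col3`, and `cols135` are assumptions introduced here, and `I` is the identity object from Julia's standard `LinearAlgebra` library.

```julia
using LinearAlgebra               # provides the identity object I

# Hypothetical 5×6 matrix, filled column by column with 1, 2, ..., 30.
A = collect(reshape(1:30, 5, 6))

# 6×6 identity matrix; its columns are e_1, ..., e_6.
E = Matrix{Int}(I, 6, 6)

# Multiplying by e_3 extracts column 3.
e3 = E[:, 3]
col3 = A * e3

# Multiplying by [e_1 e_3 e_5] extracts columns 1, 3, and 5 at once.
S = E[:, [1, 3, 5]]
cols135 = A * S
```

In practice the indexing form `A[:,1:2:5]` is preferable, since it avoids the arithmetic of a matrix product; the algebraic form is mainly useful for reasoning.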

We can extend the same idea to rows by using the general identity $(\mathbf{R}\mathbf{S})^T=\mathbf{S}^T\mathbf{R}^T$. Let $\mathbf{B}=\mathbf{A}^T$ have columns $\bigl[ \mathbf{b}_j \bigr]$, and note

$$ (\mathbf{b}_j)^T = (\mathbf{B} \mathbf{e}_j)^T = \mathbf{e}_j^T \mathbf{B}^T = \mathbf{e}_j^T \mathbf{A}. $$

But $\mathbf{e}_j^T$ is the $j$th row of $\mathbf{I}$, and $\mathbf{b}_j^T$ is the transpose of the $j$th column of $\mathbf{B}$, which is the $j$th row of $\mathbf{A}$ by $\mathbf{B}=\mathbf{A}^T$. Thus, multiplication on the left by row $j$ of the identity extracts the $j$th row. Extracting the single element $(i,j)$ from the matrix is, therefore, $\mathbf{e}_i^T \mathbf{A} \mathbf{e}_j$.
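The row and element identities can be checked in the same spirit. The sketch below assumes a made-up $3\times 4$ matrix, so the identity used on the left is $3\times 3$ and the one used on the right is $4\times 4$; the names `Em`, `En`, `row2`, and `a23` are illustrative.

```julia
using LinearAlgebra                 # provides the identity object I

# Hypothetical 3×4 matrix, filled column by column with 1, 2, ..., 12.
A = collect(reshape(1:12, 3, 4))

Em = Matrix{Int}(I, 3, 3)           # identity matching the row count of A
En = Matrix{Int}(I, 4, 4)           # identity matching the column count of A

# e_2^T A: left multiplication by row 2 of the identity extracts row 2 of A.
row2 = Em[:, 2]' * A

# e_2^T A e_3: extracts the single element A[2, 3].
a23 = Em[:, 2]' * A * En[:, 3]
```

Note that `row2` is a row vector (a $1\times 4$ object), whereas the slice `A[2, :]` returns an ordinary vector; `vec(row2)` converts between the two.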

Being able to extract specific rows and columns of a matrix via algebra makes it straightforward to do row- and column-oriented operations, such as linear combinations.
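For instance, a linear combination of columns is a product with a coefficient vector on the right, and a linear combination of rows is a product with a transposed coefficient vector on the left. The following sketch uses a made-up $3\times 3$ matrix and hypothetical coefficient vectors `c` and `r`.

```julia
# Hypothetical matrix for illustration.
A = [1 2 3; 4 5 6; 7 8 9]

# Column combination 2a_1 - a_3, written as A times (2e_1 - e_3).
c = [2, 0, -1]
colcombo = A * c

# Row combination (row 2) - 4(row 1), written as (e_2 - 4e_1)^T times A.
r = [-4, 1, 0]
rowcombo = r' * A
```

The row combination shown is the kind of step that arises when eliminating an entry below the diagonal: the first entry of `rowcombo` is zero.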

2.2.5 Exercises

Footnotes
  1. This aspect of our notation is slightly unusual. More frequently one would see the lowercase $a_{24}$ in this context. We feel that our notation lends more consistency and clarity to expressions with mixed symbols, and it is more like how computer code is written.

  2. The conjugate of a complex number is found by replacing all references to the imaginary unit $i$ by $-i$.