G-Engine #8: Matrices

Matrices are vital tools for 3D rendering. Graphics libraries expect you to use matrices to represent positions, rotations, and scales of 3D objects. Furthermore, matrices provide a convenient/effective mechanism for representing hierarchies of 3D objects and coordinate systems.

This post will briefly explain what matrices are, explain commonly used operations for 3D game development, and provide tips for writing matrix classes for your game engine.

What is a Matrix?

A matrix is a 2D collection of numbers. A defining characteristic of a matrix is its size: the number of rows and columns it has.

Size is expressed using “row x column” form - for example, a 2x4 matrix has 2 rows and 4 columns:

| 0 8 9 -2 |
| 3 2 4 10 |

In a 3D game engine, we’ll commonly deal with 3x3 and 4x4 matrices. Here’s an example of a 3x3 matrix:

| 5 3 2 |
| 6 1 0 |
| 4 3 2 |

And here’s an example of a 4x4 matrix:

| 2 6 0 0 |
| 4 4 8 3 |
| 2 0 1 1 |
| 4 9 5 7 |

Referencing Elements

Each number in a matrix is called an element (or an entry). Elements are identified using row and column indexes, where element (0, 0) is in the top-left corner. Rows proceed downward and columns proceed to the right.

For example, in the above 4x4 matrix, the number 8 is at element (1, 2) and the number 9 is at element (3, 1).

Here’s a 4x4 matrix showing each element’s row/column indexes:

| (0, 0) (0, 1) (0, 2) (0, 3) |
| (1, 0) (1, 1) (1, 2) (1, 3) |
| (2, 0) (2, 1) (2, 2) (2, 3) |
| (3, 0) (3, 1) (3, 2) (3, 3) |

Be careful of this “gotcha”: the notation to identify each matrix element may seem similar to cartesian coordinates. However, it is exactly opposite: the first number represents the row (“y axis”) and the second represents the column (“x axis”).

Matrix Terminology

Numerous terms exist to describe matrices with various attributes: particular sizes, particular numbers as elements, etc. Here, I’ll quickly highlight a few of the most common ones.

A matrix with the same number of rows and columns is called a square matrix. A 4x4 matrix is square; a 2x5 is not.

The diagonal of a matrix are the entries where the row and column are equal, such as (0, 0) or (1, 1) or (2, 2). A diagonal matrix only has non-zero entries on the diagonal of the matrix. Here’s an example of a diagonal matrix:

| 5 0 0 |
| 0 1 0 |
| 0 0 2 |

The trace of a matrix is the sum of the elements along the diagonal. The trace of the above matrix is 8.

The Identity matrix is a diagonal matrix with only the number 1 present. This matrix has a special purpose when used in multiplication, which we’ll describe later. Here is a 3x3 Identity matrix:

| 1 0 0 |
| 0 1 0 |
| 0 0 1 |

Why Matrices?

Perhaps you learned about matrices in high school and wondered what practical application they’d have…well, 3D graphics, as it turns out! They are vital for expressing certain data to 3D graphics systems.

In my vector post, we briefy discussed how a 3D coordinate system is defined by an origin and three axes (x/y/z). We can actually compactly and efficiently represent a 3D coordinate system’s origin and three axes in matrix form:

// First column: x-axis direction vector
// Second column: y-axis direction vector
// Third column: z-axis direction vector
// Fourth column: origin position vector
| xx yx zx tx |
| xy yy zy ty |
| xz yz zz tz |
| 0  0  0  1  |

We’ll discuss this more in a future post - I bring it up here just to highlight why matrices are used for 3D graphics data.

Using just one coordinate system, we can position all the objects in our game world. But if we create many coordinate systems, and if we nest them within one another, we can create hierarchies of objects (sometimes referred to as object parenting or a transform hierarchy).

A matrix representing a coordinate system also represents a transformation from that coordinate system to the parent coodinate system. We can use matrix multiplication to transform vectors and points between coordinate systems, which is helpful both for 3D rendering and for gameplay programming.

However, we’re getting a little ahead of ourselves! Before we can do cool stuff with matrices in our game engine, we need to know how to use matrices, and we need to write classes so we can use matrices in code.

Matrix Operations

There are just a few matrix operations that we need to worry about, but they are fairly complex. Here, I’ll aim to explain how these operations work at a surface level.

Matrix Addition and Subtraction

Matrices can be added and subtracted from one another - doing so simply adds or subtracts corresponding elements in the output matrix. Here’s a simply addition example:

| 4 6 |   | 2 2 |   | 6 8 |
| 2 5 | + | 7 1 | = | 9 6 |

You can only add or subtract matrices that are the same size.

Matrix addition and subtraction are not super useful in a 3D game engine. The real magic comes from matrix multiplication.

Matrix Multiplication

Multiplying matrices is not so simple. Rather than multiplying corresponding elements, you make use of the dot product to combine rows/columns of the two input matrices to generate elements in the output matrix.

We discussed the dot product previously, which allows us to take two vectors and generate a single number as an output. We can treat the individual rows or columns of a matrix as though they are vectors.

For example, to calculate entry (2, 3) of the output matrix, you take row 2 of the first input matrix and column 3 of the second input matrix. You treat each as vectors and perform the dot product. The result of the dot product is just a number, which we put at entry (2, 3) in the result matrix.

Here’s an even more concrete example.

| 4 1 2 |   | 3 2 3 |
| 3 9 5 | * | 9 8 1 | = ?
| 1 3 2 |   | 6 2 4 |

To calculate entry (0, 0) in the result matrix, we must take row 0 from the first matrix and column 0 from the second matrix and perform the dot product:

(4, 1, 2) * (3, 9, 6) = (4 * 3) + (1 * 9) + (2 * 6) = 12 + 9 + 12 = 33

So, the value at entry (0, 0) in our result matrix will be 33.

If we perform this process for EVERY result matrix entry, we end up with this result:

| 33  20 21 |
| 120 88 38 |
| 42  30 14 |

Size Restrictions

The above multiplication process is shown between two matrices of the same size. Can you multiply a 2x1 matrix and a 6x5 matrix? What size would the result matrix be?

There is a restriction on what matrices can be multiplied together:

The number of columns in the first input matrix must be equal to the number of rows in the second input matrix.

You can also derive the size of the resulting matrix:

The result matrix will always have the number of rows in the first input matrix and the number of columns in the second input matrix.

Here are several examples to illustrate this:

4x4 and 4x4: CAN multiply, output is 4x4
4x4 and 3x3: CAN NOT multiply
2x3 and 3x2: CAN multiply, output is 2x2
3x2 and 2x3: CAN multiply, output is 3x3
1x4 and 4x4: CAN multiply, output is 1x4
4x4 and 1x4: CAN NOT multiply
4x4 and 4x1: CAN multiply, output is 4x1

An easy way to remember this is the “inner/outer” rule, which you can see in the above examples: write the sizes of the two input matrices; if the inner numbers match, you can multiply; the output size matches the outer numbers.

Notice the last three examples: which matrix is on the left-hand side vs. right-hand side CAN have an effect on whether you can multiply! And flipping the rows/columns can make a previously impossible multiplication possible.

Identity Multiplication

The Identity matrix is special because it acts like the number 1 in normal multiplication: multiplying by the identity matrix has no effect!

| 5 2 |   | 1 0 |   | 5 2 |
| 4 9 | * | 0 1 | = | 4 9 |

Transposing

Transposing a matrix simply involves converting all rows to columns or columns to rows. Every matrix has a transpose. If you transpose a transpose, you get…the original matrix back!

For example, these two matrices are transposes of one another:

| 1 2 3 |  | 1 4 7 |
| 4 5 6 |  | 2 5 8 |
| 7 8 9 |  | 3 6 9 |

Vector Multiplication

The ability to multiply a vector by a matrix ends up being one of the most vital tools in our toolbox because it allows us to “transform” a vector (i.e. scale, rotate, or translate it).

When multiplying a vector by a matrix, we “pretend” that the vector is a matrix. For example, a 4D vector can be treated as either a 1x4 or a 4x1 matrix. These are referred to as row vectors and column vectors respectively.

After representing the vector as a matrix, we can then multiply the two using normal matrix multiplication. The output is a matrix, but we can interpret it as a vector.

For example, consider the 4D vector (3, 5, 8, 1). We can multiply it by a matrix in two possible ways.

Using a row vector:

              | 1 0 0 0 |
              | 0 1 0 0 |
| 3 5 7 1 | * | 0 0 1 0 | = | 5 7 9 1 |
              | 2 2 2 1 |

Using a column vector:

| 1 0 0 2 |   | 3 |
| 0 1 0 2 |   | 5 |
| 0 0 1 2 | * | 7 | = | 5 7 9 1 |
| 0 0 0 1 |   | 1 |

A couple interesting things to note about these operations:

  • Due to matrix multiplication size restrictions, the vector must be on the left side when using a row vector and it must be on the right side when using a column vector.
  • If the matrix is transposed when switching from row to column vector (or vice-versa), the result is exactly the same.

The 4x4 matrix used above is an example of a simple transformation matrix that moves a point by (2, 2, 2). Look at the input and output vectors, and you’ll see that is indeed the case: the input was (3, 5, 7) and the output was (5, 7, 9). Again, we’ll talk more about transformation matrices in a future post.

Inversion

Let’s say you multiply a vector v1 by matrix M to get vector v2. Does some matrix exist that would allow you to undo that operation?

v1 * M = v2 // Transforms v1 to v2
v2 * ? = v1 // Transforms v2 back to v1

This is called the inverse matrix of M, which undoes, or performs the opposite operations, of the original matrix. To calculate the inverse matrix, we must invert M.

Since an inverse matrix undoes the original matrix, the result of multiplying the two is the Identity matrix!

Not every matrix has an inverse. There are a few rules:

  • Only square matrices have inverses.
  • A matrix only has an inverse if its determinant is non-zero.

Calculating the determinant is a fairly complex process that gets more and more complex as the matrix gets larger. Fortunately, for game development purposes, we really only need to be able to calculate the determinant for 3D, 4D (and maybe 2D) matrices. Explaining the logic for calculating the determinant is best done with pictures, so I’ll delegate that responsibility to an existing resource (such as this page)

Actually calculating the inverse is also not a simple task! There are a couple options. First, there is a “by hand” method of calculating the inverse known as Gauss-Jordan Elimination, but this method is not ideal for writing code. Another option exists, but it requires us to define a few more terms (I know, it’s complicated).

Let’s say you have a 3x3 matrix M, and you remove row i and column j (thereby making it a 2x2 matrix). If you calculate the determinant of the 2x2 matrix, that value is called the cofactor of element (i,j) for the 3x3 matrix. If you do this for every row/column combination you end up with a cofactor matrix (where every value in the original matrix has been replaced by the associated cofactor value).

Now, furthermore, if you transpose the cofactor matrix, you end up with a matrix called the adjugate of M.

Given all that, the inverse of a matrix can be calculated as the adjugate matrix divided by the determinant! Whew! The introduction of the above terminology is probably a bit head-spinning at first. But it defines a step-by-step process we can use to calculate the inverse matrix.

This calculation also reinforces the fact that an inverse only exists if the determinant is non-zero. Since we must divide by the determinant to calculate the inverse, and we can’t divide by zero, it’s not possible to calculate the inverse of a matrix with a zero determinant.

Feel free to check out the inverse functions in the G-Engine source code. These make use of some optimizations covered in the book “Foundations of Game Engine Development, Volume 1: Mathematics” by Eric Lengyel, which I’d highly recommend checking out.

Writing Matrix Classes

Fortunately, a 3D game engine only really needs to support 3x3 and 4x4 matrices to be effective. Technically, a 3D game engine also makes use of single-row and single-column matrices (such as 1x3, 1x4, 3x1, 4x1). However, we can simply use our existing Vector3 and Vector4 classes for those!

A matrix class should support at least the following operations:

  • Construction, Copy Construction, and Assignment
  • Equality checks
  • Get/set elements via row/column notation
  • Get/set rows and columns as Vectors
  • Matrix multiplication
  • Vector multiplication (using row and column vectors)
  • Calculate transpose matrix
  • Calculate inverse matrix

The matrix classes used by G-Engine can be found here:

Next, I’ll highlight some considerations when writing matrix classes.

Naming

I decided to name the 3x3 and 4x4 classes Matrix3 and Matrix4 respectively. I’ve also seen Matrix4x4 or Matrix4D or even Mat4 used in some engines. There’s no right answer here, but our goal is to make the name clear without being overly verbose.

Data Layout

In code, our matrix elements can be representd as a float array. A 3x3 matrix has 9 values and a 4x4 matrix has 16 values.

A 2D array seems like a natural choice, but a 1D array can also be used. For example, consider the 16 values for a 4x4 array:

// 2D array approach
float mVals[4][4];

// 1D array approach
float mVals[16];

Both approaches lead to the same contiguous memory allocation. We can visualize how the 2D array elements correlate to 1D elements:

[0][0] [0][1] [0][2] [0][3] [1][0] [1][1] // contiguous 2D array indexes
[ 00 ] [ 01 ] [ 02 ] [ 03 ] [ 04 ] [ 05 ] // contiguous 1D array indexes

As you can see, element mVals[0][3] is right next to mVals[1][0] in memory.

Because these are fixed-size arrays, there is no overhead or performance loss that I’m aware of by using a 2D array over a 1D array.

One benefit of the 2D approach is not having to calculate indexes manually - the compiler can do it for you:

// Access element (3, 2) of a 2D array:
mVals[3][2] = 5;

// Access element (3, 2) of a 1D array:
mVals[3 * 4 + 2] = 5;

A 2D array may also appear in the debugger in a more easily readable format. For this reason, I’ve actually seen the 2D approach more frequently recommended. You may notice that G-Engine uses the 1D approach currently - I’m considering changing that.

As with vectors, a matrix class should ONLY contain the array data member - no other data should be stored inside the class! You’ll be using matrices a lot, and they need to be as lean and efficient as possible.

Column-Major and Row-Major Orderings

I think we can all agree that the first element in our C++ array (mVals[0] or mVals[0][0]) should correlate to element (0, 0) (upper-left corner) of the matrix. (Right?)

However, what should be stored at the second element of our C++ array (mVals[1] or mVals[0][1])? We could put elements (0, 1) or (1, 0) there.

In other words, should we store matrix columns or matrix rows contiguously in memory? These two options are referred to as column-major and row-major orderings respectively.

For example, here are how the elements of a 3x3 matrix correlate to array indexes in column-major ordering:

// Indices for 1D (left) and 2D (right) arrays using column-major ordering:
| 0 3 6 |  | 0,0 1,0 2,0 |
| 1 4 7 |  | 0,1 1,1 2,1 |
| 2 5 8 |  | 0,2 1,2 2,2 |

And here’s how they correlate when using row-major ordering:

// Indices for 1D (left) and 2D (right) arrays using row-major ordering:
| 0 1 2 |  | 0,0 0,1 0,2 |
| 3 4 5 |  | 1,0 1,1 1,2 |
| 6 7 8 |  | 2,0 2,1 2,2 |

When using a 1D array, the index that correlates to each matrix element changes depending on the convention you use. For example, element (1, 2) is index [7] in column-major and index [5] in row-major. When using a 2D array, the column-major approach requires you to swap the two numbers while row-major keeps them identical. For example, element (1, 2) is index [2][1] in column-major and index [1][2] in row-major.

OK, great! Now why would you choose one ordering or the other? WHO CARES!?

The main reason to choose one or the other relates to how valuable it is to use contiguous values in memory. For example, let’s say that you want to be able to extract a column from your matrix and interpret it as a Vector3. How you do that will depend on your data ordering:

// Let's assume we have a 3x3 matrix with values stored in a 1D array.

// If data is column-major, we can simply reinterpret memory directly.
// This is more efficent (no copies).
Vector3& GetFirstColumn() { return reinterpret_cast<Vector3&>(mVals[0]); }

// If data is row-major, we can't do that, so we must create a new vector.
Vector3 GetFirstColumn() { return Vector3(mVals[0], mVals[4], mVals[8]); }

Storing data in column-major order makes it more efficient to extract columns, and storing data in row-major order makes it more efficient to extract rows.

Later on, when we use matrices to represent coordinate system transforms, it will be helpful to be able to extract the rows or columns of a matrix efficiently. Whether rows or columns need to be extracted will depend on the order the data is stored in the matrix. And THAT depends on whether you choose to use column vectors or row vectors!

In the Vector Multiplication section, we demonstrated how multiplication works when using column vectors and row vectors. The data in the matrix is transposed depending on which you use:

  • When you use row vectors, the rows of the matrix are meaningful and convenient to be able to extract efficiently.
  • When you use column vectors, the columns of the matrix are meaningful and convenient to be able to extract efficiently.

In G-Engine, I chose to use “column vectors” as the convention when multiplying vectors by matrices. Therefore, it will be convenient to be able to extract columns from my matrices. Therefore, I chose to use column-major ordering for my matrix data.

Matrix Construction

To construct a matrix, you probably want a default constructor that just uses the default implementation. This is most efficient, but it’s important to realize that, since we don’t set the initial values of our matrix values array, it’ll contain garbage unless you assign the values. We’ll usually do this immediately after construction, so this is OK.

Another common constructor allows you to set each value of the matrix at construction time. It’s best to keep this straightforward: just provide an argument for each entry, and lay out the arguments in a way that visually makes sense, regardless of whether you are storing column-major or row-major under the hood.

Matrix3 mat(v00, v01, v02,
            v10, v11, v12,
            v20, v21, v22);

You may want to have a constructor that takes in a 2D array of values. If you do this, I’d still recommend that you allow values to be provided in a way that visually makes sense, regardless of internal storage ordering. But in my opinion, it’s not worth having a 2D array constructor for a matrix, especially if you use column-major ordering, because it gets confusing: should each “row” of the 2D array correspond to a column? Avoid the confusion and use the previous constructor instead.

Don’t Inherit

As with vectors, it may seem like Matrix4 could inherit from Matrix3, but that is not actually a good idea. Despite having similar data and similar functions, each class is unique enough to warrant not doing any sort of inheritance. We also do not want the overhead of virtual functions introduced into our matrix classes, which should be very lightweight!

Private Data Members

Unlike vectors, I think it is a good idea to keep matrix data private and make use of operator overloads and other accessor functions to access the data in meaningful ways.

Elements can be retrieved or modified using row/column notation by overloading C++ operator(). Rows or columns can be interpreted as Vector3 instances and returned by overloading operator[] or with functions like GetRows or GetColumns.

One nice thing about keeping matrix data private is that outside code can use the matrix class without having to worry about whether you used a 1D or 2D array or whether the data is in column-major or row-major ordering. You can even change the array dimension or ordering at a later time, and the code using your matrix will probably not have to change.

Graphics libraries are an exception to this rule and may need to access raw array data from the matrix class - this can be accomplished by providing a function that returns a pointer to the internal array. This is accomplished in G-Engine by providing an implicit conversion operator for float*.

Careful Naming to Signal Usage

As with vectors, be careful to name member functions in a way that signals whether the function modifies itself or returns a new object.

A good example might be matrix inversion functions: calling a function Inverse can imply that it calculates the inverse and returns it as a new matrix; calling a function Invert implies that the function actually modifies the matrix itself, rather than creating a new one.

Use of const suffixes on member functions can also help enforce this.

void Invert(); // Inverts the current matrix, modifying it directly
Matrix4 Inverse() const; // Calculates and returns a new matrix, can't modify instance

Another possible way to signal this is with static and non-static functions. G-Engine currently does this with Transpose functions:

void Transpose(); // Transposes the current matrix, modifying it directly.
static Matrix4 Transpose(const Matrix4& matrix); // Calculates transpose and returns as a new matrix.

Matrix I/O

As with vectors, it can be helpful to output a Matrix to the log for debugging or data recording. You can overload operator<< as a convenient way to do this.

Pre-Defined Matrices

As with vectors, it can be helpful to provide static access to some pre-defined matrices. The main one that’s helpful for our needs is the Identity matrix. Another potentially meaningful one is Zero.

Conclusion

With functioning Vector and Matrix classes, we are getting closer to having the ability to position objects and our camera’s point of view in a 3D virtual world.

Next time, we’ll dive into quaternions, which are the final foundational math class we’ll need before we can move on to fully representing coordinate systems and transforms for objects in our 3D world.

This post provides a high-level overview of matrices with a focus on what you need to know to write these classes for your own engine. If you want a more comprehensive and thorough explanation of this and many other math topics, I’d suggest one of these resources:

G-Engine  C++  Math 
comments powered by Disqus