Linear Algebra - [More] Math for Deep Learning

Useful resources to aid in the completion of Ronald T. Kneusel's Math for Deep Learning - Focusing on Linear Algebra Chapters (5 & 6)

Jan 18, 2025

I have recently been working through Ronald T. Kneusel’s Math for Deep Learning to strengthen my understanding of the mathematical fundamentals of neural networks. While a lot of the key topics have of course come up in previous studies, I often find that building from a solid foundation helps me discuss and understand more advanced topics (e.g. when trying to grasp specific NN architectures). Working through Kneusel’s book has given me a lot of time to collate ideas relating to each key chapter, and after spending the last 2-3 months on a linear algebra deep dive, I thought it might be fruitful to share some of the helpful guides and approaches I have found, in case they help you (or, most likely, me in the future).

A note - I am assuming that you have already come across the Strangs of the world, and it goes without saying how influential those works are. This is more of a collation of discrete, smaller resources that helped me grasp a topic within the context of linear algebra - or small exercises that refreshed my understanding (looking at you, eigenvectors). The bulk of this post focuses on Chapter 6, ‘More Linear Algebra’, as the topics within Chapter 5 (the aptly named ‘Linear Algebra’) are arguably relatively straightforward to work through with Kneusel’s guide alone. I will however link to a few core resources at the end, if only to remind myself 2 years down the line.

Triangular and Identity Matrices, Gaussian Elimination & Linear Equations

Kneusel introduces these somewhat subtly, but grasping triangular matrices is what makes Gaussian elimination click, and Gaussian elimination in turn lets you solve systems of linear equations. Why is this useful? Well, on p.143, Kneusel introduces finding the eigenvectors associated with eigenvalues, and makes reference to needing to solve a system of linear equations. To grasp triangular matrices and linear equations, I recommend Sections D and E within the Matrices chapter of Kuldeep Singh’s Engineering Mathematics Through Applications. I also found this YouTube video by Jeffrey Chasnov incredibly clear to follow:

This worksheet also has some questions on Gaussian elimination that increase in difficulty.
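To make the link between triangular matrices and solving linear equations concrete, here is a minimal sketch of Gaussian elimination in NumPy. It is my own illustration rather than anything taken from Kneusel or Singh: the function name `gaussian_elimination` and the example system are assumptions, and I have added partial pivoting (row swaps) for numerical safety. The forward pass reduces the augmented matrix to upper-triangular form, and back substitution then reads off the solution.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b via forward elimination and back substitution."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    # Build the augmented matrix [A | b]
    M = np.hstack([A, b.reshape(-1, 1)])

    for col in range(n):
        # Partial pivoting: bring the largest remaining entry in this column up
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("Matrix is singular (or nearly so)")
        M[[col, pivot]] = M[[pivot, col]]
        # Zero out everything below the pivot, leaving an upper-triangular matrix
        for row in range(col + 1, n):
            factor = M[row, col] / M[col, col]
            M[row, col:] -= factor * M[col, col:]

    # Back substitution, working from the last row upwards
    x = np.zeros(n)
    for row in range(n - 1, -1, -1):
        known = M[row, row + 1:n] @ x[row + 1:]
        x[row] = (M[row, -1] - known) / M[row, row]
    return x

# Illustrative system (an assumption, not from the book)
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = gaussian_elimination(A, b)
print(x)
```

Comparing the result against `np.linalg.solve(A, b)` is a quick way to check your own hand calculations.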

Like triangular matrices, identity matrices are of course somewhat simple to understand, but having clarity on their usage within eigenvalues and eigenvectors certainly helps move Kneusel’s chapter along. For this - I recommend again Kuldeep Singh’s Engineering Mathematics Through Applications, but also HELM’s workbook on eigenvalues and eigenvectors.

Determinants

I personally found Kneusel’s determinant calculations for 3x3 matrices somewhat trickier to follow (p.134-p.137), so I leaned much more on Singh’s description and worked examples for 3x3 matrices on p.522. Calculating the minors and cofactors is really well explained, and there are numerous examples to work through. Of course, calculating a 2x2 determinant is very straightforward, but the 3x3 examples shown by Singh really helped me grasp the calculations.
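If it helps to see the minor and cofactor steps spelled out, here is a small sketch of cofactor expansion along the first row. The helper names `minor` and `det_cofactor` and the example matrix are my own assumptions for illustration, not Singh’s or Kneusel’s notation.

```python
import numpy as np

def minor(A, i, j):
    """Return A with row i and column j removed."""
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def det_cofactor(A):
    """Determinant via cofactor expansion along the first row."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    if n == 2:
        # The familiar ad - bc rule
        return A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    total = 0.0
    for j in range(n):
        # Cofactor = sign * determinant of the minor
        cofactor = (-1) ** j * det_cofactor(minor(A, 0, j))
        total += A[0, j] * cofactor
    return total

# Illustrative 3x3 matrix (an assumption, not from either book)
A = np.array([[1, 2, 3],
              [0, 4, 5],
              [1, 0, 6]])
d = det_cofactor(A)
print(d)  # 1*(24 - 0) - 2*(0 - 5) + 3*(0 - 4) = 22
```

This recursion is fine for small matrices worked by hand, but it scales very badly, so `np.linalg.det` is the sensible choice for anything larger.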

Inverses

Calculating the inverse of a matrix A is essentially A^-1 = (1/det(A)) * adj(A), which requires det(A) to be non-zero, but in all honesty, it’s a fair few little steps when doing it manually for anything bigger than a 2x2 matrix. Singh again introduces all of these steps really well (for example, adj(A) being the transpose of the cofactor matrix of A), with lots of small examples. Pages 528 - 534 give lots of exercises to work through, drawn from different disciplines too.
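As a rough sketch of those steps in code, the snippet below builds the cofactor matrix, transposes it to get adj(A), and divides by det(A). The function names are hypothetical, and to keep it short I use `np.linalg.det` for the determinants of the minors rather than expanding them by hand.

```python
import numpy as np

def cofactor_matrix(A):
    """Matrix of cofactors C[i, j] = (-1)^(i+j) * det(minor of A at i, j)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.zeros_like(A)
    for i in range(n):
        for j in range(n):
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C

def inverse_via_adjugate(A):
    """A^-1 = adj(A) / det(A), where adj(A) is the transposed cofactor matrix."""
    A = np.asarray(A, dtype=float)
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("Matrix is singular, so it has no inverse")
    adj_A = cofactor_matrix(A).T
    return adj_A / det_A

# Illustrative matrix (an assumption); its determinant is 22
A = np.array([[1, 2, 3],
              [0, 4, 5],
              [1, 0, 6]])
A_inv = inverse_via_adjugate(A)
print(A @ A_inv)  # should be close to the identity matrix
```

Multiplying A by the result and checking for the identity matrix is also a handy sanity check when doing the exercises by hand.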

Eigenvalues and Eigenvectors

I would say I am not a complete slouch when it comes to working through eigenvalues and eigenvectors, having known about their usage in Principal Component Analysis and also worked with them for exams in quantum computing modules. However, it had easily been a few years since I’d calculated them by hand, and once you’re in the rhythm of det(A - λI) = 0 it’s obviously fine - but ramping back up honestly took me a little bit of time. As such, if you’re also looking to get your brain back in gear, I recommend the aforementioned HELM workbook on eigenvalues and eigenvectors. There are a lot of great little exercises there. For a fun visual example, Maths is Fun is fantastic. Again, for the section on systems of equations, the linear equation resources given above will help you work through p.143 on eigenvectors.
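For anyone wanting to check their hand calculations, here is a minimal sketch of the det(A - λI) = 0 routine for a 2x2 matrix. The example matrix is my own choice, and the code assumes real eigenvalues and a non-zero top-right entry b (both hold for this symmetric example). For a 2x2 matrix the characteristic polynomial is λ^2 - trace(A)λ + det(A) = 0, and an eigenvector for each λ comes from solving (A - λI)v = 0.

```python
import numpy as np

# Illustrative symmetric matrix (an assumption, not from the book)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
a, b = A[0]
c, d = A[1]

# Characteristic polynomial: lambda^2 - trace*lambda + det = 0
trace = a + d
det = a * d - b * c
disc = trace ** 2 - 4 * det  # assumed non-negative (real eigenvalues)
eigenvalues = [(trace + np.sqrt(disc)) / 2, (trace - np.sqrt(disc)) / 2]

# For each eigenvalue, solve (A - lambda*I) v = 0.
# With b != 0, the first row (a - lambda)*v1 + b*v2 = 0 gives v = [b, lambda - a].
eigenvectors = []
for lam in eigenvalues:
    v = np.array([b, lam - a])
    eigenvectors.append(v / np.linalg.norm(v))  # normalise to unit length

for lam, v in zip(eigenvalues, eigenvectors):
    print(lam, v, np.allclose(A @ v, lam * v))
```

The final check, A v = λ v, is the defining property and is worth running on every pair you compute by hand. `np.linalg.eig` does all of this for you, but may return the eigenvectors with a different sign or order.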

Vector Norms and Distance Metrics

I think Kneusel’s section on the L1 and L2 norms, which underpin Lasso and Ridge regression respectively, is really great and straightforward to follow. I did find though that pairing it with the Datacamp grounding on their usage within machine learning helped refresh the pros/cons of each when it comes to practical applications. This BuiltIn guide was also quite a good practical refresher.

Similarly, if you’re interested in definitions of the L-infinity norm, I really liked this not-too-rigorous way of thinking about it in the second reply.
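A quick way to build intuition for all three norms is to compute them directly and then watch the general p-norm creep towards the largest absolute entry as p grows. The vector and the helper name `p_norm` below are my own assumptions for illustration.

```python
import numpy as np

def p_norm(x, p):
    """General p-norm: (sum of |x_i|^p)^(1/p)."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

# Illustrative vector (an assumption)
x = np.array([3.0, -4.0, 1.0])

l1 = np.sum(np.abs(x))        # sum of absolute values
l2 = np.sqrt(np.sum(x ** 2))  # Euclidean length
linf = np.max(np.abs(x))      # largest absolute entry

print(l1, l2, linf)

# As p increases, the largest entry dominates the sum,
# so the p-norm approaches the L-infinity norm.
for p in (1, 2, 4, 16, 64):
    print(p, p_norm(x, p))
```

The same values come out of `np.linalg.norm(x, ord=1)`, `np.linalg.norm(x)` and `np.linalg.norm(x, ord=np.inf)`.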

Covariance Matrices, Mahalanobis Distance and Kullback-Leibler Divergence

I spent a fair amount of time working through covariance matrices, as grasping them made applying the Mahalanobis distance calculations particularly light work. For some intuition behind the topic, I really recommend Ritvikmath’s video:

It is also worth taking the time to sit down and create your own matrices in NumPy or by hand to work through the calculations. Once this is done, applying the Mahalanobis distance is pretty straightforward, and a good video explainer for it is available here:
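As a companion to working through it by hand, here is a minimal NumPy sketch of the covariance matrix and the Mahalanobis distance. The synthetic dataset, the random seed and the function name `mahalanobis` are all assumptions made for illustration. The covariance is computed from the centred data with the n - 1 (sample) denominator, and the distance is sqrt((x - μ)^T Σ^-1 (x - μ)).

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: 200 samples of 2 correlated features
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[2.0, 1.2], [1.2, 1.0]],
                            size=200)

# Covariance matrix by hand: centre the data, then X^T X / (n - 1)
mu = X.mean(axis=0)
centered = X - mu
cov = centered.T @ centered / (X.shape[0] - 1)

def mahalanobis(x, mu, cov):
    """Distance from x to mu, scaled by the covariance of the data."""
    diff = x - mu
    # Solve cov @ y = diff instead of explicitly inverting cov
    return np.sqrt(diff @ np.linalg.solve(cov, diff))

point = np.array([1.0, 1.0])
d_mahalanobis = mahalanobis(point, mu, cov)
d_euclidean = np.linalg.norm(point - mu)
print(cov)
print(d_mahalanobis, d_euclidean)
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance, which is a nice check that the function behaves as expected.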

Again, the section on Kullback-Leibler divergence is really strong from Kneusel, but if you’re looking for a really, really great way of seeing how KL isn’t symmetric, then absolute props to this Stats Stack Exchange post. And a reminder for future me: don't assume the log definition!
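To see the asymmetry numerically, here is a small sketch for discrete distributions. One way to read the reminder about the log is that the base changes the units (nats for the natural log, bits for base 2), so the function takes the base as a parameter. The function name, the `base` argument and the two example distributions are my own assumptions, and the code assumes q is non-zero wherever p is non-zero.

```python
import numpy as np

def kl_divergence(p, q, base=np.e):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing
    nats = np.sum(p[mask] * np.log(p[mask] / q[mask]))
    return nats / np.log(base)  # convert from nats to the requested base

# Illustrative distributions (assumptions)
p = np.array([0.5, 0.4, 0.1])
q = np.array([0.8, 0.1, 0.1])

kl_pq = kl_divergence(p, q)               # in nats
kl_qp = kl_divergence(q, p)               # in nats, and a different number
kl_pq_bits = kl_divergence(p, q, base=2)  # same divergence, in bits

print(kl_pq, kl_qp, kl_pq_bits)
```

Swapping the arguments changes which distribution supplies the weights in the sum, which is why the two directions disagree.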

Principal Component Analysis

Kneusel’s chapter on More Linear Algebra really builds up the theory behind PCA, but I actually found this guide pretty cool for more of the intuition behind why we use it:

Similarly, there’s actually a fun ‘explaining PCA to the family’ post here, which is the gold standard for explaining a topic clearly.
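To tie the earlier sections together, here is a minimal sketch of PCA as an eigendecomposition of the covariance matrix. It is one common formulation rather than necessarily the exact procedure in the book, and the synthetic data, the seed and the choice of k = 2 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical dataset: 300 samples of 3 correlated features
true_cov = [[3.0, 1.5, 0.5],
            [1.5, 2.0, 0.3],
            [0.5, 0.3, 0.5]]
X = rng.multivariate_normal([0.0, 0.0, 0.0], true_cov, size=300)

# 1. Centre the data
Xc = X - X.mean(axis=0)

# 2. Covariance matrix of the features
cov = np.cov(Xc, rowvar=False)

# 3. Eigendecomposition (eigh is intended for symmetric matrices)
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort so the direction with the largest variance comes first
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5. Project onto the top k principal components
k = 2
components = eigvecs[:, :k]
X_reduced = Xc @ components

explained = eigvals / eigvals.sum()
print(X_reduced.shape)
print(explained)
```

The eigenvalues are the variances along each principal direction, so `explained` shows how much of the total variance each component captures.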

Vectors & Matrices Wrap-Up

As promised, there are also a few resources here that I found useful for Chapter 5 - Linear Algebra. Dropping them below in case they are of use to you, with some small descriptions:

Vector workbook from HELM - Lots of great refresher material on vectors.

Multiplying matrices and vectors - A very clear guide with examples.

Tensorphobia and Outer Products - A great explainer.

Vector projection - A really nice example using the idea of shining a light bulb and the idea of ‘finding the shadow’.

Orthogonal Matrices - Some worked examples and a test for whether a matrix is orthogonal or not.
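A few of the Chapter 5 ideas above are quick to try out in NumPy. The snippet below is just a sketch with made-up vectors: it projects one vector onto another (the ‘shadow’), forms an outer product, and checks whether a matrix is orthogonal by testing Q^T Q = I. The rotation matrix is my own choice of a matrix known to be orthogonal.

```python
import numpy as np

# Illustrative vectors (assumptions)
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# Vector projection of a onto b: (a.b / b.b) * b, the 'shadow' of a on b
proj = (a @ b) / (b @ b) * b
residual = a - proj  # what is left over is perpendicular to b

# Outer product: every entry of a times every entry of b
outer = np.outer(a, b)

# A rotation matrix is a standard example of an orthogonal matrix
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))

print(proj, residual @ b)
print(outer)
print(is_orthogonal)
```

For an orthogonal matrix the transpose is also the inverse, so `Q.T @ Q` and `Q @ Q.T` should both give the identity.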

Holly Emblem
