Linear algebra is a field of mathematics that could be called the mathematics of data. It is undeniably a pillar of the field of machine learning, and many recommend it as a prerequisite subject to study prior to getting started in machine learning. This is misleading advice, as linear algebra makes more sense to a practitioner once they have the context of the applied machine learning process in which to interpret it. In this chapter, you will discover why machine learning practitioners should study linear algebra to improve their skills and capabilities as practitioners. After reading this chapter, you will know:

- Why you should not rush into studying linear algebra if you are just getting started in applied machine learning.
- Five reasons why a deeper understanding of linear algebra is required for intermediate machine learning practitioners.
Before we go through the reasons that you should learn linear algebra, let's start by taking a quick look at the reason why you should not. I think you should not study linear algebra if you are just getting started with applied machine learning.
I recommend a breadth-first approach to getting started in applied machine learning. I call this approach a results-first approach. It is where you start by learning and practicing the steps for working through a predictive modeling problem end-to-end (e.g. how to get results) with a tool (such as scikit-learn and Pandas in Python). This process then provides the skeleton and context for progressively deepening your knowledge, such as how algorithms work and eventually the math that underlies them. After you know how to work through a predictive modeling problem, let’s look at why you should deepen your understanding of linear algebra.
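To make the results-first process concrete, here is a minimal sketch of one end-to-end pass through a predictive modeling problem. The dataset and model choices are illustrative assumptions, not prescriptions:

```python
# A minimal sketch of the end-to-end, results-first process
# (toy dataset and model; both are illustrative choices)
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 1. Load data
X, y = load_iris(return_X_y=True)

# 2. Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# 3. Fit a model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 4. Evaluate: get a result
predictions = model.predict(X_test)
print(accuracy_score(y_test, predictions))
```

Each step can later be deepened: the math behind the model, the statistics behind the evaluation, and so on.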
You need to be able to read and write vector and matrix notation. Algorithms are described in books, in papers, and on websites using vector and matrix notation. Linear algebra is the mathematics of data, and its notation allows you to describe operations on data precisely with specific operators. You need to be able to read and write this notation. This skill will allow you to:

- Read descriptions of existing algorithms in textbooks, papers, and documentation.
- Write descriptions of your own methods for other practitioners.
- Implement the techniques you read about in code.
Further, programming languages such as Python offer efficient ways of implementing linear algebra notation directly. An understanding of the notation and how it is realized in your language or library will allow for shorter and perhaps more efficient implementations of machine learning algorithms.
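For example, a matrix-vector expression like y = Ax + b translates almost symbol-for-symbol into NumPy (used here as an illustrative choice of library):

```python
import numpy as np

# A 2x3 matrix A, a 3-element vector x, and a 2-element vector b
A = np.array([[1, 2, 3],
              [4, 5, 6]])
x = np.array([0.5, -1.0, 2.0])
b = np.array([1.0, 1.0])

# The textbook notation y = Ax + b maps almost directly to code
y = A.dot(x) + b
print(y)  # [ 5.5 10. ]
```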
In partnership with the notation of linear algebra are the arithmetic operations performed on it. You need to know how to add, subtract, and multiply scalars, vectors, and matrices. A challenge for newcomers to the field is that operations such as matrix multiplication and tensor multiplication are not implemented as the direct element-wise multiplication of these structures, and at first glance they appear nonintuitive.
Again, most if not all of these operations are implemented efficiently and provided via API calls in modern linear algebra libraries. An understanding of how vector and matrix operations are implemented is required as a part of being able to effectively read and write matrix notation.
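As a small illustration of why these operations trip up newcomers, here is a sketch in NumPy (an assumed library choice) contrasting element-wise multiplication with true matrix multiplication:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Element-wise (Hadamard) product: each pair of elements multiplied independently
print(A * B)
# [[ 5 12]
#  [21 32]]

# Matrix multiplication: each result element is a row of A dotted with a column of B
print(A @ B)
# [[19 22]
#  [43 50]]
```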
You must learn linear algebra in order to be able to learn statistics, especially multivariate statistics. Statistics and data analysis form another pillar of mathematics that supports machine learning. They are primarily concerned with describing and understanding data. As the mathematics of data, linear algebra has left its fingerprint on many related fields of mathematics, including statistics.
In order to be able to read and interpret statistics, you must learn the notation and operations of linear algebra. Modern statistics uses both the notation and the tools of linear algebra to describe its techniques, from vectors for the means and variances of data to covariance matrices that describe the relationships between multiple Gaussian variables. Some results of the collaboration between the two fields are also staple machine learning methods, such as Principal Component Analysis (PCA), used for data reduction.
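To make this concrete, here is a short sketch (again using NumPy as an illustrative choice, with a tiny made-up dataset) that computes a mean vector and a covariance matrix, exactly the linear algebra objects that multivariate statistics is built on:

```python
import numpy as np

# A tiny dataset: 4 samples of 2 variables (one row per sample)
X = np.array([[2.0, 1.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [6.0, 5.0]])

# Mean of each variable, expressed as a vector
mean = X.mean(axis=0)
print(mean)  # [4. 4.]

# Covariance matrix describing how the two variables vary together
# (rowvar=False because variables are columns, samples are rows)
cov = np.cov(X, rowvar=False)
print(cov)
```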
Building on notation and arithmetic is the idea of matrix factorization, also called matrix decomposition. You need to know how to factorize a matrix and what it means. Matrix factorization is a key tool in linear algebra and is used widely as an element of many more complex operations, both in linear algebra (such as computing the matrix inverse) and in machine learning (such as solving least squares).
Further, there is a range of different matrix factorization methods, each with different strengths and capabilities, some of which you may recognize as "machine learning" methods, such as Singular Value Decomposition (SVD), used for data reduction. In order to read and interpret higher-order matrix operations, you must understand matrix factorization.
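As a sketch of what a factorization looks like in code (the library choice and data are illustrative), the SVD splits a matrix into three factors that multiply back to the original:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Factorize A into U, the singular values s, and V^T
U, s, VT = np.linalg.svd(A)

# Rebuild A from its factors to confirm the decomposition:
# A = U . Sigma . V^T, where Sigma holds s on its diagonal
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
reconstructed = U @ Sigma @ VT
print(np.allclose(A, reconstructed))  # True
```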
You need to know how to use matrix factorization to solve linear least squares. Linear algebra was originally developed to solve systems of linear equations. Of particular interest here are overdetermined systems, where there are more equations than unknown variables. Such systems are challenging to solve arithmetically because there is no single exact solution: no line or plane can fit the data without some error. Problems of this type can be framed as the minimization of squared error, called least squares, and can be recast in the language of linear algebra as linear least squares.
Linear least squares problems can be solved efficiently on computers using matrix operations such as matrix factorization. Least squares is best known for its role in the solution of linear regression models, but it also plays a wider role in a range of machine learning algorithms. In order to understand and interpret these algorithms, you must understand how to use matrix factorization methods to solve least squares problems.
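For instance, here is a minimal sketch of solving an overdetermined system with linear least squares (the data points are invented for illustration; NumPy's lstsq handles the factorization internally):

```python
import numpy as np

# Overdetermined system: 4 equations, 2 unknowns (slope m and intercept c)
# Each row of A is [x_i, 1], so we are fitting y = m*x + c
A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0],
              [4.0, 1.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])

# lstsq minimizes ||A w - y||^2, solving it via matrix factorization internally
w, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(w)  # approximately [1.94 0.15]: slope and intercept
```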
If I could give one more reason, it would be: because it is fun. Seriously. Learning linear algebra, at least the way I teach it with practical examples and executable code, is a lot of fun. Once you can see how the operations work on real data, it is hard to avoid developing a strong intuition for the methods. I am not alone in thinking that linear algebra can be fun if approached in the right way:
Learning linear algebra can also be a lot of fun. Readers will experience knowledge buzz as they learn about the connections between concepts, and it’s not uncommon to experience mind-expanding moments while studying this subject.
— Page ix, No Bullshit Guide To Linear Algebra, 2017.
In this chapter, you discovered why, as a machine learning practitioner, you should deepen your understanding of linear algebra. Specifically, you learned:

- You do not need to study linear algebra before getting started in applied machine learning.
- Five reasons why a deeper knowledge of linear algebra is required: to read and write notation, to perform arithmetic on vectors and matrices, to understand statistics, to understand matrix factorization, and to solve linear least squares.