Linear Algebra for Data Science

Devleena Banerjee
7 min read · Aug 4, 2021
Photo by John Moeses Bauan on Unsplash

After learning about Machine Learning, Supervised and Unsupervised Learning, and all the "relevant" algorithms for Data Science, I realized something I had not before, and which is also missing from the curriculum of some beginner Data Science courses: the need for Linear Algebra at the very beginning of Data Science. In fact, the mathematics behind the algorithms can be understood well only if you have some understanding of Linear Algebra, so it is imperative to have some insight into it.

If you think that Mathematics is not your cup of tea, or it has been a long time since your graduation or post-graduation studies, then this is for you. I will keep it super simple so that you can follow along. You can always refer to related books and websites for a deeper understanding.

Let’s get started.

It all starts with a simple Matrix.

What’s a MATRIX?

A matrix is a 2D (rectangular) array of numbers, like the following:

    A = | a11  a12 |
        | a21  a22 |

It is arranged in the form of rows and columns. It is represented as Aij, where i is the number of rows and j is the number of columns.

A matrix with m rows and n columns is called an m × n matrix, or m-by-n matrix, while m and n are called its dimensions. For example, the matrix A above is a 2 × 2 matrix that can be represented as A22.
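In code, this might look like the following minimal NumPy sketch (the values are arbitrary examples):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])   # a 2 x 2 matrix (m = 2 rows, n = 2 columns)
print(A.shape)           # (2, 2)
```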

Types of Matrices:

  1. Square Matrix: A matrix (array of values) with the same number of rows as columns, for example a 3x3 matrix. (All of the types below are constructed in the NumPy sketch after this list.)

2. Identity Matrix: A Square Matrix that has 1s on its main diagonal and 0s everywhere else. Generally represented by I.

Identity Matrix

3. Diagonal Matrix: A Square Matrix that has 0s everywhere except on the main diagonal.

Diagonal Matrix

4. Scalar Matrix: A Square Matrix that has the same non-zero number everywhere on the main diagonal and 0s everywhere else.

Scalar Matrix

5. Zero or Null Matrix: A matrix with 0s everywhere.

Zero Matrix

6. Row Matrix: A matrix with only one row.

Row Matrix

7. Column Matrix: A matrix with only a single column. Also called a vector.

Column Matrix
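These special matrices are straightforward to build in code. A minimal NumPy sketch (the sizes and values are arbitrary examples):

```python
import numpy as np

identity = np.eye(3)                  # 1s on the main diagonal, 0s elsewhere
diagonal = np.diag([4, 5, 6])         # chosen values on the main diagonal, 0s elsewhere
scalar   = 7 * np.eye(3)              # same non-zero number on the diagonal
zero     = np.zeros((3, 3))           # 0s everywhere
row      = np.array([[1, 2, 3]])      # 1 x 3: a single row
column   = np.array([[1], [2], [3]])  # 3 x 1: a single column (a vector)
```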

Some more operations on a Matrix

Matrix Addition: To add two matrices, both must have the same dimensions: if A is an m×n matrix, B must also be m×n. Add the numbers in the matching positions.

Matrix Subtraction: Subtraction likewise requires both matrices to have the same dimensions. Subtract the numbers in the matching positions.
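A quick sketch of both operations in NumPy (example values assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A + B)  # element-wise sum:        [[ 6  8] [10 12]]
print(A - B)  # element-wise difference: [[-4 -4] [-4 -4]]
```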

Matrix Transpose: To "transpose" a matrix, swap its rows and columns. We put a "T" in the top right-hand corner to mean transpose, so the transpose of A is written Aᵀ. For an m×n matrix, the transpose has dimension n×m.
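For example, using NumPy's `.T` attribute (example values assumed):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 x 3
print(A.T)                  # 3 x 2: rows and columns swapped
print(A.shape, A.T.shape)   # (2, 3) (3, 2)
```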

Matrix Inverse: The inverse of a matrix A is written as A⁻¹. When we multiply a matrix by its inverse we get the Identity Matrix (which is like "1" for matrices):

A × A⁻¹ = I

Only square matrices with a non-zero determinant have an inverse.

We can get the inverse of a matrix with code; for example, with NumPy's `np.linalg.inv` (a sketch with example values):
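```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)   # raises LinAlgError if A is singular
print(A_inv)
print(A @ A_inv)           # approximately the identity matrix I
```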

Matrix Multiplication with Scalar: Multiplying a matrix by a scalar (a constant number) multiplies every element of the matrix by that same number.

Multiplying Matrix with scalar 2

Matrix Division with Scalar: Matrix division by a scalar is similar to multiplication by a scalar: each element is divided by the (non-zero) scalar.
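Both scalar operations in NumPy (example values assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
print(2 * A)   # every element multiplied by 2: [[2 4] [6 8]]
print(A / 2)   # every element divided by 2:    [[0.5 1. ] [1.5 2. ]]
```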

Matrix Multiplication: Two matrices can be multiplied only when the column dimension of the first matrix matches the row dimension of the second, i.e. if A and B are two matrices, then for the product AB, A must be m×n and B must be n×p, and the result AB has dimension m×p.

For example, consider a 2×3 matrix A and a 3×2 matrix B:

    A = | 1  2  3 |        B = |  7   8 |
        | 4  5  6 |            |  9  10 |
                               | 11  12 |

Each entry of the product C = AB is the dot product of a row of A with a column of B. For the first entry:

C11 = 1×7 + 2×9 + 3×11 = 7 + 18 + 33 = 58

Similarly, C12 = 1×8 + 2×10 + 3×12 = 8 + 20 + 36 = 64.

Computing the remaining entries the same way, we finally get:

    C = |  58   64 |
        | 139  154 |
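The same computation in NumPy (matrices as in the example above):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # 2 x 3
B = np.array([[ 7,  8],
              [ 9, 10],
              [11, 12]])     # 3 x 2

C = A @ B                    # 2 x 2 product
print(C)                     # [[ 58  64] [139 154]]
```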

Matrix Multiplication Properties

We will see two of the main properties:

  • Matrix Multiplication is not commutative.

In ordinary arithmetic, 2 × 5 = 5 × 2. For matrices, in general AB ≠ BA (the two products can even have different dimensions, or one of them may not be defined at all).

  • Matrix Multiplication is associative.

(AB)C = A(BC). If we compute D = AB and then Result1 = DC, and separately compute E = BC and then Result2 = AE, then Result1 equals Result2.
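A quick numerical check of both properties (example matrices assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 3]])

print(np.array_equal(A @ B, B @ A))           # False: not commutative
print(np.allclose((A @ B) @ C, A @ (B @ C)))  # True: associative
```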

More on Vectors

Eigenvectors and Eigenvalues: An eigenvector is a vector whose direction remains unchanged when a linear transformation is applied to it. Consider the image below, in which three vectors are shown. The green square is only drawn to illustrate the linear transformation applied to each of these three vectors.

Eigenvectors (red) do not change direction when a linear transformation (e.g. scaling) is applied to them. Other vectors (yellow) do.

The transformation, in this case, is a simple scaling with factor 2 in the horizontal direction and factor 0.5 in the vertical direction, such that the transformation matrix A is defined as:

    A = | 2    0  |
        | 0   0.5 |

A vector v = (x, y) is then scaled by applying this transformation as Av = (2x, 0.5y).

The figure above shows that the direction of some vectors (shown in red) is not affected by this linear transformation. These vectors are called eigenvectors of the transformation, and together with their eigenvalues they characterize the matrix A. This is exactly why such vectors are called "eigenvectors" ("eigen" means "own" or "characteristic" in German).

In general, an eigenvector v of a matrix A is a non-zero vector for which the following holds:

A v = λ v

where λ is a scalar value called the "eigenvalue". This means that the action of the linear transformation A on the vector v is completely defined by the scalar λ.
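We can verify this numerically. A minimal NumPy sketch, using the scaling matrix A from above:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 0.5]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # [2.  0.5]
print(eigenvectors)   # columns are the eigenvectors: [1, 0] and [0, 1]

# Check A v = lambda v for the first eigenpair
v = eigenvectors[:, 0]
print(np.allclose(A @ v, eigenvalues[0] * v))   # True
```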

Where can we see Linear Algebra in Data Science? Linear Algebra applications:

  • One must learn linear algebra to be able to learn statistics, especially multivariate statistics. Statistics and data analysis are another pillar of mathematics supporting machine learning; they are primarily concerned with describing and understanding data. As the mathematics of data, linear algebra has left its fingerprint on many related fields of mathematics, including statistics.
  • Linear algebra helps in creating better machine learning algorithms and visualizations.
  • Eigenvectors and eigenvalues are of great importance in many techniques used in computer vision and machine learning, such as dimensionality reduction by means of PCA (a minimal sketch follows this list) or face recognition by means of Eigenfaces.
  • Eigenvalues and eigenvectors are used to reduce noise in data. They can also improve the efficiency of computationally intensive tasks, and they can help reduce overfitting.
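As a minimal illustration of the PCA point above, here is a sketch of PCA via the eigendecomposition of a covariance matrix (the data values are made up for the example; in practice one would typically use scikit-learn's PCA):

```python
import numpy as np

# Hypothetical 2-feature data set (made-up numbers for illustration)
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

X_centered = X - X.mean(axis=0)          # center each feature
cov = np.cov(X_centered, rowvar=False)   # 2 x 2 covariance matrix

# Eigenvectors of the covariance matrix are the principal components
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: cov is symmetric
order = np.argsort(eigenvalues)[::-1]            # sort by decreasing variance
components = eigenvectors[:, order]

# Project the data onto the first principal component (1-D reduction)
X_reduced = X_centered @ components[:, :1]
print(X_reduced)
```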

There are many more. These are just a few.

I hope you liked the blog. Please share and comment.

REFERENCES:

Machine Learning course from Coursera by Andrew Ng.

https://www.mathsisfun.com/

https://en.wikipedia.org/wiki/Matrix_(mathematics)

https://www.visiondummy.com/2014/03/eigenvalues-eigenvectors/

http://www.analyticsvidhya.com/blog/2017/05/comprehensive-guide-to-linear-algebra

https://www.upgrad.com/blog/linear-algebra-for-machine-learning/
