Dot Product Intuition

math
Author

Madiyar Aitbayev

Published

January 13, 2025

Introduction

The dot product (or scalar product) is a simple yet powerful operation that is used in many places in Machine Learning and other fields. In this post, I will explain the geometric intuition behind the dot product. You need to have a basic grasp of trigonometry and vector algebra to follow this post.

I will need this explanation for my future posts.

Feel free to ask questions on my Telegram channel.

Recap

More beginner-friendly explanations are available in the following resources:

Feel free to check out the above or other resources first.

Assume we have two vectors \(\textbf{a}=[a_1, a_2, \cdots, a_n]\) and \(\textbf{b} = [b_1, b_2, \cdots, b_n]\), then there are two definitions of the dot product: algebraic and geometric.

Algebraic definition

The dot product of \(\textbf{a}\) and \(\textbf{b}\) is:

\[ \textbf{a} \cdot \textbf{b} = \sum_{i=1}^n{a_ib_i} = a_1b_1+a_2b_2+\cdots+a_nb_n \]
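As a minimal sketch, the algebraic definition translates directly into a few lines of Python. The function name `dot` and the sample vectors are illustrative choices, not something from the original post.

```python
def dot(a, b):
    """Algebraic dot product: the sum of component-wise products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))


print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```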

Geometric definition

The dot product of \(\textbf{a}\) and \(\textbf{b}\) is: \[ \textbf{a} \cdot \textbf{b} = \|\mathbf{a}\|\|\mathbf{b}\|\cos \theta \]

where \(\theta\) is the angle between \(\textbf{a}\) and \(\textbf{b}\), and \(\|\textbf{a}\|\) is the magnitude of the vector \(\textbf{a}\).

The geometric definition gives us a few useful properties:

  • The dot product is zero when \(\textbf{a}\) and \(\textbf{b}\) are orthogonal, since \(\cos(90 \degree) = 0\)
  • The dot product is positive for acute angles and negative for obtuse angles, e.g., \(\cos(45\degree)\) and \(\cos(89\degree)\) are positive, but \(\cos(91\degree)\) and \(\cos(180\degree)\) are negative.
  • We can find the angle between the vectors with \(\theta = \arccos\left(\frac{\textbf{a} \cdot \textbf{b}}{\|\mathbf{a}\|\|\mathbf{b}\|}\right)\).

A picture from Wikipedia:

Source Wikipedia
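The properties above can be checked with a short Python sketch. The helper names (`dot`, `norm`, `angle_between`) and the sample vectors are assumptions made for illustration; the clamp before `math.acos` is a defensive choice against floating-point drift, not something the formula itself requires.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Magnitude (Euclidean length) of a vector."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle in radians between two nonzero vectors."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] so tiny rounding errors cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


print(math.degrees(angle_between([1, 0], [0, 2])))  # orthogonal vectors, about 90
print(dot([1, 0], [1, 1]) > 0)   # acute angle (45 degrees), positive dot product
print(dot([1, 0], [-1, 1]) < 0)  # obtuse angle (135 degrees), negative dot product
```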

Equivalence?

Wait, how are the two definitions of the dot product the same? How is \(\cos \theta\) related to such a straightforward sum of components? These are a few of the questions I asked when I first encountered the dot product. I accepted the fact and moved on with my life, until today. I will try to understand it myself and explain it with interactive visualizations.

Explanation

Let’s first simplify the problem as much as possible. Once we build up an intuition for simpler cases, we can come back and try to understand the general cases.

Simplified version

First, we will only consider the \(n=2\) case. Assume that we have 2D vectors: \(\textbf{a}=[a_x, a_y]\) and \(\textbf{b}=[b_x, b_y]\) with the dot product:

\[ \textbf{a} \cdot \textbf{b} = a_xb_x+a_yb_y=\|\textbf{a}\|\|\textbf{b}\|\cos \theta \]

where \(\|\textbf{a}\|=\sqrt{a_x^2+a_y^2}\) is the length of the vector and \(\theta\) is the angle between the vectors.

Second, scaling either \(\textbf{a}\) or \(\textbf{b}\) by a real number \(k\) does not invalidate the dot product equivalence. Let’s say we scale \(\textbf{b}\) by \(k\), then:

\[ \textbf{a} \cdot (k \textbf{b}) = a_x(kb_x)+a_y(kb_y)=\|\textbf{a}\|k\|\textbf{b}\|\cos \theta \]

With the above, we can simplify further by constraining \(\|\textbf{b}\| = 1\). If the equivalence holds for a unit vector \(\textbf{b}\), then it holds for any scaled vector \(k\textbf{b}\), and therefore for a vector of any length.
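A quick numeric sketch of the scaling argument, using made-up sample vectors: choosing \(k = 1/\|\textbf{b}\|\) turns \(\textbf{b}\) into a unit vector, and the dot product scales by exactly the same \(k\).

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


a = [3.0, 4.0]
b = [2.0, 1.0]

k = 1 / math.hypot(*b)       # scale factor that makes k*b a unit vector
b_unit = [k * x for x in b]

# Scaling b by k scales the algebraic dot product by the same k.
print(math.isclose(dot(a, b_unit), k * dot(a, b)))
# The scaled vector has length 1 (up to rounding).
print(math.isclose(math.hypot(*b_unit), 1.0))
```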

A visualization for the setup we have so far:

We have the following items shown above:

  • The vector \(\textbf{a}\): it is interactive and visualized with blue; feel free to move the point \(\textbf{a}\) around.
  • The vector \(\textbf{b}\): it is visualized with red and \(\|\textbf{b}\|=1\).
  • \(\theta\): the angle from \(\textbf{b}\) to \(\textbf{a}\)

The geometric definition implies that the dot product is invariant under rotations. So, if we rotate both vectors \(\textbf{a}\) and \(\textbf{b}\) by the same angle \(\alpha\), the dot product won’t change. This is not obvious from the algebraic definition, but a detailed proof is provided in the next section. Let’s trust this property for now and simplify our problem for the third time.

We can rotate both vectors such that \(\textbf{b}\) is aligned with the x-axis. Click the button below for the illustration:

We arrived at a simplified version of the problem: a vector \(\textbf{a}=[a_x, a_y]\) and a fixed vector \(\textbf{b}=[1, 0]\). Let’s check if the definitions agree now. The algebraic dot product is:

\[ \textbf{a} \cdot \textbf{b} = a_xb_x+a_yb_y=a_x \]

and the geometric is:

\[ \textbf{a} \cdot \textbf{b} = \|\textbf{a}\|\|\textbf{b}\|\cos \theta = \|\textbf{a}\|\cos \theta \]

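Now we have \(\textbf{a} \cdot \textbf{b} = a_x = \|\textbf{a}\|\cos \theta\). This holds by the definition of polar coordinates: a point at distance \(\|\textbf{a}\|\) from the origin and at angle \(\theta\) from the x-axis has x-coordinate \(\|\textbf{a}\|\cos \theta\). Alternatively, we can easily derive it from the definition of cosine. The proof is left as an exercise to the reader. You can use the visualization below to convince yourself: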

We introduced a new vector \(\textbf{p}\) in the above visualization:

  • It is visualized in green and visually represents the dot product
  • It has coordinates \(\textbf{p}=[\|\textbf{a}\|\cos \theta, 0]=[a_x, 0]\):
    • Where \(p_x\) is the dot product as we derived above
  • The vector \(\textbf{p}\) is collinear with the vector \(\textbf{b}\)
  • Since the vectors \(\textbf{p}\) and \(\textbf{b}\) lie along the same line:
    • We can express \(\textbf{p}\) using vector arithmetic: \(\textbf{p}=(\|\textbf{a}\|\cos \theta)\textbf{b}\)
    • \(\|\textbf{a}\|\cos \theta\) is a real number that tells us how many copies of \(\textbf{b}\) are required to reach the tip of the vector \(\textbf{p}\)
    • You can visualize a 1D number line spanned by the vector \(\textbf{b}\), then \(\|\textbf{a}\|\cos \theta\) is the location of \(\textbf{p}\) in that 1D coordinate system.
  • \(p_x\) can be zero, positive, or negative. When it is negative, the vector \(\textbf{p}\) points in the direction opposite to the vector \(\textbf{b}\)
  • The vector \(\textbf{p}\) is actually the projection of the vector \(\textbf{a}\) onto \(\textbf{b}\).
  • Play with the point \(\textbf{a}\) to gain more intuition!
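The projection described in the list above can be sketched in Python as follows. The function name `project_onto_unit` is hypothetical, and the sketch assumes \(\textbf{b}\) is already a unit vector, as in the simplified setup.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_onto_unit(a, b):
    """Projection of a onto b, assuming b is a unit vector."""
    scale = dot(a, b)  # signed position of p on the number line spanned by b
    return [scale * x for x in b]


b = [1.0, 0.0]

p = project_onto_unit([3.0, 4.0], b)
print(p)  # p lies on the x-axis at a_x = 3

q = project_onto_unit([-2.0, 5.0], b)
print(q[0] < 0)  # negative dot product, so the projection points opposite to b
```

Any two vectors with the same \(a_x\) land on the same \(\textbf{p}\) here, which is one way to think about when two different vectors \(\textbf{a}\) produce the same dot product.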

Try different values of \(\textbf{a}\) by moving it in the visualization, and answer the following questions:

  • When is the dot product zero?
  • When is the dot product negative?
  • When is the dot product positive?
  • What values of \(\textbf{a}\) result in the same dot product?

The simplified form of the dot product is quite useful for intuition. I visualize this version in my mind when I forget some details of the dot product. At this point, you should be comfortable with the dot product of the simplified form: \(\textbf{a} \cdot [1, 0]\).
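For readers who want the exercise above spelled out, here is one way to see it, assuming \(\theta\) is measured from the positive x-axis (where \(\textbf{b}=[1, 0]\) sits) and using the unit-circle definition of cosine, which also covers obtuse angles:

\[ \cos \theta = \frac{a_x}{\|\textbf{a}\|} \quad \Longrightarrow \quad \|\textbf{a}\|\cos \theta = a_x = a_x \cdot 1 + a_y \cdot 0 = \textbf{a} \cdot \textbf{b} \]

As a concrete check with an illustrative vector \(\textbf{a}=[3, 4]\): \(\|\textbf{a}\|=5\) and \(\cos \theta = 3/5\), so \(\|\textbf{a}\|\cos \theta = 3 = a_x\).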

Rotational invariance

First, let’s visualize the rotation:

There is a slider which represents the angle \(\alpha\) (in degrees) by which both vectors are rotated. Feel free to play with the slider! You can also interact with the point \(\textbf{a}\).

It is easy to see why the dot product is invariant under rotations from the geometric definition. The geometric definition relies only on the lengths of the vectors and the angle between them, and neither changes when both vectors are rotated by the same angle.

However, it is not immediately clear from the algebraic definition. To prove it, we derive an alternative formula for the algebraic dot product:

\[ \begin{align} \|\textbf{a}-\textbf{b}\|^2 &= (a_x-b_x)^2 + (a_y-b_y)^2 \\ &= a_x^2 + b_x^2 - 2a_xb_x + a_y^2 + b_y^2 - 2a_yb_y \\ &= (a_x^2 + a_y^2) + (b_x^2 + b_y^2) - 2(a_xb_x + a_yb_y) \\ &= \|\textbf{a}\|^2 + \|\textbf{b}\|^2 - 2 \textbf{a} \cdot \textbf{b} \end{align} \]

which gives us:

\[ \begin{align} \textbf{a} \cdot \textbf{b} &= a_xb_x + a_yb_y \\ &= \frac{\|\textbf{a}\|^2 + \|\textbf{b}\|^2 - \|\textbf{a}-\textbf{b}\|^2 }{2} \end{align} \]

The dot product above is derived purely from the lengths. Since the lengths remain unchanged under rotations, the algebraic dot product remains unchanged as well, which completes our proof.

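Note that, once we plug in the geometric definition \(\textbf{a} \cdot \textbf{b} = \|\textbf{a}\|\|\textbf{b}\|\cos \theta\), the first equation above is exactly the law of cosines, which is quite cool!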
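The rotation argument can also be checked numerically. The sketch below is illustrative only: `rotate` applies the standard 2D rotation matrix, `dot_from_lengths` is a hypothetical name for the lengths-only formula derived above, and the vectors and angle are arbitrary.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_len(v):
    """Squared length of a vector (Pythagorean theorem)."""
    return sum(x * x for x in v)


def rotate(v, alpha):
    """Rotate a 2D vector counterclockwise by alpha radians."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]


def dot_from_lengths(a, b):
    """Dot product computed only from the lengths of a, b, and a - b."""
    diff = [x - y for x, y in zip(a, b)]
    return (sq_len(a) + sq_len(b) - sq_len(diff)) / 2


a, b = [3.0, 4.0], [2.0, -1.0]
alpha = math.radians(37)
ra, rb = rotate(a, alpha), rotate(b, alpha)

print(math.isclose(dot(a, b), dot_from_lengths(a, b)))  # True
print(math.isclose(dot(a, b), dot(ra, rb)))             # True
```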

General version

It’s not hard to extend the 2D version of the proof to n-dimensional vectors, since we didn’t rely on any properties unique to 2D.

I will provide more alternative proofs below, mostly for myself. They are optional; feel free to read them if you are curious.

So far we only scaled the vector \(\textbf{b}\), but we can also scale the vector \(\textbf{a}\) so that \(\|\textbf{a}\|=1\). We also switch to polar coordinates: \(\textbf{a}=[\cos \alpha, \sin \alpha]\) and \(\textbf{b}=[\cos \beta, \sin \beta]\).

Then, the geometric dot product is:

\[ \textbf{a} \cdot \textbf{b} = \cos(\alpha - \beta) \]

The algebraic is:

\[ \textbf{a} \cdot \textbf{b} = \cos \alpha \cos \beta + \sin \alpha \sin \beta \]

We arrive at the cosine subtraction rule: \[ \cos(\alpha-\beta)=\cos \alpha \cos \beta + \sin \alpha \sin \beta \]
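A small numeric sanity check of this identity, with two arbitrarily chosen angles (the specific values are only for illustration):

```python
import math

alpha, beta = math.radians(70), math.radians(25)

# Unit vectors in polar form.
a = [math.cos(alpha), math.sin(alpha)]
b = [math.cos(beta), math.sin(beta)]

algebraic = a[0] * b[0] + a[1] * b[1]  # cos(alpha)cos(beta) + sin(alpha)sin(beta)
geometric = math.cos(alpha - beta)     # |a||b|cos(theta) with |a| = |b| = 1

print(math.isclose(algebraic, geometric))
```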

We will extend the formula from the rotational invariance section further. We derived the following dot product formula:

\[ \begin{align} \textbf{a} \cdot \textbf{b} &= \sum_{i=1}^n a_ib_i \\ &= \frac{\|\textbf{a}\|^2 + \|\textbf{b}\|^2 - \|\textbf{a}-\textbf{b}\|^2 }{2} \end{align} \]

From the law of cosines, we also have:

\[ \|\textbf{a}-\textbf{b}\|^2=\|\textbf{a}\|^2+\|\textbf{b}\|^2-2\|\textbf{a}\|\|\textbf{b}\|\cos \theta \]

Rearranging and combining the two equations above gives: \[ \begin{align} \|\textbf{a}\|\|\textbf{b}\|\cos \theta =& \textbf{a} \cdot \textbf{b}= \sum_{i=1}^n a_ib_i \\ =& \frac{\|\textbf{a}\|^2 + \|\textbf{b}\|^2 - \|\textbf{a}-\textbf{b}\|^2 }{2} \end{align} \]
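The same chain of equalities can be sketched for vectors with more than two components. In the hypothetical helper below, the angle is recovered only from the three side lengths of the triangle formed by \(\textbf{a}\), \(\textbf{b}\), and \(\textbf{a}-\textbf{b}\) via the law of cosines, and the result is then compared against the component-wise sum. The 5-dimensional sample vectors are arbitrary.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector with any number of components."""
    return math.sqrt(sum(x * x for x in a))


def angle_from_lengths(a, b):
    """Angle between a and b, recovered from side lengths via the law of cosines."""
    c = norm([x - y for x, y in zip(a, b)])
    cos_theta = (norm(a) ** 2 + norm(b) ** 2 - c ** 2) / (2 * norm(a) * norm(b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


a = [1.0, -2.0, 0.5, 3.0, 4.0]
b = [2.0, 0.0, -1.0, 1.5, -0.5]

theta = angle_from_lengths(a, b)
geometric = norm(a) * norm(b) * math.cos(theta)
algebraic = dot(a, b)

print(math.isclose(geometric, algebraic))
```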

We can use the Pythagorean theorem:

\[ \begin{align} \|\textbf{a}\|^2&=\|\textbf{p}\|^2 + \|\textbf{a}-\textbf{p}\|^2 \\ &= \|\textbf{p}\|^2 + (a_x-p_x)^2 + (a_y-p_y)^2 \\ &= \|\textbf{p}\|^2 + (a_x^2+a_y^2) + (p_x^2+p_y^2) - 2(a_xp_x + a_yp_y) \\ &= \|\textbf{p}\|^2 + \|\textbf{a}\|^2 + \|\textbf{p}\|^2 - 2(a_xp_x + a_yp_y) \\ &= \|\textbf{p}\|^2 + \|\textbf{a}\|^2 + \|\textbf{p}\|^2 - 2\textbf{a}\cdot\textbf{p} \\ \end{align} \]

This gives us the following:

\[ a_xp_x+a_yp_y=\textbf{a}\cdot\textbf{p}=\|\textbf{p}\|^2 \]

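If we substitute \((\|\textbf{a}\|\cos\theta) \textbf{b}\) for \(\textbf{p}\) and use \(\|\textbf{b}\|=1\):

\[ \|\textbf{a}\|\cos\theta(a_xb_x+a_yb_y)=\|\textbf{a}\|\cos\theta(\textbf{a}\cdot \textbf{b})=(\|\textbf{a}\|\cos\theta)^2 \]

Dividing both sides by \(\|\textbf{a}\|\cos\theta\) (when it is nonzero) gives \(a_xb_x+a_yb_y=\textbf{a}\cdot \textbf{b}=\|\textbf{a}\|\cos\theta\), which recovers both the algebraic and the geometric dot product.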

You can also find more proofs on ProofWiki.

The End

I hope you enjoyed this post. You can ask further questions on my Telegram channel.