Linear Equation Intuition
Introduction
A linear equation is an equation in the form \(a_1x_1+\cdots+a_nx_n+b=0\). In two dimensions, it’s an equation of a line \(ax+by+c=0\), while in three dimensions, it’s an equation of a plane \(ax+by+cz+d=0\).
A linear equation is the foundation of linear models in machine learning, such as linear regression and logistic regression.
In this post, let’s build intuition for the linear equation by answering questions such as:
- Why is \(ax+by+c\) the distance between the point \((x, y)\) and the line?
- What are the meanings of \((a, b)\) and \(c\)?
- How are the dot product and linear equation related?
My previous post about the dot product is a prerequisite for this post. You need to have a good intuition about the dot product.
Cheatsheet
A short summary for people in a hurry.
The distance from the point (x, y) to the line is: \[ \left|\frac{ax+by+c}{\sqrt{a^2+b^2}}\right| \]
In n dimensions:
\[ \left|\frac{a_1x_1+\cdots+a_nx_n+b}{\sqrt{a_1^2+\cdots+a_n^2}}\right| \]
The vector \(\textbf{n}=[a, b]\) is orthogonal to the line \(ax+by+c=0\); in n dimensions, the orthogonal vector is \(\textbf{n}=[a_1, a_2, \cdots, a_n]\).
The line equation can be expressed with the dot product as \(\textbf{n} \cdot {[x, y]}+c=0\) and \(\textbf{n} \cdot \textbf{x} + b = 0\) in n-dimensions.
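The cheatsheet formulas can be sketched in a few lines of Python. This is only an illustration: the function name `distance_to_hyperplane` and its parameters are made up for this example, and the coefficients are assumed not to be all zero.

```python
import math

def distance_to_hyperplane(coeffs, offset, point):
    """Distance from `point` to the hyperplane coeffs . x + offset = 0.

    `coeffs` plays the role of the normal vector n = [a_1, ..., a_n],
    and `offset` plays the role of b (or c in the 2D line equation).
    """
    dot = sum(a * x for a, x in zip(coeffs, point))   # n . x
    norm = math.sqrt(sum(a * a for a in coeffs))      # ||n||
    return abs(dot + offset) / norm

# 2D: the line 3x + 4y - 10 = 0 and the origin
print(distance_to_hyperplane([3, 4], -10, [0, 0]))        # 2.0

# 3D: the plane x + 2y + 2z - 9 = 0 and the origin
print(distance_to_hyperplane([1, 2, 2], -9, [0, 0, 0]))   # 3.0
```

The same function covers the line, the plane, and the n-dimensional case, because the formula only depends on the dot product and the length of the normal vector.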
Equation of a line
ax+by
Let’s start with a simpler equation of a line in the form of \(ax+by=0\), where we have \(c=0\). Does it remind you of something? Yes, the dot product between \((a, b)\) and \((x, y)\). The dot product \((a, b) \cdot (x, y)\) is zero exactly when the two vectors are orthogonal. Every point \((x, y)\) on the line, viewed as a vector from the origin, is therefore orthogonal to \((a, b)\), which means that \((a, b)\) is an orthogonal vector to our line. An orthogonal vector is also known as a normal vector of a line. Also, notice that this simpler line contains the origin since \(a\cdot0+b\cdot0=0\) holds true.
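A quick numeric check of this orthogonality, as a minimal sketch: the coefficients \(a=2\), \(b=1\) and the helper `dot` are arbitrary choices for illustration. Points on the line \(ax+by=0\) are generated as multiples of the direction \((-b, a)\).

```python
def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(ui * vi for ui, vi in zip(u, v))

a, b = 2.0, 1.0          # example coefficients (assumption for this sketch)
normal = (a, b)          # candidate normal vector of the line ax + by = 0
direction = (-b, a)      # a vector along the line, since a*(-b) + b*a = 0

# Every multiple of `direction` is a point on the line.
points_on_line = [(t * direction[0], t * direction[1]) for t in (-2.0, 0.0, 0.5, 3.0)]

# The dot product with the normal is zero for each of them.
dots = [dot(normal, p) for p in points_on_line]
print(dots)
```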
Let’s visualize our setup:
We have the following items visualized:
- Our line \(ax+by=0\) is visualized with blue
- It passes through the origin
- A normal vector of the line \(\textbf{n}=[a, b]\) is visualized with red
- Note that multiple \(\textbf{n}\) vectors can represent the same line
Feel free to move the point \(\textbf{n}\) and notice how the line equation changes with respect to the normal vector.
We can multiply or divide the line equation by any nonzero real number \(k\) without changing the line it describes; in other words, \(kax+kby+kc=0\) and \(ax+by+c=0\) hold for the same set of points. Let’s use this property and divide the equation by \(\sqrt{a^2+b^2}\), so that the new coefficients satisfy \(a^2+b^2=1\) and \(\|\textbf{n}\|=1\).
Explanation
We obtain a new normalized line equation with the coefficients: \[ \begin{align} a_{new}&=\frac{a}{\sqrt{a^2+b^2}}\\ b_{new}&=\frac{b}{\sqrt{a^2+b^2}}\\ c_{new}&=\frac{c}{\sqrt{a^2+b^2}} \end{align} \]
Note that \(a_{new}^2+b_{new}^2=1\) holds true.
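The normalization step can be sketched as follows. The function name `normalize_line` is hypothetical, and the example line \(3x+4y+10=0\) is chosen only because its normal vector has length 5.

```python
import math

def normalize_line(a, b, c):
    """Scale (a, b, c) so that the normal vector (a, b) has length 1."""
    norm = math.hypot(a, b)            # sqrt(a^2 + b^2)
    if norm == 0:
        raise ValueError("a and b cannot both be zero")
    return a / norm, b / norm, c / norm

a_new, b_new, c_new = normalize_line(3.0, 4.0, 10.0)
print(a_new, b_new, c_new)             # 0.6 0.8 2.0

# The normalized normal vector has unit length (up to rounding).
print(math.isclose(a_new**2 + b_new**2, 1.0))   # True
```

Both \(3x+4y+10=0\) and \(0.6x+0.8y+2=0\) describe the same set of points; only the scale of the coefficients differs.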
From now on, we assume that \(\|\textbf{n}\|=1\). Let’s check what we have:
We introduced a slider that defines the angle of the normal vector \(\textbf{n}\). We also restricted \(\textbf{n}\) so that \(\|\textbf{n}\|=1\). Now, our line is uniquely defined by \(\textbf{n}\), which was not the case before.
Let’s introduce a point \(\textbf{p}=[x, y]\) on the line:
Try moving the new green point! The point \(\textbf{p}=[x, y]\) always lies on the line, in other words:
- \(ax+by=0\), where \(\textbf{n}=[a, b]\)
- Equivalently \(\textbf{n} \cdot \textbf{p} = 0\)
- The angle defined by the points \((\textbf{n}, \text{origin}, \textbf{p})\) is a right angle.
We have learned so far that \(a\) and \(b\) represent the normal vector \(\textbf{n}=[a, b]\), and we can express the line equation with the dot product \(\textbf{n} \cdot \textbf{p} = 0\).
What is the meaning of the line equation when the point \(\textbf{p}\) is outside of the line, i.e., when \(ax+by \neq 0\)? Spoiler: the value is the signed distance from the point \(\textbf{p}\) to our line. Let’s understand this intuitively.
The dot product \(\textbf{n} \cdot \textbf{p}\) is a non-zero value when the point \(\textbf{p}\) is not on the line. From the previous post we know that the dot product is the projection of the vector \(\textbf{p}\) onto the vector \(\textbf{n}\) (a signed length, since \(\|\textbf{n}\|=1\)). Imagine a 1D number line spanned by the vector \(\textbf{n}\) like this:
The introduced 1D number line has ticks that represent the signed distances from our line. The projection of the point \(\textbf{p}\) falls somewhere on that 1D number line (following the dotted line segment), which corresponds to the dot product. Remember that \(ax+by\) and \(\textbf{n} \cdot \textbf{p}\) are the same. Play with the point \(\textbf{p}\) (and with the slider) to understand when \(ax+by\) is zero, positive and negative. Can you see that \(ax+by\) (and \(\textbf{n} \cdot \textbf{p}\)) represents the signed distance from the point \(\textbf{p}\) to our line?
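To see the sign behavior without the interactive figure, here is a minimal sketch. The helpers `unit_normal` and `signed_distance_through_origin` are hypothetical names, and the angle of 90 degrees is picked so that the line \(ax+by=0\) is simply the x-axis.

```python
import math

def unit_normal(angle_degrees):
    """Unit normal vector at the given angle, like the slider in the text."""
    theta = math.radians(angle_degrees)
    return (math.cos(theta), math.sin(theta))

def signed_distance_through_origin(normal, point):
    """n . p for a unit normal n: signed distance from p to the line n . p = 0."""
    return normal[0] * point[0] + normal[1] * point[1]

n = unit_normal(90.0)   # normal points up, so the line is the x-axis

for p in [(3.0, 2.0), (3.0, 0.0), (3.0, -2.0)]:
    d = signed_distance_through_origin(n, p)
    if math.isclose(d, 0.0, abs_tol=1e-9):
        side = "on the line"
    elif d > 0:
        side = "on the side the normal points to"
    else:
        side = "on the opposite side"
    print(p, round(d, 6), side)
```

The point above the x-axis gets a positive value, the point on it gets (numerically) zero, and the point below it gets a negative value.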
ax+by+c
By this point, you should understand the line equation of the form \(ax+by=0\). Now, let’s explore the line equation of the form \(ax+by+c=0\)!
We represented \(ax+by\) with the dot product \(\textbf{n} \cdot \textbf{p}\). Similarly, \(ax+by+c\) can be represented with \(\textbf{n} \cdot \textbf{p}+c\). First, we land on the 1D number line with \(\textbf{n} \cdot \textbf{p}\), then move \(c\) steps forward on that 1D number line. This is easier to explain with a visualization:
Play with the slider for \(c\), and notice the location of our line when \(c=1\) and \(c=-1\).
Can you see that \(ax+by+c\) is still the signed distance from the point \(\textbf{p}\) to our line? What does \(c\) represent?
Explanation
First, we land on the 1D number line with \(c'=\textbf{n} \cdot \textbf{p}\). Then, let’s denote \(ax+by+c\) as \(c'+c\). The value \(c'+c\) is zero when \(c'=-c\); \(1\) when \(c'=-c+1\); \(-1\) when \(c'=-c-1\); and so on.
\(|c|\) is the distance from the origin to our line, and \(c\) is the corresponding signed distance: it is the value of \(ax+by+c\) at the origin, so the line itself sits at \(-c\) on the 1D number line. The signed distance is positive in the direction of the normal vector, and negative in the opposite direction.
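A small worked example may help here. It is a sketch with assumed values: a unit normal \(\textbf{n}=[0, 1]\) and \(c=-2\), which gives the horizontal line \(y=2\). The function name `signed_distance` is hypothetical.

```python
def signed_distance(a, b, c, x, y):
    """Value of ax + by + c; a signed distance only if a*a + b*b == 1."""
    return a * x + b * y + c

a, b = 0.0, 1.0   # unit normal pointing up (assumption for this sketch)
c = -2.0          # the line is y - 2 = 0, i.e. y = 2

# At the origin the expression evaluates to c itself.
origin_value = signed_distance(a, b, c, 0.0, 0.0)
print(origin_value)                          # -2.0

# The point -c * n is the point of the line closest to the origin.
foot = (-c * a, -c * b)                      # (0.0, 2.0)
print(signed_distance(a, b, c, *foot))       # 0.0

# One step further along the normal, the signed distance is +1.
print(signed_distance(a, b, c, 0.0, 3.0))    # 1.0
```

The origin lies 2 units from the line on the side opposite to the normal, which is why the value at the origin is \(-2\) and the line sits at \(-c=2\) on the 1D number line.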
The End
I hope you enjoyed this post.