Autonomous Robots - Linear Algebra
Motivation ¶
Let's consider the problem of fitting a polynomial curve to 2D data: $$ \begin{bmatrix} 1, x_0, x_0^2, x_0^3, \cdots, x_0^n \\ 1, x_1, x_1^2, x_1^3, \cdots, x_1^n \\ \vdots \\ 1, x_m, x_m^2, x_m^3, \cdots, x_m^n \\ \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1\\ \vdots \\ \beta_n\end{bmatrix} = \begin{bmatrix} y_0 \\ y_1\\ \vdots \\ y_m\end{bmatrix} $$
$(x_i, y_i)$ are observed points and $\beta_i$ are coefficients for the polynomial function. How do we solve this equation to find the best $\beta$?
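As a preview of the answer developed in the rest of this post, here is a minimal sketch (with made-up sample data and an assumed quadratic model) that builds the matrix above and solves for $\beta$ in the least-squares sense:
# Preview: fit a quadratic polynomial with least squares (illustrative data)
import numpy as np
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 20)
y = 0.5 + 2.0 * x - 1.5 * x**2 + rng.normal(0, 0.05, x.shape)  # assumed ground-truth coefficients plus noise
A = np.vander(x, 3, increasing=True)          # columns are [1, x, x^2]
beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
print(beta)                                   # close to [0.5, 2.0, -1.5]
#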
Vectors ¶
Vector Representations ¶
A two-dimensional vector $v \in \mathbb{R}^2$ can be written as a column vector:
$$ v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} $$
It represents:
- direction and magnitude, e.g. the velocity of a point on a robot
- position, e.g. the position of a point on a robot
# Vector representation
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
p0 = np.array([1, 1])
v = np.array([2, 0]) # m/s
p1 = p0 + v
plt.plot(p0[0], p0[1], 'r*', markersize=10, label='Position t=0')
plt.plot(p1[0], p1[1], 'g*', markersize=10, label='Position t=1')
plt.arrow(p0[0], p0[1], v[0], v[1], width=0.01, head_width = 0.08,
shape = 'full', color='b', label='velocity vector')
plt.xlim([0, 4])
plt.ylim([0, 2])
plt.legend()
plt.plot()
#
Vector Operations ¶
- Scaling
$\lambda \textbf{v}$
- Linear combinations
$\alpha\textbf{v} + \beta\textbf{w}$
- Dot product
$ \begin{align} \textbf{v}.dot(\textbf{w}) &= \textbf{v}^T \textbf{w} \\ &= v_1w_1+\cdots+v_nw_n \end{align} $ where $\textbf{v}, \textbf{w} \in \mathbb{R}^n$
$ \textbf{v}.dot(\textbf{w}) = \|\textbf{v}\|\|\textbf{w}\|\cos(\theta) $, where $\theta$ is the angle between the two vectors.
- Cross product of vectors in $\mathbb{R}^3$
$\textbf{v}.cross(\textbf{w}) = \begin{bmatrix} 0, -v_3, v_2\\ v_3, 0, -v_1 \\ -v_2, v_1, 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ \end{bmatrix}$
$\|\textbf{v}.cross(\textbf{w})\| = \|\textbf{v}\|\|\textbf{w}\|\sin(\theta)$, where $\theta$ is the angle between the two vectors.
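These operations map directly onto NumPy. The short sketch below (with arbitrary example vectors) evaluates each operation and checks the two angle identities above:
# Vector operations in NumPy (arbitrary example vectors)
import numpy as np
v = np.array([1.0, 2.0, 2.0])
w = np.array([2.0, 0.0, 1.0])
print(3 * v)              # scaling
print(2 * v + 0.5 * w)    # linear combination
dot = v.dot(w)            # dot product
cross = np.cross(v, w)    # cross product
# cos and sin of the angle between v and w, recovered from the two products
cos_theta = dot / (np.linalg.norm(v) * np.linalg.norm(w))
sin_theta = np.linalg.norm(cross) / (np.linalg.norm(v) * np.linalg.norm(w))
assert(np.isclose(cos_theta**2 + sin_theta**2, 1.0))
#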
Examples of Vector Operations in Robotics ¶
- Scaling
Starting at $O$, the robot moves with a constant velocity $\textbf{v}$. After $\Delta t$, the robot will be at $D = O+\textbf{v}\Delta t$
- Linear combinations
Starting at $O \in \mathbb{R}^2$, the robot moves with a constant velocity $\textbf{v}$ along the $x$ direction and $\textbf{w}$ along the $y$ direction. After $\Delta t$, the robot will be at $D = O+\textbf{v}\Delta t+\textbf{w}\Delta t$
# Linear combination of velocity vectors
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
p0 = np.array([1, 1])
delta_t = 1
v = np.array([2, 0]) # m/s
w = np.array([0, 1])
v_plus_w = v*delta_t+ w*delta_t
p1 = p0 + v_plus_w
plt.plot(p0[0], p0[1], 'r*', markersize=10, label='Position O at t=0')
plt.plot(p1[0], p1[1], 'g*', markersize=10, label='Position D at t=1')
plt.arrow(p0[0], p0[1], v[0], v[1], width=0.01, head_width = 0.08,
shape = 'full', color='g', label='velocity vector in x')
plt.arrow(p0[0], p0[1], w[0], w[1], width=0.01, head_width = 0.08,
shape = 'full', color='y', label='velocity vector in y ')
plt.arrow(p0[0], p0[1], v_plus_w[0], v_plus_w[1], width=0.01, head_width = 0.08,
shape = 'full', color='b', label='total velocity')
plt.xlim([0, 4])
plt.ylim([0, 4])
plt.legend()
plt.plot()
#
- Dot product
- Determine whether two lines seen by the robot are parallel.
- Check whether a point is inside a rectangle.
# 1. Determine whether two lines seen by the robot are parallel.
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
# Line representation is l_O + l_lambda * l_dir
l1_O = np.array([1, 1])
l1_dir = np.array([1, 2])
l1_dir = l1_dir / np.linalg.norm(l1_dir)
l1_lambda = 1.2
l2_O = np.array([3, 1])
l2_dir = np.array([1, 2])
l2_dir = l2_dir / np.linalg.norm(l2_dir)
l2_lambda = 0.8
def is_parallel(l1_dir, l2_dir):
    # The normalized dot product is +/-1 when the two directions are (anti-)parallel
    ret = np.dot(l1_dir, l2_dir) / (np.linalg.norm(l1_dir) * np.linalg.norm(l2_dir))
    return np.isclose(np.abs(ret), 1, rtol=0, atol=1e-05)
if is_parallel(l1_dir, l2_dir):
print('They are parallel :)')
else:
print('They are not parallel :(')
plt.plot(l1_O[0], l1_O[1], 'r*', markersize=10)
plt.plot(l2_O[0], l2_O[1], 'r*', markersize=10)
plt.arrow(l1_O[0], l1_O[1], l1_lambda*l1_dir[0], l1_lambda*l1_dir[1], width=0.01, head_width = 0.08,
shape = 'full', color='g', label='line 1')
plt.arrow(l2_O[0], l2_O[1], l2_lambda*l2_dir[0], l2_lambda*l2_dir[1], width=0.01, head_width = 0.08,
shape = 'full', color='b', label='line 2 ')
plt.xlim([0, 4])
plt.ylim([0, 4])
plt.legend()
plt.plot()
#
# 2. Check whether a point is inside a rectangle.
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
# Line representation is l_O + l_lambda * l_dir
lb_O = np.array([1, 1])
lb_to_t = np.array([0, 1])
lb_to_t = lb_to_t / np.linalg.norm(lb_to_t)
lb_to_t_lambda = 2
lb_to_r = np.array([1, 0])
lb_to_r = lb_to_r / np.linalg.norm(lb_to_r)
lb_to_r_lambda = 2
lt_O = lb_O + lb_to_t_lambda * lb_to_t
rb_O = lb_O + lb_to_r_lambda * lb_to_r
rt_O = lb_O + lb_to_t_lambda * lb_to_t + lb_to_r_lambda * lb_to_r
pt1 = np.array([2, 2])
pt2 = np.array([4, 2])
def is_inside(pt, lb_O, lt_O, rb_O, rt_O):
if np.dot(pt - lb_O, lt_O - lb_O) < 0:
return False
if np.dot(pt - lb_O, rb_O - lb_O) < 0:
return False
if np.dot(pt - rt_O, lt_O - rt_O) < 0:
return False
if np.dot(pt - rt_O, rb_O - rt_O) < 0:
return False
return True
if is_inside(pt1, lb_O, lt_O, rb_O, rt_O):
print('pt1 is inside!')
else:
print('pt1 is outside!')
if is_inside(pt2, lb_O, lt_O, rb_O, rt_O):
print('pt2 is inside!')
else:
print('pt2 is outside!')
plt.plot(pt1[0], pt1[1], 'r*', label='pt1')
plt.plot(pt2[0], pt2[1], 'g*', label='pt2')
plt.plot([lb_O[0], lt_O[0], rt_O[0], rb_O[0], lb_O[0]],
[lb_O[1], lt_O[1], rt_O[1], rb_O[1], lb_O[1]], color='b', label='Rectangle')
plt.legend()
plt.plot()
#
- Cross product
- Linear velocity and angular velocity (see the sketch below).
- Find the shortest distance from a point to a line.
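For the first use case, the linear velocity of a point on a rotating rigid body is the cross product of the angular velocity and the point's position relative to the rotation axis, $\textbf{v} = \boldsymbol{\omega} \times \textbf{r}$. A minimal sketch with made-up numbers:
# 1. Linear velocity from angular velocity: v = omega x r (illustrative values)
import numpy as np
omega = np.array([0.0, 0.0, 0.5])   # rad/s, rotation about the z-axis
r = np.array([2.0, 0.0, 0.0])       # point 2 m from the rotation axis
v = np.cross(omega, r)              # linear velocity of the point
print(v)                            # [0. 1. 0.] -> 1 m/s along y
#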
# 2. Find the shortest distance from a point to a line.
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
# Line representation is l_O + l_lambda * l_dir
lO = np.array([1, 1])
l_dir = np.array([1, 0])
l_dir = l_dir / np.linalg.norm(l_dir)
l_lambda = 4
pt = np.array([3, 3])
def pt_to_line_distance(pt, lO, l_dir):
    # |cross(pt - lO, l_dir)| is the parallelogram area = |l_dir| * perpendicular distance
    return np.linalg.norm(np.cross(pt - lO, l_dir)) / np.linalg.norm(l_dir)
def pt_to_segment_distance(pt, lO, l_dir, l_lambda):
    # Project pt onto the line, clamp the parameter to the segment, then measure the distance
    t = np.clip(np.dot(pt - lO, l_dir) / np.dot(l_dir, l_dir), 0, l_lambda)
    return np.linalg.norm(pt - (lO + t * l_dir))
pt_to_line = pt_to_line_distance(pt, lO, l_dir)
pt_to_segment = pt_to_segment_distance(pt, lO, l_dir, l_lambda)
print(f'point to line distance: {pt_to_line:.2f}, point to segment distance: {pt_to_segment:.2f}')
plt.plot(pt[0], pt[1], 'r*', markersize=10, label='point')
plt.plot([lO[0], lO[0]+l_lambda*l_dir[0]], [lO[1], lO[1]+l_lambda*l_dir[1]], color='g', label='line')
# Draw the perpendicular from pt to its closest point (the foot) on the line
foot = lO + np.dot(pt - lO, l_dir) * l_dir
plt.plot([pt[0], foot[0]], [pt[1], foot[1]], color='b', label='pt to line')
plt.legend()
plt.plot()
#
Matrices and Linear Equations ¶
Matrix Representations ¶
A $2 \times 2$ matrix can be written as a collection of two column vectors $\textbf{v} \in \mathbb{R}^2$, $\textbf{w} \in \mathbb{R}^2$:
$$ \begin{bmatrix} a_{11}, a_{12} \\ a_{21}, a_{22} \end{bmatrix} = \begin{bmatrix} \textbf{v}, \textbf{w} \end{bmatrix} $$
Similarly, a $3 \times 3$ matrix can be written as a collection of three column vectors $\textbf{u} \in \mathbb{R}^3$, $\textbf{v} \in \mathbb{R}^3$, $\textbf{w} \in \mathbb{R}^3$:
$$ \begin{bmatrix} a_{11}, a_{12}, a_{13} \\ a_{21}, a_{22}, a_{23} \\ a_{31}, a_{32}, a_{33} \end{bmatrix} = \begin{bmatrix} \textbf{u}, \textbf{v}, \textbf{w} \end{bmatrix} $$
In this post, we will only discuss $m \times m$ matrices. Setting numerical issues aside, knowing the properties of $m \times m$ matrices is enough to solve all kinds of linear equations.
Matrix Operations ¶
- Matrix multiplication
$C_{mm} = A_{mn}B_{nm}$
- Matrix times vector as a linear combination of all column vectors
$$ A\textbf{x} = \begin{bmatrix} a_{11}, a_{12}, a_{13} \\ a_{21}, a_{22}, a_{23} \\ a_{31}, a_{32}, a_{33} \end{bmatrix} \begin{bmatrix} {x_1}\\ {x_2}\\ {x_3} \end{bmatrix} = \begin{bmatrix} \textbf{u}, \textbf{v}, \textbf{w} \end{bmatrix} \begin{bmatrix} {x_1}\\ {x_2}\\ {x_3}\\ \end{bmatrix} = x_1\textbf{u} + x_2\textbf{v} + x_3\textbf{w} $$
- Matrix inverse
$A_{mm}A_{mm}^{-1} = I$; for an orthogonal matrix $Q$, $Q^{-1} = Q^{T}$
# Matrix Operations
import numpy as np
A = np.array([[1, 2, 3], [3, 2, 1], [1, 2, 1]])
B = np.array([[1, 2], [3, 2], [1, 1]])
x = np.array([1, 2, 3])
C = np.matmul(A, B)
Ax = A.dot(x)
A_inv = np.linalg.inv(A)
Q = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])  # a permutation matrix is orthogonal
Q_inv = np.linalg.inv(Q)
Q_t = Q.T
assert(np.allclose(Q_t, Q_inv))  # for an orthogonal matrix, the inverse equals the transpose
#
Matrix Properties ¶
- Linear independence
$\textbf{u}, \textbf{v}, \textbf{w}$ are linearly independent if and only if there is no non-zero $\textbf{x}$ such that:
$$ \begin{bmatrix} \textbf{u}, \textbf{v}, \textbf{w} \end{bmatrix} \begin{bmatrix} {x_1}\\ {x_2}\\ {x_3}\\ \end{bmatrix} = x_1\textbf{u} + x_2\textbf{v} + x_3\textbf{w} = \textbf{0} $$
- Determinant and rank
For matrix $A_{mm} = [\textbf{v}_1, \cdots, \textbf{v}_m]$,
$A^{-1} \text{ exists} \iff rank(A) = m \iff det(A) \neq 0 \iff \textbf{v}_1, \cdots, \textbf{v}_m \text{ are linearly independent}$ $rank(A) < m \iff det(A) = 0 \iff \textbf{v}_1, \cdots, \textbf{v}_m \text{ are linearly dependent}$
# Matrix Rank
import numpy as np
A = np.array([[1, 2, 3], [3, 2, 1], [1, 2, 1]])
print(f'matrix A: \n {A}')
print(f'Rank: {np.linalg.matrix_rank(A)}')
print(f'Determinant: {np.linalg.det(A):.2f}')
#
- Geometry Interpretations
All possible combinations of two independent vectors $\textbf{v}$ and $\textbf{w}$ in $\mathbb{R}^2$ form a two-dimensional space. In other words, any vector in $\mathbb{R}^2$ can be represented as a combination of $\textbf{v}$ and $\textbf{w}$. For example, a possible orthogonal basis of $\mathbb{R}^2$ is:
$$ \textbf{v} = \begin{bmatrix}1\\0 \end{bmatrix}, \textbf{w} = \begin{bmatrix}0\\1 \end{bmatrix} $$
Then $A_{22} = [\textbf{v}, \textbf{w}]$ will be a full rank matrix.
Suppose we have three vectors $\textbf{u} \in \mathbb{R}^3 $, $\textbf{v} \in \mathbb{R}^3$ and $\textbf{w} \in \mathbb{R}^3$ as below. Because $\textbf{w} = \textbf{u} + \textbf{v}$, $A_{33} = [\textbf{u}, \textbf{v}, \textbf{w}]$ will not be a full rank matrix. Its rank is $2$, which is equal to the number of linearly independent column vectors. We can easily see in this example that $\textbf{w}$ is redundant: all vectors in the $x$-$y$ plane of a Cartesian coordinate system can already be represented by $\textbf{u}$ and $\textbf{v}$.
$$ \textbf{u} = \begin{bmatrix}1\\0\\0 \end{bmatrix}, \textbf{v} = \begin{bmatrix}0\\1\\0 \end{bmatrix}, \textbf{w} = \begin{bmatrix}1\\1\\0\end{bmatrix} $$
For a matrix $A_{m \times n}$, the row space and the column space both have dimension $r$ (the rank). The nullspaces of $A$ and $A^T$ have dimensions $n - r$ and $m - r$, respectively.
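A quick NumPy check of the example above, stacking $\textbf{u}$, $\textbf{v}$, $\textbf{w}$ as columns:
# Rank of A = [u, v, w] where w = u + v
import numpy as np
u = np.array([1, 0, 0])
v = np.array([0, 1, 0])
w = u + v
A = np.column_stack([u, v, w])
print(np.linalg.matrix_rank(A))          # 2: only two columns are linearly independent
print(np.isclose(np.linalg.det(A), 0))   # True: a rank-deficient square matrix has zero determinant
#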
Linear Equations in Matrix Form ¶
Linear equations can be written in matrix form $A\textbf{x}=\textbf{b}$.
$$ a_{11}x_1 + a_{12}x_2 + a_{13}x_3 = b_1\\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 = b_2\\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 = b_3 $$$$ \begin{bmatrix} a_{11}, a_{12}, a_{13} \\ a_{21}, a_{22}, a_{23} \\ a_{31}, a_{32}, a_{33} \end{bmatrix} \begin{bmatrix} {x_1}\\ {x_2}\\ {x_3} \end{bmatrix}= \begin{bmatrix} {b_1}\\ {b_2}\\ {b_3} \end{bmatrix} $$
Solve $Ax=b$ ¶
How many solutions? ¶
$$ A_{mm}\textbf{x} = \textbf{b} $$
- If $A_{mm}$ is a full rank matrix (all columns are linearly independent, $det(A_{mm}) \neq 0$), then the system has a single unique solution. Since the column vectors form a basis of $\mathbb{R}^m$, we can always find a combination of them that equals $\textbf{b}$. For example, the column vectors of an identity matrix form an orthogonal basis, and you can always find the unique solution of the linear equations below as $\textbf{x} = [b_1, b_2, b_3]^T$:
$$ \begin{bmatrix} 1, 0, 0\\ 0, 1, 0\\ 0, 0, 1 \end{bmatrix} \textbf{x} = \begin{bmatrix} {b_1}\\ {b_2}\\ {b_3} \end{bmatrix} $$
- If $A_{mm}$ is not a full rank matrix, then the system has either no solution or infinitely many solutions. If the vector $\textbf{b}$ is in $A_{mm}$'s column space, then there are infinitely many solutions. In the example shown below, $x_3$ can take any value as long as $x_1 = 2 - x_3$ and $x_2 = 2 - x_3$. However, if $b = [0,0,1]^T$, there is no solution. Both cases are checked numerically in the sketch after this example.
$$ \begin{bmatrix} 1, 0, 1\\ 0, 1, 1\\ 0, 0, 0 \end{bmatrix} \textbf{x} = \begin{bmatrix} {2}\\ {2}\\ {0} \end{bmatrix} $$
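As referenced above, here is a minimal sketch of both cases, reusing the matrices from the two examples: `np.linalg.solve` returns the unique solution for the full-rank system, while `np.linalg.lstsq` handles the rank-deficient one and its residual reveals whether $\textbf{b}$ is reachable.
# How many solutions? Full-rank vs rank-deficient examples from above
import numpy as np
# Full rank: the identity system has the unique solution x = b
I = np.eye(3)
b = np.array([1.0, 2.0, 3.0])
print(np.linalg.solve(I, b))                    # [1. 2. 3.]
# Rank deficient: infinitely many solutions or none, depending on b
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.0]])
for b in (np.array([2.0, 2.0, 0.0]), np.array([0.0, 0.0, 1.0])):
    x, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm least-squares answer
    print(x, np.linalg.norm(A.dot(x) - b))      # residual ~0 only when b is in the column space
#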
Pseudo Inverse ¶
What can we do if there is no solution to the linear equations? Actually, whether or not a solution exists, we can formulate it as a minimization problem:
$$\textbf{x} = \underset{x\in \mathbb{R}^n}{\operatorname{argmin}}f(\textbf{x}) = \underset{x\in \mathbb{R}^n}{\operatorname{argmin}}\|A_{mn}\textbf{x} - \textbf{b}\|^2$$
By setting $\frac{\partial f(\textbf{x})}{\partial \textbf{x}} = 0$, we can get:
$$ A_{mn}^T A_{mn} \textbf{x} = A_{mn}^T \textbf{b} \\ A'_{nn}\textbf{x} = \textbf{b}',\ {A'_{nn}}^T = A'_{nn}$$
As you can see, we get $A'_{nn}$ as a symmetric square matrix! This is why we only discussed square matrices in this post.
If $A'_{nn}$ is a full rank matrix, we can get a unique solution:
$$\textbf{x} = {A'_{nn}}^{-1} \textbf{b}' = {(A_{mn}^T A_{mn})}^{-1} A_{mn}^T \textbf{b} $$
# Solve the normal equations (pseudo inverse)
import numpy as np
A = np.array([[1, 2, 3], [3, 2, 1], [1, 2, 1]])
b = np.array([2, 3, 4])
x = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(b)
#
Cholesky Decomposition ¶
Cholesky decomposition $A_{mm} = L_{mm}L_{mm}^T$ (where $L_{mm}$ is a lower triangular matrix) is a special version of LU decomposition that can only be applied to symmetric positive definite matrices. Solving $A\textbf{x} = \textbf{b}$ is equivalent to solving $L\textbf{y} = \textbf{b}$ and then $L^T\textbf{x} = \textbf{y}$.
Given a triangular system of equations, we can solve it using forward or backward substitution.
$$ \begin{bmatrix} a_{11}, 0, 0 \\ a_{21}, a_{22}, 0 \\ a_{31}, a_{32}, a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$
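As an illustration, here is a minimal forward-substitution sketch for a lower triangular system (solving $L^T\textbf{x} = \textbf{y}$ afterwards is the analogous backward pass); the matrix and right-hand side are arbitrary examples.
# Forward substitution for a lower triangular system L y = b (illustrative sketch)
import numpy as np
def forward_substitution(L, b):
    y = np.zeros(len(b))
    for i in range(len(b)):
        # Subtract the contributions of the already-solved unknowns, then divide by the diagonal
        y[i] = (b[i] - L[i, :i].dot(y[:i])) / L[i, i]
    return y
L = np.array([[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [4.0, 1.0, 5.0]])
b = np.array([2.0, 5.0, 14.0])
y = forward_substitution(L, b)
print(y)
assert(np.allclose(L.dot(y), b))
#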
# Cholesky decomposition
import numpy as np
from scipy.linalg import cho_factor, cho_solve
A = np.array([[1, 2, 3], [3, 2, 1], [1, 2, 1]])
b = np.array([2, 3, 4])
L = np.linalg.cholesky(A.T.dot(A))
x0 = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(b)
x1 = cho_solve((L, True), A.T.dot(b))
assert(np.allclose(x0, x1))
#
Singular Value Decomposition (SVD) ¶
What can we do if $A_{mn}^T A_{mn}$ is not a full rank matrix (i.e., when $rank(A_{mn}) < n$)?
- First method: find all independent columns, and solve the new linear equations that contain only these independent columns. The entries of $\textbf{x}$ corresponding to the remaining dependent columns are set to 0. I don't think you will find this answer in any textbook, but I feel this is the most intuitive method. All dependent columns are redundant since they can be expressed as combinations of the other independent columns; thus, we don't really need them.
- Second method: $\textbf{x} = VZU^T\textbf{b}$, where $A = U\Sigma V^T$ and
$$ z_i=\begin{cases} \frac{1}{\sigma_i}, & \text{if $\sigma_i \neq 0$}.\\ 0, & \text{otherwise}. \end{cases} $$
# Use SVD to solve linear equations
import numpy as np
import math
A = np.array([[1,2,0,0],
[0,0,0,0],
[0,0,2,0],
[0,0,0,1],
[0,0,0,0]])
print(f'A:\n {A}')
b = np.array([0, 0, 4, 1, 0])
print(f'b:\n {b}')
AtA = np.transpose(A).dot(A)
print(f'AtA:\n {AtA}')
print(f'AtA rank: {np.linalg.matrix_rank(AtA)}')
# np.linalg.inv(AtA) would raise LinAlgError here because AtA is singular
# SVD: A = U * S * Vh, so solving A x = b means solving U * S * Vh * x = b
u, s, vh = np.linalg.svd(A)
print('singular values:', s)
# x can be assembled from the non-zero singular components:
# x = sum_i (u_i . b / sigma_i) * v_i for sigma_i != 0
W = np.diag(s[s > 1e-10])  # keep only the non-zero singular values (tolerance guards against round-off)
k = W.shape[0]
V = np.transpose(vh)
x = V[:, 0:k].dot(np.linalg.inv(W)).dot(np.transpose(u[:, 0:k])).dot(b)
print(f'x: {x}')
# First method: drop the dependent column (column 1), solve with the remaining columns, and set x2 = 0
a = A[:, [0, 2, 3]]
AtA = np.transpose(a).dot(a)
INV = np.linalg.inv(AtA)
print(f'x (entries for columns 0, 2, 3): {INV.dot(np.transpose(a)).dot(b)}')
#