Covariance Visualization

We have an $m \times n$ matrix $A$ in which each row is a data sample with feature vector $f \in \mathbb{R}^n$. From $A$ we can form a covariance matrix, which captures the correlation among the features (the usual normalization factor $\frac{1}{m}$ or $\frac{1}{m-1}$ is omitted here): $$ M_{n \times n} = \begin{bmatrix} (f_1 - \bar{f})^T, (f_2 - \bar{f})^T, \cdots \end{bmatrix}_{n \times m} \begin{bmatrix} f_1 - \bar{f} \\ f_2 - \bar{f} \\ \vdots \end{bmatrix}_{m \times n} $$
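As a sanity check, this product (with the $\frac{1}{m-1}$ factor restored) matches NumPy's built-in covariance; a minimal sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 3
A = rng.normal(size=(m, n))          # m samples, n features (one sample per row)

f_bar = A.mean(axis=0)               # mean feature vector
A_centered = A - f_bar               # rows are f_i - f_bar
M = A_centered.T @ A_centered / (m - 1)

print(np.allclose(M, np.cov(A, rowvar=False)))  # True
```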

Assume the data samples follow a zero-mean Gaussian distribution: $$ \begin{align} x &\sim \mathcal{N}(0, M)\\ p(x) &= \frac{e^{-\frac{1}{2}x^TM^{-1}x}}{\sqrt{(2\pi)^n|M|}} \end{align} $$ With the eigendecomposition $M = Q\Sigma Q^T$ (where $Q$ is orthogonal, so $Q^{-1} = Q^T$ and $|M| = |\Sigma|$), we get $$ \begin{align} p(x) &= \frac{e^{-\frac{1}{2}x^T(Q \Sigma Q^T)^{-1}x}}{\sqrt{(2\pi)^n|\Sigma|}} \\ &= \frac{e^{-\frac{1}{2}x^T Q \Sigma^{-1} Q^Tx}}{\sqrt{(2\pi)^n|\Sigma|}} \\ \end{align} $$ Let $y = Q^Tx$, then: $$ \begin{align} p(x) &= \frac{e^{-\frac{1}{2}x^T Q \Sigma^{-1} Q^Tx}}{\sqrt{(2\pi)^n|\Sigma|}} \\ &= \frac{e^{-\frac{1}{2}y^T \Sigma^{-1} y}}{\sqrt{(2\pi)^n|\Sigma|}} \end{align} $$
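The identity $x^TM^{-1}x = y^T\Sigma^{-1}y$ behind this substitution is easy to verify numerically; a small sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric positive-definite covariance M.
B = rng.normal(size=(3, 3))
M = B @ B.T + 3 * np.eye(3)

eigvals, Q = np.linalg.eigh(M)       # M = Q diag(eigvals) Q^T
Sigma_inv = np.diag(1.0 / eigvals)

x = rng.normal(size=3)
y = Q.T @ x

print(np.isclose(x @ np.linalg.inv(M) @ x, y @ Sigma_inv @ y))  # True
print(np.isclose(np.linalg.det(M), np.prod(eigvals)))           # |M| = |Sigma|
```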

$y^T\Sigma^{-1}y$ follows a $\chi^2$ distribution with $n$ degrees of freedom, since each $y_i/\sigma_i$ is standard normal. For example, suppose $y \in \mathbb{R}^2$ and $\Sigma$'s diagonal elements are $\sigma_1^2$ and $\sigma_2^2$. If we choose a p-value of $0.05$ with $2$ degrees of freedom, the $\chi^2$ table gives $$ y^T\Sigma^{-1}y = 5.99 \\ \frac{y_1^2}{\sigma_1^2} + \frac{y_2^2}{\sigma_2^2} = 5.99 $$
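Rather than reading a printed table, the critical value can be computed with SciPy (assuming it is installed):

```python
from scipy.stats import chi2

# 95% quantile of the chi-squared distribution with 2 degrees of freedom.
print(chi2.ppf(0.95, df=2))  # about 5.99
```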

The equation is just an ellipse! But wait, this is an equation in $y$, while our original data are $x$. What should we do? Remember $y = Q^Tx$? Then $x = Qy$, so the axes of $y$'s frame, expressed in $x$'s frame, are the columns $q_1$ and $q_2$ of $Q$.

For example, with a $45°$ rotation between two frames $O$ and $P$, the true rotation matrix and the one recovered from the eigenvectors of a sample covariance (up to sign and sampling noise) print as:

```
Rotation matrix that transform points in O to P
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]
Rotation matrix that transform points in P to O
[[-0.70819367 -0.70601822]
 [ 0.70601822 -0.70819367]]
```
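The code for this cell is not shown in the converted post; a minimal sketch that produces this kind of output, assuming synthetic Gaussian samples and a $45°$ rotation, could be:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth rotation between the two frames (45 degrees); its columns
# are the principal axes q1, q2 expressed in the x (data) frame.
theta = np.pi / 4
Q_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
print("Rotation matrix that transform points in O to P")
print(Q_true)

# Draw samples y with independent axes (std 3 and 1), then rotate: x = Q y.
y = rng.normal(size=(10000, 2)) * np.array([3.0, 1.0])
x = y @ Q_true.T

# The eigenvectors of the sample covariance recover Q_true, up to column
# order, a sign flip per eigenvector, and sampling noise.
M = np.cov(x, rowvar=False)
eigvals, Q_est = np.linalg.eigh(M)
Q_est = Q_est[:, np.argsort(eigvals)[::-1]]  # largest variance first
print("Rotation matrix that transform points in P to O")
print(Q_est)
```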

$\Sigma$ contains the variance information of $y$. If we want to reduce the feature dimension, we can keep the first $k$ dimensions with the largest variances (assuming the eigenvalues in $\Sigma$ are sorted in descending order). We still need to transform the data from $x$ to $y$ using $y=Q^Tx$; in matrix form, with $Q[:, 0:k]$ denoting the first $k$ columns of $Q$: $$ \tilde{A}^T = Q^TA^T \\ \tilde{A}^T \approx Q[:, 0:k]^TA^T \\ \tilde{A}_{(m\times k)} \approx AQ[:, 0:k] $$
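Putting this together, a minimal PCA-style reduction along these lines (the function name and synthetic data are illustrative, not from the original notebook) might look like:

```python
import numpy as np

def pca_reduce(A, k):
    """Project the rows of A onto the k largest-variance directions.

    A: (m, n) data matrix, one sample per row.
    Returns A_tilde (m, k) and the basis Q_k (n, k).
    """
    A_centered = A - A.mean(axis=0)
    M = np.cov(A_centered, rowvar=False)     # (n, n) covariance
    eigvals, Q = np.linalg.eigh(M)           # eigenvalues in ascending order
    Q = Q[:, np.argsort(eigvals)[::-1]]      # sort columns by descending variance
    Q_k = Q[:, :k]                           # Q[:, 0:k] in the text's notation
    return A_centered @ Q_k, Q_k

# Example: reduce 5-dimensional samples to 2 dimensions.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
A_tilde, Q_k = pca_reduce(A, k=2)
print(A_tilde.shape)  # (200, 2)
```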

Another way to view the dimension reduction is through the density itself: $$ p(x) = \frac{e^{-\frac{1}{2}x^TQ\Sigma^{-1}Q^Tx}}{\sqrt{(2\pi)^n|\Sigma|}} \\ = \frac{e^{-\frac{1}{2}x^T[q_1, q_2, \cdots, q_n]\Sigma^{-1}[q_1, q_2, \cdots, q_n]^Tx}}{\sqrt{(2\pi)^n|\Sigma|}} $$

Setting $q_{k+1}, \cdots, q_n$ and the corresponding $\sigma_{k+1}, \cdots, \sigma_n$ to zero (similar to using a truncated Singular Value Decomposition to approximate a matrix) gives the same result as the projection above.
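For a centered data matrix the right singular vectors coincide with the covariance eigenvectors, so truncating either yields the same subspace; a quick numerical check with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features
A = A - A.mean(axis=0)                                   # center the data

# Eigen route: covariance eigenvectors, largest variance first.
eigvals, Q = np.linalg.eigh(np.cov(A, rowvar=False))
Q = Q[:, np.argsort(eigvals)[::-1]]

# SVD route: right singular vectors of the centered data matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
print(np.allclose(np.abs(Q[:, :k]), np.abs(Vt[:k].T)))  # True (up to signs)
```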

This blog is converted from covariance-visualization.ipynb
Written on January 2, 2022