Covariance Visualization

We have a matrix $A$ of dimension $m \times n$. Each row is a data sample with a feature vector $f \in \mathbb{R}^n$. We can compute a covariance matrix that represents the correlations among the different features:

$$M_{n \times n} = \begin{bmatrix} (f_1 - \bar{f})^T, (f_2 - \bar{f})^T, \cdots \end{bmatrix}_{n \times m} \begin{bmatrix} f_1 - \bar{f} \\ f_2 - \bar{f} \\ \vdots \end{bmatrix}_{m \times n}$$
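As a quick sanity check, here is a minimal NumPy sketch of this construction. Note that the equation above omits the usual $1/(m-1)$ normalization factor; the code includes it so the result matches `np.cov`:

```python
import numpy as np

rng = np.random.default_rng(3)

# m samples, n features; rows of A are the feature vectors f_i.
m, n = 100, 4
A = rng.normal(size=(m, n))

# Covariance as the product of the two centered matrices above,
# normalized by m - 1 (matching np.cov's default ddof=1).
f_bar = A.mean(axis=0)
C = A - f_bar                      # rows are f_i - f_bar
M = (C.T @ C) / (m - 1)            # n x n

print(np.allclose(M, np.cov(A, rowvar=False)))  # True
```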

Assume the data samples follow a Gaussian distribution:

$$p(x) \sim N(0, M), \qquad p(x) = \frac{e^{-\frac{1}{2}x^T M^{-1} x}}{\sqrt{(2\pi)^n |M|}}$$

With the eigendecomposition $M = Q \Sigma Q^T$, we get

$$\begin{align} p(x) &= \frac{e^{-\frac{1}{2}x^T (Q \Sigma Q^T)^{-1} x}}{\sqrt{(2\pi)^n |\Sigma|}} \\ &= \frac{e^{-\frac{1}{2}x^T Q \Sigma^{-1} Q^T x}}{\sqrt{(2\pi)^n |\Sigma|}} \end{align}$$

Let $y = Q^T x$; then:

$$\begin{align} p(x) &= \frac{e^{-\frac{1}{2}x^T Q \Sigma^{-1} Q^T x}}{\sqrt{(2\pi)^n |\Sigma|}} \\ &= \frac{e^{-\frac{1}{2}y^T \Sigma^{-1} y}}{\sqrt{(2\pi)^n |\Sigma|}} \end{align}$$
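The two identities used in this derivation, $(Q \Sigma Q^T)^{-1} = Q \Sigma^{-1} Q^T$ and $|M| = |\Sigma|$, can be verified numerically with a small sketch (the random positive-definite matrix here is just an illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Random symmetric positive-definite matrix, as a covariance would be.
B = rng.normal(size=(3, 3))
M = B @ B.T + 3 * np.eye(3)

# Eigendecomposition M = Q Sigma Q^T (Q orthogonal, Sigma diagonal).
sigma, Q = np.linalg.eigh(M)

# The inverse in the exponent: (Q Sigma Q^T)^{-1} = Q Sigma^{-1} Q^T.
lhs = np.linalg.inv(M)
rhs = Q @ np.diag(1.0 / sigma) @ Q.T
print(np.allclose(lhs, rhs))      # True

# And |M| = |Sigma|, so the normalizing constants match as well.
print(np.isclose(np.linalg.det(M), np.prod(sigma)))  # True
```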

$y^T \Sigma^{-1} y$ follows a $\chi^2$ distribution. For example, suppose $y \in \mathbb{R}^2$ and $\Sigma$'s diagonal elements are $\sigma_1^2$ and $\sigma_2^2$. If we choose the p-value as 0.05 with 2 degrees of freedom, then from the $\chi^2$ table we get

$$y^T \Sigma^{-1} y = 5.99, \qquad \frac{y_1^2}{\sigma_1^2} + \frac{y_2^2}{\sigma_2^2} = 5.99$$
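The 5.99 table value even has a closed form for 2 degrees of freedom, since the $\chi^2_2$ CDF is $1 - e^{-x/2}$:

```python
import math

# For 2 degrees of freedom the chi-square CDF is 1 - exp(-x/2),
# so the 95% quantile (p-value 0.05) is x = -2 ln(0.05).
x = -2 * math.log(0.05)
print(round(x, 2))   # 5.99
```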

The equation is just an ellipse! But wait, this is an equation in $y$, while our original data are $x$; what should we do? Remember $y = Q^T x$? Then we have $x = Qy$, so the axes of $y$'s frame expressed in $x$'s frame are $q_1$ and $q_2$, the columns of $Q$.

```
Rotation matrix that transform points in O to P
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]
Rotation matrix that transform points in P to O
[[-0.70819367 -0.70601822]
 [ 0.70601822 -0.70819367]]
```
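The notebook cell that produced this output did not survive the conversion. A minimal sketch of what such a cell might compute, assuming a 45° rotation and an eigendecomposition of the sample covariance (the data-generation setup and names here are my own illustration, not the original code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Exact 45-degree rotation matrix mapping frame O to frame P.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print("Rotation matrix that transform points in O to P")
print(R)

# Correlated 2-D Gaussian samples: independent features with
# different variances, then rotated by R.
y = rng.normal(size=(5000, 2)) * np.array([3.0, 1.0])
x = y @ R.T

# The eigenvectors of the sample covariance recover the rotation
# up to sign and ordering: x = Q y  =>  y = Q^T x.
M = np.cov(x, rowvar=False)
eigvals, Q = np.linalg.eigh(M)
print("Rotation matrix that transform points in P to O")
print(Q.T)
```

Because the covariance is estimated from noisy samples, the recovered matrix only approximates the exact rotation, which is why the second matrix in the output above is close to, but not exactly, the transpose of the first.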

$\Sigma$ contains the variance information of $y$. If we want to reduce the feature dimension, we may pick the first couple of dimensions with big $\sigma$. We still need to transform the data from $x$ to $y$ using $y = Q^T x$. The matrix form will be

$$\tilde{A}^T = Q^T A^T \\ \tilde{A}^T \approx Q[:, 0\!:\!k]^T A^T \\ \tilde{A}_{(m \times k)} \approx A\, Q[:, 0\!:\!k]$$
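This is exactly PCA-style dimension reduction, and the last line translates directly into code. A sketch with toy data (note that `np.linalg.eigh` returns eigenvalues in ascending order, so the columns must be reordered before truncating):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: m samples, n features (rows are samples, as in the text).
A = rng.normal(size=(200, 5))
A = A - A.mean(axis=0)            # center so M is the covariance

# Covariance and its eigendecomposition M = Q Sigma Q^T.
M = np.cov(A, rowvar=False)
eigvals, Q = np.linalg.eigh(M)    # eigh: ascending eigenvalues

# Reorder so the largest-variance directions come first,
# then keep k of them: A_tilde (m x k) = A Q[:, 0:k].
order = np.argsort(eigvals)[::-1]
Q = Q[:, order]
k = 2
A_tilde = A @ Q[:, :k]
print(A_tilde.shape)              # (200, 2)
```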

Another way to view the dimension reduction is through this:

$$\begin{align} P(x) &= \frac{e^{-\frac{1}{2}x^T Q \Sigma^{-1} Q^T x}}{\sqrt{(2\pi)^n |\Sigma|}} \\ &= \frac{e^{-\frac{1}{2}x^T [q_1, q_2, \cdots, q_n] \Sigma^{-1} [q_1, q_2, \cdots, q_n]^T x}}{\sqrt{(2\pi)^n |\Sigma|}} \end{align}$$

We can set $q_{k+1}, \cdots$ and $\sigma_{k+1}, \cdots$ to zero (similar to using the Singular Value Decomposition to approximate a matrix), and we get the same result.

This blog post is converted from covariance-visualization.ipynb
Written on January 2, 2022