Mutual Information

Entropy

$$ H(x) = -\sum_i{p(x_i)\log(p(x_i))} $$
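
As a quick sanity check, here is a minimal NumPy sketch of this definition (in nats); the distribution `p` below is just a made-up example.

```python
import numpy as np

def entropy(p):
    """Shannon entropy: H(x) = -sum_i p(x_i) log p(x_i), in nats."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with p(x_i) = 0 contribute 0 by convention
    return -np.sum(p * np.log(p))

p = np.array([0.5, 0.25, 0.25])  # example distribution
print(entropy(p))                # ~1.04 nats
```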

Conditional Entropy

$$ \begin{align} H(x|y) &= -\sum_{i,j}{p(x_i, y_j)\log{p(x_i|y_j)}} \\ &= -\sum_{i,j}{p(x_i, y_j)\log{\frac{p(x_i,y_j)}{p(y_j)}}} \end{align} $$
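
A small sketch of this formula, assuming the joint distribution is given as a matrix with rows indexing $x$ and columns indexing $y$; the matrix values are illustrative only.

```python
import numpy as np

def conditional_entropy(p_xy):
    """H(x|y) = -sum_{i,j} p(x_i, y_j) log( p(x_i, y_j) / p(y_j) )."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal p(y_j), summing out x
    mask = p_xy > 0
    return -np.sum(p_xy[mask] * np.log((p_xy / p_y)[mask]))

# hypothetical joint distribution p(x_i, y_j): rows are x, columns are y
p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])
print(conditional_entropy(p_xy))  # ~0.59 nats
```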

Joint Entropy

$$ H(x, y) = -\sum_{i,j}{p(x_i, y_j)\log(p(x_i, y_j))} $$
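
The same example joint matrix can be plugged into a direct translation of the joint-entropy sum; again, a sketch with made-up numbers.

```python
import numpy as np

def joint_entropy(p_xy):
    """H(x, y) = -sum_{i,j} p(x_i, y_j) log p(x_i, y_j)."""
    p = np.asarray(p_xy, dtype=float).ravel()
    p = p[p > 0]  # zero-probability cells contribute 0
    return -np.sum(p * np.log(p))

p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])
print(joint_entropy(p_xy))  # ~1.28 nats
```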

Mutual Information

$$ MI(x, y) = -\sum_{i,j}{p(x_i, y_j)\log(\frac{p(x_i)p(y_j)}{p(x_i, y_j)})} $$
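
A direct translation of this sum, assuming the same rows-are-$x$, columns-are-$y$ layout for the joint matrix; the numbers are illustrative.

```python
import numpy as np

def mutual_information(p_xy):
    """MI(x, y) = -sum_{i,j} p(x_i, y_j) log( p(x_i) p(y_j) / p(x_i, y_j) )."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal p(x_i)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal p(y_j)
    outer = p_x * p_y                      # p(x_i) p(y_j)
    mask = p_xy > 0
    return -np.sum(p_xy[mask] * np.log(outer[mask] / p_xy[mask]))

p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])
print(mutual_information(p_xy))  # ~0.086 nats
```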

Cross Entropy

$$ H(p, q) = -\sum_x{p(x)\log(q(x))} $$
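
A sketch of cross entropy between two discrete distributions on the same support; `p` and `q` below are arbitrary examples.

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log q(x)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # only terms with p(x) > 0 contribute
    return -np.sum(p[mask] * np.log(q[mask]))

p = np.array([0.5, 0.25, 0.25])  # true distribution
q = np.array([0.4, 0.4, 0.2])    # approximation
print(cross_entropy(p, q))       # ~1.09 nats
```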

Kullback–Leibler divergence

$p$ is the true distribution while $q$ is the approximation.

$$ D_{KL}(p \| q) = -\sum_x{p(x)\log(\frac{q(x)}{p(x)})} $$
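
The same two example distributions give a quick check of this formula; as before, a sketch rather than a library implementation.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) = -sum_x p(x) log( q(x) / p(x) )."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask] / p[mask]))

p = np.array([0.5, 0.25, 0.25])  # true distribution
q = np.array([0.4, 0.4, 0.2])    # approximation
print(kl_divergence(p, q))       # ~0.05 nats; always >= 0, and 0 only when p == q
```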

Connections

$$ H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y) $$

$$ MI(X,Y) = H(X) + H(Y) - H(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) $$

$$ MI(X,Y) = D_{KL}(p(x,y) \| p(x)p(y)) $$

$$ H(p, q) = H(p) + D_{KL}(p \| q) $$
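
These identities can be verified numerically. The snippet below assumes the sketch functions and example distributions from the previous sections are already defined in the same session.

```python
import numpy as np

p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# H(X,Y) = H(X) + H(Y|X): transpose the joint matrix to get H(Y|X)
print(np.isclose(joint_entropy(p_xy), entropy(p_x) + conditional_entropy(p_xy.T)))

# MI(X,Y) = H(X) + H(Y) - H(X,Y) = H(X) - H(X|Y)
print(np.isclose(mutual_information(p_xy),
                 entropy(p_x) + entropy(p_y) - joint_entropy(p_xy)))
print(np.isclose(mutual_information(p_xy), entropy(p_x) - conditional_entropy(p_xy)))

# MI(X,Y) = D_KL( p(x,y) || p(x)p(y) )
print(np.isclose(mutual_information(p_xy),
                 kl_divergence(p_xy.ravel(), np.outer(p_x, p_y).ravel())))

# H(p, q) = H(p) + D_KL(p || q)
p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.4, 0.2])
print(np.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q)))
```
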
This blog is converted from mutual-information.ipynb
Written on June 3, 2021