The Multivariate Normal Distribution

The multivariate normal (Gaussian) distribution of a random vector $x \in \mathbb{R}^n$ is written as $x \sim \mathcal{N}(\mu, \Sigma)$, where $\mu \in \mathbb{R}^n$ is the mean vector and $\Sigma \in \mathbb{R}^{n \times n}$ is the covariance matrix. $\Sigma$ is symmetric and positive semi-definite ($\Sigma \succeq 0$); the density below additionally requires $\Sigma$ to be positive definite, so that $\Sigma^{-1}$ exists and $\vert\Sigma\vert > 0$. The multivariate normal distribution generalizes the one-dimensional normal distribution to $n$ dimensions. Its probability density function is given by:

$$ p(x) = \frac{1}{(2\pi)^{n/2}\vert\Sigma\vert^{1/2}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1} (x-\mu)\right) $$
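As a rough illustration, the density can be evaluated numerically with NumPy. This is a minimal sketch, not a reference implementation: the function name `mvn_pdf` and the example values are assumptions made for illustration, and the code assumes $\Sigma$ is symmetric positive definite.

```python
import numpy as np


def mvn_pdf(x, mu, sigma):
    """Evaluate the N(mu, sigma) density at x.

    Hypothetical helper for illustration; assumes sigma is symmetric
    positive definite so that its inverse and determinant are well defined.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = mu.shape[0]
    diff = x - mu
    # Solve sigma @ z = diff rather than forming sigma^{-1} explicitly.
    z = np.linalg.solve(sigma, diff)
    quad = diff @ z  # (x - mu)^T sigma^{-1} (x - mu)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)


# Illustrative values (not from the text): a 2-D standard normal.
mu = np.array([0.0, 0.0])
sigma = np.eye(2)
p_at_mean = mvn_pdf(mu, mu, sigma)  # equals 1 / (2 * pi) at the mean
```

For higher dimensions the determinant can underflow or overflow, so a log-density built on `np.linalg.slogdet` is usually the safer choice.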

For the multivariate normal distribution, the expectation is given by:

$$ \mathbb{E}[x] = \mu $$

Also, the covariance matrix, which generalizes the notion of variance, is given by:

$$ \begin{eqnarray} \Sigma &=& \mathbb{E}[(x-\mathbb{E}[x])(x-\mathbb{E}[x])^T] \\ &=& \mathbb{E}[xx^T-x\mathbb{E}[x^T]-\mathbb{E}[x]x^T+\mathbb{E}[x]\mathbb{E}[x]^T]\\ &=& \mathbb{E}[xx^T]-\mathbb{E}[x]\mathbb{E}[x^T]-\mathbb{E}[x]\mathbb{E}[x^T]+\mathbb{E}[x]\mathbb{E}[x^T]\mathbb{E}[1]\\ &=& \mathbb{E}[xx^T]-\mathbb{E}[x]\mathbb{E}[x^T] \end{eqnarray} $$
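The identity $\Sigma = \mathbb{E}[xx^T]-\mathbb{E}[x]\mathbb{E}[x^T]$ can be checked empirically by replacing expectations with sample averages. The sketch below uses arbitrary example values for `mu` and `sigma` (assumptions, not taken from the text) and a fixed seed for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed for this example).
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

samples = rng.multivariate_normal(mu, sigma, size=200_000)  # shape (N, 2)
n_samples = samples.shape[0]

mean_hat = samples.mean(axis=0)                      # estimate of E[x]
second_moment = samples.T @ samples / n_samples      # estimate of E[x x^T]

# Covariance via the final line of the derivation: E[x x^T] - E[x] E[x]^T.
cov_identity = second_moment - np.outer(mean_hat, mean_hat)

# Covariance via the definition: E[(x - E[x])(x - E[x])^T].
centered = samples - mean_hat
cov_definition = centered.T @ centered / n_samples
```

The two estimates agree up to floating-point error, and both approach `sigma` as the number of samples grows.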

A multivariate normal distribution with a zero mean vector and the identity matrix as its covariance matrix, $\mathcal{N}(0, I)$, is known as the standard multivariate normal distribution.

Marginals and Conditionals of Gaussians

Suppose that we have a vector-valued random variable $x\sim \mathcal{N}(\mu,\Sigma)$ where:

$$ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} $$

Here $x_1, \mu_1 \in \mathbb{R}^{n_1}$, $x_2, \mu_2 \in \mathbb{R}^{n_2}$, $x, \mu \in \mathbb{R}^{n_1+n_2}$, $\Sigma_{11} \in \mathbb{R}^{n_1 \times n_1}$, $\Sigma_{12} \in \mathbb{R}^{n_1 \times n_2}$, $\Sigma_{21} \in \mathbb{R}^{n_2 \times n_1}$, $\Sigma_{22} \in \mathbb{R}^{n_2 \times n_2}$ and $\Sigma \in \mathbb{R}^{(n_1+n_2) \times (n_1+n_2)}$.

Note that $\mathbb{E}[x_1]=\mu_1$ and $\mathbb{E}[x_2]=\mu_2$. Also, note that:

$$ \begin{eqnarray} Cov(x) &=& \Sigma\\ &=& \mathbb{E}[(x-\mu)(x-\mu)^T]\\ &=& \mathbb{E}\left[\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2\end{bmatrix}\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2\end{bmatrix}^T\right]\\ &=& \mathbb{E}\left[\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2\end{bmatrix}\begin{bmatrix} (x_1 - \mu_1)^T & (x_2 - \mu_2)^T\end{bmatrix}\right]\\ &=& \mathbb{E}\left[\begin{bmatrix} (x_1 - \mu_1)(x_1 - \mu_1)^T & (x_1 - \mu_1)(x_2 - \mu_2)^T \\ (x_2 - \mu_2)(x_1 - \mu_1)^T & (x_2 - \mu_2)(x_2 - \mu_2)^T\end{bmatrix}\right]\\ &=& \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \end{eqnarray} $$

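Note that $Cov(x_1) = \mathbb{E}[(x_1 - \mu_1)(x_1 - \mu_1)^T] = \Sigma_{11}$ and similarly $Cov(x_2) = \mathbb{E}[(x_2 - \mu_2)(x_2 - \mu_2)^T] = \Sigma_{22}$. Since $\Sigma$ is symmetric, $\Sigma_{21} = \Sigma_{12}^T$.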

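Note that $\mathcal{N}(\mu,\Sigma)$ is the joint distribution of $x_1$ and $x_2$. The blocks above give the mean and covariance of each sub-vector, and it may be shown that the marginals of a Gaussian are themselves Gaussian, so the marginal distributions of $x_1$ and $x_2$ are $\mathcal{N}(\mu_1,\Sigma_{11})$ and $\mathcal{N}(\mu_2,\Sigma_{22})$ respectively.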

It may also be shown that the conditional distribution $x_1 \vert x_2$ is given by $\mathcal{N}(\mu_{1 \vert 2},\Sigma_{1 \vert 2})$, where:

$$ \begin{eqnarray} \mu_{1 \vert 2} &=& \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)\\ \Sigma_{1 \vert 2} &=& \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \end{eqnarray} $$
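The conditioning formulas translate almost directly into NumPy. The following is a minimal sketch under stated assumptions: the helper name `condition_gaussian`, its argument layout (the first `n1` coordinates form $x_1$), and the example numbers are all hypothetical, and $\Sigma_{22}$ is assumed to be invertible.

```python
import numpy as np


def condition_gaussian(mu, sigma, n1, x2):
    """Return (mu_{1|2}, Sigma_{1|2}) for x1 | x2 under x ~ N(mu, sigma).

    Hypothetical helper: the first n1 coordinates are x1, the rest are x2.
    Assumes sigma is symmetric and its lower-right block is invertible.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    mu1, mu2 = mu[:n1], mu[n1:]
    s11 = sigma[:n1, :n1]
    s12 = sigma[:n1, n1:]
    s21 = sigma[n1:, :n1]
    s22 = sigma[n1:, n1:]

    # gain = s12 @ inv(s22), obtained with a linear solve (s22 is symmetric).
    gain = np.linalg.solve(s22, s12.T).T

    mu_cond = mu1 + gain @ (x2 - mu2)
    sigma_cond = s11 - gain @ s21
    return mu_cond, sigma_cond


# Illustrative 2-D example with correlation 0.8 (values assumed).
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
mu_cond, sigma_cond = condition_gaussian(mu, sigma, n1=1, x2=[1.0])
# mu_cond is approximately [0.8], sigma_cond approximately [[0.36]]
```

In this example, observing $x_2 = 1$ shifts the mean of $x_1$ toward the observation and shrinks its variance from $1$ to $1 - 0.8^2 = 0.36$. The conditional covariance does not depend on the observed value of $x_2$.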