Diffusion Super Resolution

Forward Process

The forward process gradually converts the data distribution into a standard Gaussian distribution by adding a small amount of noise at each step.

q \left( \mathbf{x}_t \mid \mathbf{x}_{t-1} \right) = \mathcal{N} \left( \mathbf{x}_t ; \; \sqrt{1 - \beta_t} \; \mathbf{x}_{t-1}, \; \beta_t \, \mathbf{I} \right)

The covariance matrix $\mathbf{\Sigma} = \beta_t \, \mathbf{I}$ is diagonal, which implies that the noise added to each dimension is independent.
$\beta_t$ typically starts close to 0 for small $t$ and grows with $t$, so:
1. The variance is small in the early steps and gradually becomes larger.
2. Since $\sqrt{1 - \beta_t}$ is close to $1$, the mean stays close to the previous sample $\mathbf{x}_{t-1}$ in the early stages.
3. Since the covariance matrix is the product of a scalar $\beta_t$ and the identity matrix $\mathbf{I}$, the noise is isotropic: every dimension receives the same amount of noise.
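
The transition above can be sketched as a single sampling step. This is a minimal illustration rather than a reference implementation: the function name `forward_step`, the toy input, and the value `beta_t = 1e-4` are assumptions chosen for the example.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    # Independent unit-variance noise for every dimension (isotropic covariance).
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
x0 = np.full(100_000, 3.0)                    # toy "data": every coordinate equals 3 (assumed)
x1 = forward_step(x0, beta_t=1e-4, rng=rng)   # assumed small early-step beta

# With a small beta_t the sample mean stays near x0 and the spread is near sqrt(beta_t).
print(x1.mean(), x1.std())
```

With such a small $\beta_t$, the printed mean should stay very close to 3 and the standard deviation should be close to $\sqrt{\beta_t} = 0.01$, matching points 1 and 2 above.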

Goal for Forward Process

Theoretically, the distribution of the final step, $\mathbf{x}_T$, must approximate the standard Gaussian distribution. The reverse process starts from a sample drawn from the distribution of the last step of the forward process, so aligning that final distribution with a standard Gaussian makes the starting sample easy to draw.

q(\mathbf{x}_T) \approx p_{prior}(\mathbf{x}_T) = \mathcal{N}(\mathbf{x}_T; \, \mathbf{0}, \, \mathbf{I})

As $T \rightarrow \infty$, the distribution of $\mathbf{x}_T$ approaches the standard Gaussian distribution.
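
A quick numerical check of this claim is to run the chain on data that is clearly not Gaussian and look at the final statistics. The sketch below assumes a linear schedule from $10^{-4}$ to $0.02$ over $T = 1000$ steps and bimodal toy data; neither choice comes from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # assumed linear noise schedule

# Bimodal toy data with mean 3 and standard deviation 2 (assumed, far from N(0, 1)).
x = rng.choice([1.0, 5.0], size=50_000)

# Apply q(x_t | x_{t-1}) for t = 1, ..., T.
for beta_t in betas:
    noise = rng.standard_normal(x.shape)
    x = np.sqrt(1.0 - beta_t) * x + np.sqrt(beta_t) * noise

# After T steps the samples should have roughly zero mean and unit standard deviation.
print(x.mean(), x.std())
```

Under these assumptions, the empirical mean should end up near 0 and the standard deviation near 1, even though the starting distribution had neither property.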

Markov Chain Forward Process

Because the forward process is Markovian, meaning each step depends only on the immediately preceding sample, the joint distribution of the entire forward process from the original data sample $\mathbf{x}_0$ to the final step $\mathbf{x}_T$ can be written as

\begin{align*}q(\mathbf{x}_{1:T} \mid \mathbf{x}_0)&= q(\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_T \mid \mathbf{x}_0)\\&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_{2:T} \mid \mathbf{x}_1)\\&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_2 \mid \mathbf{x}_1) \, q(\mathbf{x}_{3:T} \mid \mathbf{x}_2)\\&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_2 \mid \mathbf{x}_1) \dots q(\mathbf{x}_T \mid \mathbf{x}_{T-1})\\&= \prod_{t=1}^T q(\mathbf{x}_t \mid \mathbf{x}_{t-1})\end{align*}
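
In log space the product becomes a sum of per-step Gaussian log-densities. The following sketch evaluates $\log q(\mathbf{x}_{1:T} \mid \mathbf{x}_0)$ for a given trajectory; the helper names `gaussian_log_density` and `trajectory_log_prob` and the array layout are assumptions made for illustration.

```python
import numpy as np

def gaussian_log_density(x, mean, var):
    """Log-density of N(mean, var * I) evaluated at x."""
    d = x.size
    return -0.5 * (d * np.log(2.0 * np.pi * var) + np.sum((x - mean) ** 2) / var)

def trajectory_log_prob(xs, betas):
    """log q(x_{1:T} | x_0) as a sum over Markov transitions.

    xs has shape (T + 1, d) with xs[0] = x_0; betas has shape (T,) with betas[t - 1] = beta_t.
    """
    total = 0.0
    for t in range(1, len(xs)):
        beta_t = betas[t - 1]
        mean = np.sqrt(1.0 - beta_t) * xs[t - 1]   # mean of q(x_t | x_{t-1})
        total += gaussian_log_density(xs[t], mean, beta_t)
    return total

# Tiny hand-made trajectory (assumed values): T = 2 steps in d = 2 dimensions.
xs = np.array([[1.0, -1.0], [0.9, -0.8], [0.7, -0.9]])
betas = np.array([0.1, 0.2])
print(trajectory_log_prob(xs, betas))
```

Each term in the loop depends only on the pair $(\mathbf{x}_{t-1}, \mathbf{x}_t)$, which is exactly what the Markov factorization states.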

Although the forward process is a Markov process, the distribution at any time step can still be calculated directly.