
Diffusion Super Resolution

Math Details

Forward Process

The forward process gradually converts the data distribution into a standard Gaussian distribution.

q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\left(\mathbf{x}_t ; \; \sqrt{1 - \beta_t}\, \mathbf{x}_{t-1}, \; \beta_t \mathbf{I}\right)
  1. The covariance matrix \mathbf{\Sigma} = \beta_t \mathbf{I} is diagonal, implying that the noise added to each dimension is independent.
  2. \beta_t typically starts close to 0 for small t
    1. The variance is small for the early steps and gradually grows larger.
    2. Since \sqrt{1 - \beta_t} is close to 1, the mean stays close to the previous step \mathbf{x}_{t-1} in the early stages.
    3. Since the covariance matrix is the product of a scalar \beta_t and the identity matrix \mathbf{I}, the noise added to each dimension is isotropic (see the code sketch after this list).
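As a concrete illustration of the transition above, here is a minimal NumPy sketch of a single forward step, x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, \epsilon with \epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I}). The function name forward_step and the tiny early-stage beta value are illustrative assumptions, not taken from any particular implementation.

```python
import numpy as np

def forward_step(x_prev: np.ndarray, beta_t: float, rng: np.random.Generator) -> np.ndarray:
    """Sample x_t ~ q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    eps = rng.standard_normal(x_prev.shape)   # isotropic noise, independent in every dimension
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps

# Example: one noising step on a toy 8x8 "image" with a small early-stage beta.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))
x1 = forward_step(x0, beta_t=1e-4, rng=rng)   # x1 stays very close to x0 because beta_1 is tiny
```

Because \beta_1 is small, the mean term \sqrt{1 - \beta_1}\, x_0 barely shrinks x_0 and the added noise has tiny variance, matching points 2.1 and 2.2 above.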

Goal for Forward Process

Theoretically, the distribution of the final step \mathbf{x}_T must approximate the standard Gaussian distribution. The reverse process starts from a sample drawn from the distribution of the last step of the forward process, so aligning that final distribution with the standard Gaussian makes it easy to draw the starting sample.

q(\mathbf{x}_T) \approx p_{prior}(\mathbf{x}_T) = \mathcal{N}(\mathbf{x}_T; \, \mathbf{0}, \, \mathbf{I})

As T \rightarrow \infty, the distribution of \mathbf{x}_T approaches the standard Gaussian distribution.
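To see this convergence numerically, the sketch below chains the single-step transition over T steps with a linear beta schedule (both the schedule and T = 1000 are illustrative assumptions) and checks that the empirical mean and standard deviation of \mathbf{x}_T end up close to 0 and 1.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """One step of q(x_t | x_{t-1})."""
    eps = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # illustrative linear schedule

x = 5.0 + rng.standard_normal((64, 64))   # start far from N(0, I): mean ~5, std ~1
for beta_t in betas:                      # run the whole chain x_0 -> x_T
    x = forward_step(x, beta_t, rng)

print(round(float(x.mean()), 3), round(float(x.std()), 3))  # both close to 0 and 1
```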

Markov Chain Forward Process

Since the forward process is Markovian (each step depends only on the immediately preceding sample), the entire forward process from the original data sample \mathbf{x}_0 to the final step \mathbf{x}_T can be written as

\begin{align*}
q(\mathbf{x}_{1:T} \mid \mathbf{x}_0)
&= q(\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_T \mid \mathbf{x}_0) \\
&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_{2:T} \mid \mathbf{x}_1) \\
&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_2 \mid \mathbf{x}_1) \, q(\mathbf{x}_{3:T} \mid \mathbf{x}_2) \\
&= q(\mathbf{x}_1 \mid \mathbf{x}_0) \, q(\mathbf{x}_2 \mid \mathbf{x}_1) \cdots q(\mathbf{x}_T \mid \mathbf{x}_{T-1}) \\
&= \prod_{t=1}^T q(\mathbf{x}_t \mid \mathbf{x}_{t-1})
\end{align*}
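Because of this factorization, the log-density of a whole trajectory is just the sum of the per-step Gaussian log-densities. The sketch below illustrates this; the helper names (gaussian_logpdf, trajectory_log_q) and the short schedule are assumptions made for the example, not part of the original material.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Sum over all dimensions of log N(x; mean, var * I)."""
    return float(np.sum(-0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)))

def trajectory_log_q(xs, betas):
    """log q(x_{1:T} | x_0) = sum_t log q(x_t | x_{t-1}), with xs = [x_0, x_1, ..., x_T]."""
    total = 0.0
    for t in range(1, len(xs)):
        beta_t = betas[t - 1]
        mean = np.sqrt(1.0 - beta_t) * xs[t - 1]
        total += gaussian_logpdf(xs[t], mean, beta_t)
    return total

# Usage: sample a short trajectory with the same transition kernel, then score it.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 10)
xs = [rng.standard_normal(4)]
for beta_t in betas:
    xs.append(np.sqrt(1.0 - beta_t) * xs[-1] + np.sqrt(beta_t) * rng.standard_normal(4))
print(trajectory_log_q(xs, betas))
```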

Although the forward process is a Markov process, the distribution at any time step can still be computed directly from \mathbf{x}_0, without simulating every intermediate step.
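For reference, this direct calculation uses the standard closed-form marginal from the diffusion-model literature (it is not derived in this section), written with \alpha_t = 1 - \beta_t and \bar{\alpha}_t as the running product:

\alpha_t = 1 - \beta_t, \qquad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \qquad q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}\left(\mathbf{x}_t ; \, \sqrt{\bar{\alpha}_t}\, \mathbf{x}_0, \, (1 - \bar{\alpha}_t)\, \mathbf{I}\right)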