The forward process converts the data distribution into a standard Gaussian distribution.
$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)$$
The covariance matrix $\Sigma = \beta_t I$ is diagonal, which implies that the noise added to each dimension is independent.
$\beta_t$ typically starts close to 0 for small $t$.
The variance is therefore small in the early steps and gradually grows.
Since $1-\beta_t$ is close to 1 in the early steps, the mean stays close to the previous sample $x_{t-1}$.
Because the covariance matrix is the product of a scalar $\beta_t$ and the identity matrix $I$, the noise is isotropic: every dimension receives the same variance.
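The properties above can be illustrated with a minimal sketch of a single forward step. It assumes the mean coefficient is $\sqrt{1-\beta_t}$, as in the standard DDPM formulation; the function name `forward_step`, the sample vector, and the two `beta_t` values are hypothetical choices made only for illustration.

```python
import math
import random

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I) for a list of floats."""
    scale = math.sqrt(1.0 - beta_t)  # coefficient on the previous sample (mean term)
    sigma = math.sqrt(beta_t)        # standard deviation shared by every dimension
    # Each dimension gets its own independent draw with the same variance,
    # which is what a covariance of beta_t * I means.
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x_prev]

rng = random.Random(0)               # seeded so the sketch is repeatable
x0 = [1.0, -2.0, 0.5]                # hypothetical data sample
x_early = forward_step(x0, beta_t=1e-4, rng=rng)  # tiny beta_t: stays near x0
x_late = forward_step(x0, beta_t=0.02, rng=rng)   # larger beta_t: more noise
print(x_early)
print(x_late)
```

With a small `beta_t` the output is nearly a copy of the input, while a larger `beta_t` both shrinks the mean and injects more noise.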
Goal for Forward Process
In theory, the distribution of the final step, $x_T$, should approximate the standard Gaussian distribution. The reverse process starts from a sample drawn from the distribution of the last forward step, so aligning that distribution with the standard Gaussian makes the starting sample easy to draw.
$$q(x_T) \approx p_{\text{prior}}(x_T) = \mathcal{N}(x_T;\ 0,\ I)$$
As $T \to \infty$, the distribution of $x_T$ approaches the standard Gaussian distribution.
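This convergence can be checked numerically without sampling. The sketch below assumes the per-step mean coefficient is $\sqrt{1-\beta_t}$ and follows one coordinate, for which a single step maps the mean to $\sqrt{1-\beta_t}\,\mu$ and the variance to $(1-\beta_t)\,\sigma^2 + \beta_t$. The linear schedule, its endpoints, and $T = 1000$ are assumptions borrowed from common DDPM setups, not values given in the text, and both function names are hypothetical.

```python
def linear_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linearly increasing beta schedule (endpoints are assumptions)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]

def propagate_moments(mean0, var0, betas):
    """Push the mean and variance of one coordinate through the forward chain."""
    mean, var = mean0, var0
    for beta in betas:
        mean = (1.0 - beta) ** 0.5 * mean  # the mean shrinks toward 0
        var = (1.0 - beta) * var + beta    # the variance is pulled toward 1
    return mean, var

# Start from a point mass at x0 = 3.0 (variance 0) and run T = 1000 steps.
mean_T, var_T = propagate_moments(mean0=3.0, var0=0.0, betas=linear_schedule(1000))
print(mean_T, var_T)
```

Because every step is a linear map plus Gaussian noise, a point-mass or Gaussian start remains Gaussian, so a mean near 0 and a variance near 1 indicate that $x_T$ is close to $\mathcal{N}(0, I)$ in that coordinate.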
Markov Chain Forward Process
Because the forward process is Markovian, with each step depending only on the immediately preceding sample, the entire forward process from the original data sample $x_0$ to the final step $x_T$ can be written as