refactor: unify Euler, Euler Ancestral and DDIM implementations #1474

Open
wbruna wants to merge 7 commits into leejet:master from wbruna:sd_unify_euler

Contributor

@wbruna commented May 5, 2026

This started as an attempt to simplify the DDIM sampler, and ended up removing it entirely 🙂 It turns out it is equivalent to Euler Ancestral. I've kept the algebraic demonstrations in each commit message.

I've also joined the original Euler Ancestral with the flow variant, using the helper function from #1436. The same approach could probably be used for other ancestral implementations.

The Euler merge is a little less clear-cut, since it loses its original simplicity, but I believe the unified code path to be overall simpler to follow and maintain.

wbruna added 7 commits May 5, 2026 18:33
The sigma_to == 0 simplification is:

d = (x - denoised) / sigma
x = x + d * (sigma_to - sigma)
  = x + (x - denoised) / sigma * (0 - sigma)
  = x + (x - denoised) * -1
  = denoised

For eta == 0, sigma_down = sigma_to, and sigma_up = 0. The
non-flow case is straightforward:

x = x + d * (sigma_down - sigma)
  = x + d * (sigma_to - sigma)

The flow case:

sigma_ratio = sigma_down / sigma = sigma_to / sigma
x = sigma_ratio * x + (1 - sigma_ratio) * denoised
  =     x * sigma_ratio          + denoised * (1 - sigma_ratio)
  =     x * sigma_to / sigma     + denoised * (1 - sigma_to / sigma)
  = x + x * sigma_to / sigma - x - denoised * sigma_to / sigma + denoised
  = x + (x - denoised) * (sigma_to / sigma - 1)
  = x + (x - denoised) / sigma * (sigma_to - sigma)
  = x + d * (sigma_to - sigma)
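
A quick numeric sanity check of the two identities above, as a minimal sketch in Python rather than the project's actual code. The function names `euler_step` and `flow_step_eta0` are hypothetical and only mirror the formulas in this commit message.

```python
import math

def euler_step(x, denoised, sigma, sigma_to):
    """Plain Euler update: x + d * (sigma_to - sigma)."""
    d = (x - denoised) / sigma
    return x + d * (sigma_to - sigma)

def flow_step_eta0(x, denoised, sigma, sigma_to):
    """Flow-style interpolation with eta == 0, so sigma_down == sigma_to."""
    sigma_ratio = sigma_to / sigma
    return sigma_ratio * x + (1.0 - sigma_ratio) * denoised

x, denoised, sigma, sigma_to = 1.7, -0.4, 0.9, 0.35

# eta == 0: the flow form and the Euler form agree up to rounding.
assert math.isclose(euler_step(x, denoised, sigma, sigma_to),
                    flow_step_eta0(x, denoised, sigma, sigma_to),
                    rel_tol=1e-9, abs_tol=1e-12)

# sigma_to == 0: the Euler update collapses to the denoised prediction.
assert math.isclose(euler_step(x, denoised, sigma, 0.0), denoised,
                    rel_tol=1e-9, abs_tol=1e-12)
```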

Euler Ancestral does:

d = (x - denoised) / sigma
x = x + d * (sigma_down - sigma)
  = x + (x - denoised) / sigma * (sigma_down - sigma)
  = x + (x - denoised) * (sigma_down / sigma - 1)
  = x + (x - denoised) * (sigma_ratio - 1)
  = x + x * sigma_ratio - x - denoised * sigma_ratio + denoised
  = x * sigma_ratio + denoised * (1 - sigma_ratio)

The ancestral noise is also identical, except for the alpha_scale.
I've kept the explicit test just to avoid an unnecessary tensor
multiplication.

Also, use the same calculation for the deterministic Euler
implementation: it has one fewer tensor operation and slightly
better numerical stability.
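
The equivalence of the two deterministic updates can also be checked elementwise over a random sweep. This is only an illustrative sketch: latents are modelled as plain Python lists, the names `ancestral_d_form` and `ancestral_ratio_form` are hypothetical, and the ancestral noise term (and its alpha_scale) is left out on purpose.

```python
import math
import random

def ancestral_d_form(x, denoised, sigma, sigma_down):
    """Elementwise x + d * (sigma_down - sigma), with d = (x - denoised) / sigma."""
    return [xi + (xi - di) / sigma * (sigma_down - sigma)
            for xi, di in zip(x, denoised)]

def ancestral_ratio_form(x, denoised, sigma, sigma_down):
    """Elementwise x * sigma_ratio + denoised * (1 - sigma_ratio)."""
    sigma_ratio = sigma_down / sigma
    return [xi * sigma_ratio + di * (1.0 - sigma_ratio)
            for xi, di in zip(x, denoised)]

rng = random.Random(0)  # fixed seed so the sweep is reproducible
for _ in range(100):
    sigma = rng.uniform(0.1, 10.0)
    sigma_down = rng.uniform(0.0, sigma)
    x = [rng.gauss(0.0, sigma) for _ in range(4)]
    denoised = [rng.gauss(0.0, 1.0) for _ in range(4)]
    a = ancestral_d_form(x, denoised, sigma, sigma_down)
    b = ancestral_ratio_form(x, denoised, sigma, sigma_down)
    # Both forms should match up to floating-point rounding.
    assert all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-9)
               for p, q in zip(a, b))
```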

We have:
  model_output = (x - denoised) / sigma = d
  alpha_prod_t = 1 / (sigma² + 1)
  beta_prod_t = 1 - alpha_prod_t = sigma² / (sigma² + 1)

Substitute alpha_prod_t:
  sqrt(1 / alpha_prod_t) = sqrt(sigma² + 1)
  sqrt(beta_prod_t) = sqrt(sigma² / (sigma² + 1)) = sigma / sqrt(sigma² + 1)

Then:
pred_original_sample
  = (x / sqrt(sigma² + 1) - sqrt(beta_prod_t) * d) * (1 / sqrt(alpha_prod_t))
  = (x / sqrt(sigma² + 1) - (sigma / sqrt(sigma² + 1)) * d) * sqrt(sigma² + 1)
  = x - sigma * d
  = x - sigma * ((x - denoised) / sigma)
  = x - (x - denoised)
  = denoised
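
As a hedged numeric check of this simplification, the sketch below evaluates the DDIM-style expression exactly as written above and compares it with `denoised`. The function name `ddim_pred_original` is hypothetical; the variable names follow the commit message.

```python
import math

def ddim_pred_original(x, denoised, sigma):
    """Evaluate pred_original_sample from the DDIM quantities quoted above."""
    alpha_prod_t = 1.0 / (sigma * sigma + 1.0)
    beta_prod_t = 1.0 - alpha_prod_t
    model_output = (x - denoised) / sigma           # this is d
    scaled_x = x / math.sqrt(sigma * sigma + 1.0)   # sample scaled into DDIM space
    return (scaled_x - math.sqrt(beta_prod_t) * model_output) / math.sqrt(alpha_prod_t)

# The expression should reduce to the denoised prediction for any sigma > 0.
for sigma in (0.05, 0.7, 2.5, 14.0):
    assert math.isclose(ddim_pred_original(0.8, -1.3, sigma), -1.3,
                        rel_tol=1e-9, abs_tol=1e-9)
```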

When eta = 0, std_dev_t = 0. The sqrt term becomes:

sqrt((1 - alpha_prod_t_prev - std_dev_t^2) / alpha_prod_t_prev)
  = sqrt((1 - alpha_prod_t_prev) / alpha_prod_t_prev)
  = sqrt(beta_prod_t_prev / alpha_prod_t_prev)

Given:
alpha_prod_t      = 1 / (sigma^2 + 1)
beta_prod_t       = sigma^2 / (sigma^2 + 1)
alpha_prod_t_prev = 1 / (sigma_to^2 + 1)
beta_prod_t_prev  = sigma_to^2 / (sigma_to^2 + 1)

sqrt(beta_prod_t_prev / alpha_prod_t_prev)
  = sqrt((sigma_to^2 / (sigma_to^2 + 1)) / (1 / (sigma_to^2 + 1)))
  = sqrt(sigma_to^2)
  = sigma_to

So the deterministic step becomes:

x = denoised + sigma_to * model_output
  = denoised + sigma_to * (x - denoised) / sigma
  = denoised + (x - denoised) * sigma_to / sigma
  = denoised + x * sigma_to / sigma - denoised * sigma_to / sigma
  = denoised * (1 - sigma_to / sigma) + x * sigma_to / sigma
  = x + denoised * (1 - sigma_to / sigma) + x * sigma_to / sigma - x
  = x + denoised * (1 - sigma_to / sigma) - x * (1 - sigma_to / sigma)
  = x + (denoised - x) * (1 - sigma_to / sigma)
  = x + (x - denoised) * (sigma_to / sigma - 1)
  = x + (x - denoised) / sigma * (sigma_to - sigma)
  = x + d * (sigma_to - sigma)
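
The deterministic case can be checked numerically as well. The following is a sketch under the definitions quoted in this commit message, not the removed sampler code itself; `ddim_step_eta0` and `euler_step` are hypothetical names.

```python
import math

def ddim_step_eta0(x, denoised, sigma, sigma_to):
    """DDIM update with eta == 0, built from alpha_prod_t_prev / beta_prod_t_prev."""
    alpha_prod_t_prev = 1.0 / (sigma_to * sigma_to + 1.0)
    beta_prod_t_prev = 1.0 - alpha_prod_t_prev
    model_output = (x - denoised) / sigma
    coeff = math.sqrt(beta_prod_t_prev / alpha_prod_t_prev)  # equals sigma_to
    return denoised + coeff * model_output

def euler_step(x, denoised, sigma, sigma_to):
    """Plain Euler update: x + d * (sigma_to - sigma)."""
    d = (x - denoised) / sigma
    return x + d * (sigma_to - sigma)

# Compare both updates on a few (sigma, sigma_to) pairs, including sigma_to == 0.
for sigma, sigma_to in ((14.6, 9.2), (2.0, 1.2), (0.5, 0.1), (0.3, 0.0)):
    assert math.isclose(ddim_step_eta0(1.1, -0.6, sigma, sigma_to),
                        euler_step(1.1, -0.6, sigma, sigma_to),
                        rel_tol=1e-9, abs_tol=1e-9)
```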

From the DDIM definitions:
alpha_prod_t = 1 / (sigma² + 1)
beta_prod_t  = 1 - alpha_prod_t = sigma² / (sigma² + 1)
d = (x - denoised) / sigma

We have the coefficient of d in the x update:
coeff² = (1 - alpha_prod_t_prev - std_dev_t²) / alpha_prod_t_prev

Where:
std_dev_t² = eta² * variance
variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev)
         = sigma_to² * (sigma² - sigma_to²) / (sigma² * (sigma_to² + 1))

Substituting variance:
coeff² = (sigma_to² / (sigma_to² + 1) - eta² * sigma_to² * (sigma² - sigma_to²) / (sigma² * (sigma_to² + 1))) * (sigma_to² + 1)
       = sigma_to² - eta² * sigma_to² * (sigma² - sigma_to²) / sigma²
       = sigma_to² * (1 - eta² * (sigma² - sigma_to²) / sigma²)

From get_ancestral_step:
sigma_down² = sigma_to² - sigma_up²
             = sigma_to² - eta² * sigma_to² * (sigma² - sigma_to²) / sigma²
             = coeff²

So coeff = sigma_down, and the x update becomes:
x = denoised + sigma_down * d
  = denoised + sigma_down * (x - denoised) / sigma
  = denoised + (x - denoised) * sigma_ratio
  = x * sigma_ratio + denoised - denoised * sigma_ratio
  = x * sigma_ratio + denoised * (1 - sigma_ratio)

And the noise coefficient:
noise_coeff = std_dev_t / sqrt(alpha_prod_t_prev)
            = eta * sqrt( sigma_to² * (sigma² - sigma_to²) / (sigma² * (sigma_to² + 1)) ) * sqrt(sigma_to² + 1)
            = eta * sigma_to * sqrt( (sigma² - sigma_to²) / sigma² )
            = sigma_up
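
To double-check the eta > 0 case, the sketch below computes the DDIM coefficients from the alpha/beta definitions and compares them with `sigma_down` and `sigma_up`. The `get_ancestral_step` here is a simplified stand-in that only implements the two formulas quoted above (any clamping done by the real helper is not modelled), and `ddim_coeffs` is a hypothetical name.

```python
import math

def get_ancestral_step(sigma, sigma_to, eta):
    """Simplified stand-in using only the formulas quoted in the derivation."""
    sigma_up = eta * sigma_to * math.sqrt((sigma**2 - sigma_to**2) / sigma**2)
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up

def ddim_coeffs(sigma, sigma_to, eta):
    """Coefficient of d and of the noise in the DDIM x update."""
    alpha_prod_t = 1.0 / (sigma**2 + 1.0)
    beta_prod_t = 1.0 - alpha_prod_t
    alpha_prod_t_prev = 1.0 / (sigma_to**2 + 1.0)
    beta_prod_t_prev = 1.0 - alpha_prod_t_prev
    variance = (beta_prod_t_prev / beta_prod_t) * (1.0 - alpha_prod_t / alpha_prod_t_prev)
    std_dev_t = eta * math.sqrt(variance)
    coeff = math.sqrt((1.0 - alpha_prod_t_prev - std_dev_t**2) / alpha_prod_t_prev)
    noise_coeff = std_dev_t / math.sqrt(alpha_prod_t_prev)
    return coeff, noise_coeff

# Assumes 0 <= eta <= 1 and 0 < sigma_to < sigma, so every sqrt argument is non-negative.
for sigma, sigma_to, eta in ((2.0, 1.2, 0.7), (14.6, 9.2, 1.0), (0.8, 0.3, 0.25)):
    sigma_down, sigma_up = get_ancestral_step(sigma, sigma_to, eta)
    coeff, noise_coeff = ddim_coeffs(sigma, sigma_to, eta)
    assert math.isclose(coeff, sigma_down, rel_tol=1e-9)
    assert math.isclose(noise_coeff, sigma_up, rel_tol=1e-9)
```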

It is equivalent to Euler Ancestral with the Simple scheduler.