Blog

Long-form content, my own research notes, and thoughts on vision & ML

Latent Scaffolding

A series exploring emergent capabilities hidden inside VL-conditioned single-stream diffusion transformers. No fine-tuning, no additional training — just scaffolding: architectural hacking and careful probing of what these models already know.

Latent Scaffolding: Token Dropout for Diverse Image Variations

Vision-only token dropout solves mode collapse in spliced I2I generation. Along the way: hunting attention sinks, discovering which conditioning tokens are load-bearing, and two orthogonal knobs for controlling diversity.