The Devil in the Details: Engineering Tricks for SOTA Video Models

Theory is clean, but training is messy. This note covers five engineering tricks, from Timestep Shifting to 3D RoPE, that stabilize training and improve performance.

From DDPM to Flow Matching: The Evolution of Generative Trajectories

A technical note on the shift from noise prediction (DDPM) to velocity prediction (Flow Matching), and on how classifier-free guidance (CFG) acts as a vector field modifier.

From DiT to Hunyuan: The Evolution of adaLN-Zero in Generative Models

From DiT to Hunyuan Video, adaLN-Zero remains the gold standard for conditioning. Here’s how this zero-initialized module works and why it persists in the era of Flow Matching.

Visualizing 3D Attention: Bridging the Gap Between 1D Sequences and 3D Space

An interactive tool to visualize the mapping between 1D token sequences and 3D (T, H, W) sliding windows.